This final project presents my analysis of sales and payment methods for electronics, fashion, entertainment, and other products offered by a superstore. I used many SQL commands, including CREATE TABLE, SELECT, FROM, WHERE, GROUP BY, ORDER BY, LIMIT, LEFT JOIN, and EXTRACT.
This project focused on creating data frames, filtering, grouping, merging, and displaying data. It also includes creating new columns to which specific conditions can be applied. The data is used to solve business problems within a superstore.
The first problem statement is determining the prices of the top 5 products in the Mobiles & Tablets category. Second, the data is processed to check whether there was a decrease in sales of the Others category in 2022; the task also requires displaying the top 20 products with the largest decrease. Third, I use the data to retrieve the Customer ID and Registered Date of consumers who have checked out but have not yet made a payment. Fourth, the data is sorted and analyzed to compare average daily sales on weekends with those on weekdays over a three-month period.
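As an illustration, the first and fourth problem statements can be sketched with queries like the following. This is a minimal sketch run against SQLite via Python; the table name `order_details` and the column names (`product_name`, `category`, `price`, `sale_amount`, `order_date`) are invented for illustration and may differ from the project's actual schema.

```python
import sqlite3

# Hypothetical schema: the real superstore table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_details (
        product_name TEXT,
        category     TEXT,
        price        REAL,
        sale_amount  REAL,
        order_date   TEXT  -- ISO date, e.g. '2022-10-01'
    )
""")
conn.executemany(
    "INSERT INTO order_details VALUES (?, ?, ?, ?, ?)",
    [
        ("Phone A", "Mobiles & Tablets", 500.0, 500.0, "2022-10-01"),  # a Saturday
        ("Phone B", "Mobiles & Tablets", 300.0, 300.0, "2022-10-03"),  # a Monday
        ("Desk",    "Furniture",         120.0, 120.0, "2022-10-03"),
    ],
)

# Problem 1: prices of the top 5 products in the Mobiles & Tablets category.
top5 = conn.execute("""
    SELECT product_name, MAX(price) AS price
    FROM order_details
    WHERE category = 'Mobiles & Tablets'
    GROUP BY product_name
    ORDER BY price DESC
    LIMIT 5
""").fetchall()

# Problem 4: average daily sales on weekends vs. weekdays over three months.
# In SQLite, strftime('%w', ...) returns '0' for Sunday and '6' for Saturday.
by_day_type = dict(conn.execute("""
    SELECT CASE WHEN strftime('%w', order_date) IN ('0', '6')
                THEN 'weekend' ELSE 'weekday' END AS day_type,
           AVG(daily_total) AS avg_daily_sales
    FROM (
        SELECT order_date, SUM(sale_amount) AS daily_total
        FROM order_details
        WHERE order_date BETWEEN '2022-10-01' AND '2022-12-31'
        GROUP BY order_date
    )
    GROUP BY day_type
""").fetchall())

print(top5)
print(by_day_type)
```

The inner query totals sales per day first, so the outer AVG compares average daily totals rather than averaging individual line items.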
Quick iteration and reusability of metric calculations for powerful data exploration.
At Looker, we want to make it easier for data analysts to service the needs of the data-hungry users in their organizations. We believe too much of their time is spent responding to ad hoc data requests and not enough time is spent building, experimenting, and embellishing a robust model of the business. Worse yet, business users are starving for data, but are forced to make important decisions without access to data that could guide them in the right direction. Looker addresses both of these problems with a YAML-based modeling language called LookML.
This paper walks through a number of data modeling examples, demonstrating how to use LookML to generate, alter, and update reports—without the need to rewrite any SQL. With LookML, you build your business logic, defining your important metrics once and then reusing them throughout a model—allowing rapid iteration of data exploration, while also ensuring the accuracy of the SQL that’s generated. Small updates are quick and can be made immediately available to business users to manipulate, iterate, and transform in any way they see fit.
This document outlines an inventory management system project. It includes sections on the disadvantages of the old manual system, advantages of the new computerized system, hardware and software requirements, data flow diagrams, entity relationship diagrams, data dictionaries, form designs, and data reports. Key entities in the system include items, suppliers, purchase orders, and customer bills. The new system aims to automate the inventory management process and make it more efficient by reducing time and paperwork compared to the old manual system.
Queries allow users to extract specific information from one or more database tables. There are different ways to create queries, including using design view, a wizard, or SQL view. Queries can include calculations, formatting, parameters, and summaries to provide flexible reporting of essential data.
This document provides steps for updating ADS (Analytical Data Store) and KXEN models. It involves checking data availability, running various ADS projects to populate tables, performing sanity checks on table counts, and applying KXEN models to score different customer segments. The key steps are: 1) Check source data and run preliminary ADS, 2) Populate base tables and run additional ADS in sequence, 3) Perform sanity checks on table counts, 4) Apply KXEN models to score segments, changing settings for each segment.
Oracle provides several analytical functions that allow for powerful data analysis using SQL. These include group functions that aggregate data over groups or windows, as well as window functions like ROW_NUMBER, RANK, and LAG that analyze data relative to the current row. ROLLUP and CUBE extensions to the GROUP BY clause enable calculation of subtotals across multiple dimensions of data with a single query.
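The window functions mentioned above are standard SQL and can be sketched outside Oracle as well. Below is a minimal sketch using SQLite (3.25+) from Python, with an invented `sales` table; note that SQLite supports ROW_NUMBER, RANK, and LAG but not the ROLLUP and CUBE extensions, which remain Oracle/standard-SQL features.

```python
import sqlite3

# A toy table to demonstrate the window functions named above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('East', 100), ('East', 300), ('West', 200);
""")

# ROW_NUMBER and RANK number rows within each region by amount;
# LAG fetches the previous row's amount (NULL for the first row of a partition).
rows = conn.execute("""
    SELECT region, amount,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount) AS rn,
           RANK()       OVER (PARTITION BY region ORDER BY amount) AS rnk,
           LAG(amount)  OVER (PARTITION BY region ORDER BY amount) AS prev_amount
    FROM sales
    ORDER BY region, amount
""").fetchall()

for row in rows:
    print(row)
```

Unlike group functions, these leave one output row per input row, which is what makes row-relative analysis such as period-over-period deltas possible.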
This document describes the design and implementation of a data mart for an airline company. It begins by introducing the source databases - Agent Detail (AD) and Ticket Order Detail (TOD) - and describing the tables and attributes within each. It then covers normalization of the data model into third normal form. Finally, it presents the star schema design for the data mart, with dimensions for Customer, Agent, and Time, and facts that connect to each dimension. The data mart is intended to enable business intelligence and reporting to help management with strategic decision making.
Company segmentation - an approach with R (Casper Crause)
We classify companies based on how their stocks trade using their daily stock returns (percentage movement from one day to the next). This analysis will help your organization determine which companies are related to each other (competitors that have similar attributes).
paytm_mall_epurchase_data data analysis (ankita222345)
Completed an internship given by psyliq, where I needed to find insights for Paytm e-purchase data. I used MySQL Workbench as well as Excel to showcase my project insights.
June 10, 2010 BDPA Charlotte Program Meeting Presentation.
Presenter:
Markus Beamer, BDPA Charlotte President Elect
Topic:
Intelligent Data Strategies - Intro to Data Marts and Data Warehouses
This document provides an in-depth reference on set analysis in QlikView. It begins by acknowledging that set analysis is a difficult subject, even for experienced users. The document then covers key aspects of set analysis syntax, including identifiers, operators, modifiers, and element lists. It provides many examples of how to use set analysis to filter charts and calculations based on specific field selections. The goal is to serve as a complete cheat sheet for set analysis in QlikView, with definitions, examples, and tips for effectively using this complex topic.
Cover PageComplete and copy the following to Word for your cover p.docx (faithxdunce63732)
Cover Page: Complete and copy the following to Word for your cover page. Be sure that the document is stapled properly. Do not use a plastic cover or folder. In the footer of the Word documents, add the Now() function to show what day and time the documents were printed. Submit the Excel file to CANVAS as lastname_firstname.xls. Hand in the Word document immediately prior to Exam 1. Although students are encouraged to ask questions for clarification, this exercise is intended to be well within the capability of students at the 3000 level, and students should be able to complete the project with minimal assistance. Instructions are included on each worksheet, but feel free to request clarification. ACG 3401 Accounting Information Systems, Excel Assignment. Submitted By (Name, Last: First:) <-- Only use this for the cover page. Spring 2015. By submitting this document, I affirm that the work is the product of my own efforts without the assistance of another person and that I have not given assistance to another student. <-- You must sign for the submission to be graded. Signature of student.
Instructions: This is an .xls file and should not be changed to another file type, in order to preserve macros. Follow the instructions on each worksheet. Copy results to MS Word and include page numbers. The page numbers for each exercise are given below (at the bottom of this worksheet). Appearance counts: be sure that results are presented professionally and are readable. Three worksheets are data files and are referenced in the instructions; these are named Product Data, Industry Data, and Data Worksheet. Create range names for the following. Remember, ranges should not include the headers (field names); be careful to ensure you have selected the entire range for that field. (Press F3 to view the range names - click these to insert into a formula, or type them in directly.) You may need to create range names other than these. From the Industry Data worksheet, create range names for: 1) Employees, 2) Sales, 3) Address, 4) Name, 5) State, 6) ZIP. From the Data Worksheet, create range names for: 1) Cash, 2) Company, 3) EBIT, 4) Eff_Tax_Rate, 5) Exchange, 6) SIC. Create a range name for the entire Product Data table, but include the headers; I used the name 'Product'. Tab colors: green - databases to be used; yellow - examples; blue - instructions to perform graded exercises. When copying portions of the worksheet to your MS Word document, you will find the "Snipping Tool" very helpful. Checklist for submitted documents (page number, item, results required, formulas required): na - Cover page with name and section number (stapled); 1 - Horizontal and Vertical Analysis (results and formulas); 2 - Financial Ratio Analysis (results and formulas); 3 - Vlookup and HLookup (results and formulas); 4 - DataTable (results and formulas) and DropDown Box (result only); 5 - Dfunctions (results and formulas); 6 - Functions1; 7 - Functions2; 8 - Annual Income Statement; 9 - Macro.
Ground Breakers Romania: Explain the explain_plan (Maria Colgan)
This session was delivered as part of the EMEA Ground Breakers tour in Romania, Oct. 2019. The execution plan for a SQL statement can often seem complicated and hard to understand. Determining if the execution plan you are looking at is the best plan you could get, or attempting to improve a poorly performing execution plan, can be a daunting task even for the most experienced DBA or developer. This session examines the different aspects of an execution plan, from selectivity to parallel execution, and explains what information you should be gleaning from the plan and how it affects the execution. It offers insight into what caused the Optimizer to make the decision it did, as well as a set of corrective measures that can be used to improve each aspect of the plan.
The document summarizes the results of analyzing sales data from Sam's Club stores. Key findings include:
- Total sales across all stores were $64.6 million, with the top-selling store bringing in $5.8 million.
- Sales were highest on Sundays ($13.7 million) and Saturdays ($9.3 million).
- The most common member type, "V", accounted for $6 million in sales.
- On average, members visited stores 2.1 times and spent $81.81 per visit, purchasing 8 items.
- Peak shopping hours were between 3-6pm on weekdays and earlier on weekends.
Order Management provides tools to manage sales orders and streamline the order fulfillment process from order entry to shipment. It includes functions like order promising, order capture, transportation management, and integration with EDI, XML, and web storefronts. This can help businesses reduce costs, improve order accuracy, and increase on-time delivery rates. Order and line information is stored in tables like OE_ORDER_HEADERS_ALL, OE_ORDER_LINES_ALL, and MTL_ONHAND_QUANTITIES to track items, pricing, statuses and fulfillment progress.
The document provides step-by-step instructions for customizing the check printing report in Oracle R12. It discusses developing customized templates, modifying code to include additional data, and setting up payment profiles and formats to display data using the customized templates. Key steps include: 1) Developing customized templates; 2) Adding code to retrieve additional data; 3) Creating template definitions, payment formats, documents, and profiles linked to the customized templates. This allows payments to be generated using the customized templates and layouts while retaining the option to use the standard templates.
Second presentation on domain-driven design. In this presentation, tactical designs are presented, describing what value objects, entities, aggregates, domain events, and domain services are (and how they can be implemented).
The document provides explanations of various SQL concepts including cross join, order by, distinct, union and union all, truncate and delete, compute clause, data warehousing, data marts, fact and dimension tables, snowflake schema, ETL processing, BCP, DTS, multidimensional analysis, and bulk insert. It also discusses the three primary ways of storing information in OLAP: MOLAP, ROLAP, and HOLAP.
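One of the concepts listed above, the difference between UNION and UNION ALL (duplicate removal versus keeping every row), can be seen with a toy example; this sketch runs the queries in SQLite from Python with two invented single-column tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION removes duplicate rows; UNION ALL keeps every row from both inputs.
union_rows     = conn.execute("SELECT x FROM a UNION     SELECT x FROM b ORDER BY x").fetchall()
union_all_rows = conn.execute("SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x").fetchall()

print(union_rows)       # the value 2 appears once
print(union_all_rows)   # the value 2 appears twice
```

Because UNION must deduplicate, it typically costs more than UNION ALL; prefer UNION ALL when the inputs are known to be disjoint.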
This document provides an overview of online analytical processing (OLAP). It defines OLAP as a process for analyzing multidimensional data to help decision makers. OLAP uses data warehouses to store historical data in a structured format. It allows for analytical queries and operations like aggregation, roll-up, drill-down and slicing and dicing of data. SQL extensions and OLAP functions further aid analysis. OLAP systems can be MOLAP, ROLAP or HOLAP based on their architecture and data storage methods. Commercial OLAP systems include IBM, Oracle and Microsoft products.
This presentation includes the steps to create a database in SQL Server, importing data into tables, populating data into dimension and fact tables using SSIS, report generation using SSRS, and data presentation using Tableau, and also uses the AdventureWorks dataset to differentiate between graph databases and DBMSs with an Airbnb database.
This document contains examples of SQL queries with explanations and formatting guidelines. The queries select data from tables to return item details, revenues, and ratios. Comments provide best practices for formatting, commenting code, and modeling data. The document also discusses data structures, common table expressions, views, functions and technologies.
Part 3 of the SQL Tuning workshop examines the different aspects of an execution plan, from cardinality estimates to parallel execution, and explains what information you should be gleaning from the plan and how it affects the execution. It offers insight into what caused the Optimizer to make the decision it did, as well as a set of corrective measures that can be used to improve each aspect of the plan.
This document describes data flow diagrams and Jackson Structured Programming. It provides details on how to construct DFDs, including leveled DFDs for large systems. It explains how DFDs differ from flowcharts by focusing on data flow rather than control flow. The document also provides an example DFD for a payroll system. It then describes Jackson Structured Programming and how to develop the data structure diagram, program structure diagram, and list operations and conditions. An example JSP is provided for an accounting processing system.
The random forest model generated 182 decision trees from the training data to classify whether users will continue their session or not, with an out-of-bag error rate of 34.17%. Important features were identified using the Gini index. The random forest model was able to successfully build a rule-based classification model with over 70% accuracy on the test data to identify if a user will continue or leave a session based on their behavior metrics.
This document describes a data warehouse and business intelligence project for analyzing Starbucks store data. It discusses extracting data from various structured, semi-structured, and unstructured sources, transforming the data using SQL and R, and loading it into a star schema data warehouse with fact and dimension tables. The data warehouse is then used for business queries and analysis in Tableau, with case studies examining city revenue, visitor and beverage sales by city, and city ratings based on food and beverage counts. The analysis finds that New York City generally has the highest revenue, visitor counts, and ratings.
1 – Implementing the Decorator Design Pattern (with Strategy Pattern and Factory Class) (honey725342)
PROBLEM
You are to design and implement code based on the Decorator pattern for generating an appropriate receipt for a customer buying items at a particular Best Buy store. The general format of a receipt is as follows:
Basic Receipt
Store Header (store street address, state abbreviation, phone number, store number)
Date of Sale
Itemized Purchases
Total Sale (without sales tax)
Amount Due (with added sales tax)
Dynamically-Added Items
Tax Computation object (based on state residing in)
Optional Secondary Headers (“Greetings”) to be printed at the very top of the receipt, e.g.,
- “Happy Holidays from Best Buy”
- “Summer Sales are Hot at Best Buy”
Relevant Rebate Forms (to be printed at the end of the receipt)
Promotional Coupons (to be printed at the end of the receipt)
e.g., “10% off next purchase of $100 or more” coupon
APPROACH
We will assume that the code is written as part of the software used by all Best Buy stores around the country. Therefore, the information in the Store Header will vary depending on the particular store's location. In addition, the amount of sales tax (if any) is determined by the state that the store resides in, and included in a receipt by use of the Strategy pattern. The added items to be displayed on each receipt (i.e., greeting secondary header, rebate form, or coupon) will be handled by use of the Decorator pattern. Finally, the receipt will be constructed by use of the Factory class pattern.
Basic Receipt
The information for the basic receipt should be stored in a BasicReceipt object (see below). The instance variables of a BasicReceipt should contain the date of sale, a PurchasedItems object, the total sale (without tax), and the amount due (with added tax), each of type float. In addition, following the Strategy design pattern, there should be an instance variable of (interface) type TaxComputation that can be assigned the appropriate tax computation object (e.g., NJTaxComputation) for the state that the store resides in.
Determining Sales Tax
We will implement tax computation classes so that receipts can be generated for one of five possible states: Alabama (4%), Delaware (no sales tax), Georgia (4%), Maryland (6%), and Missouri (4.225%). Note that a number of states have "sales tax holidays" in which certain items are not taxed (e.g., clothes, computers) in preparation for the new school year. These holidays normally last for 2-4 days over a weekend at the end of summer (e.g., the first weekend in August). Use the information provided in Wikipedia about states with a tax holiday to implement a state computation object that would properly compute the sales tax of a given state, for a certain set of purchased items, for the current or any future year. (Note that we will assume that anything purchased in Best Buy is a computer-related item.)
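The combination of Strategy (tax computation) and Decorator (dynamically added receipt items) described above can be sketched as follows. The class and method names here (`render`, `GreetingDecorator`, `CouponDecorator`, and the concrete tax classes) are invented for illustration and are not the assignment's required API; the tax-holiday logic is omitted.

```python
# Strategy: the tax computation varies by state (names invented for this sketch).
class TaxComputation:
    def tax(self, subtotal):
        raise NotImplementedError

class DETaxComputation(TaxComputation):  # Delaware: no sales tax
    def tax(self, subtotal):
        return 0.0

class MDTaxComputation(TaxComputation):  # Maryland: 6%
    def tax(self, subtotal):
        return subtotal * 0.06

# Component: the basic receipt holds the total and a tax strategy.
class BasicReceipt:
    def __init__(self, total, tax_strategy):
        self.total = total
        self.tax_strategy = tax_strategy

    def render(self):
        due = self.total + self.tax_strategy.tax(self.total)
        return f"Total: {self.total:.2f}\nAmount due: {due:.2f}"

# Decorators: each wraps any receipt and adds a line before or after it.
class GreetingDecorator:
    def __init__(self, receipt, greeting):
        self.receipt = receipt
        self.greeting = greeting

    def render(self):
        return f"{self.greeting}\n{self.receipt.render()}"

class CouponDecorator:
    def __init__(self, receipt, coupon):
        self.receipt = receipt
        self.coupon = coupon

    def render(self):
        return f"{self.receipt.render()}\n{self.coupon}"

# Stacking decorators adds items dynamically without changing BasicReceipt.
receipt = CouponDecorator(
    GreetingDecorator(BasicReceipt(100.0, MDTaxComputation()),
                      "Happy Holidays from Best Buy"),
    "10% off next purchase of $100 or more",
)
print(receipt.render())
```

A factory would then decide, per store and per promotion, which decorators to stack around the BasicReceipt.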
Adding ...
This document provides tips for managing spreadsheets and extracting information from data. It recommends using Google Sheets to collaborate on spreadsheets with others. It also outlines various spreadsheet functions for summarizing data, extracting text, concatenating strings, and looking up values. Conditional formatting is suggested to highlight important information. Pivot tables are presented as a way to summarize tables with filters and aggregations.
This dashboard aims to evaluate the monthly sales achievement for mobiles and tablets, computing, and appliances in a superstore. The tool used in this project is Looker Studio.
This task presents SQL basic commands which I used to create a new table with its data. I queried the sales data of furniture, office supplies, and technology.
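Such basic commands can be sketched as below, run here in SQLite from Python; the table name `superstore_sales` and its columns are invented for illustration and may differ from the task's actual table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CREATE TABLE and INSERT, then a grouped SELECT over the three categories.
conn.executescript("""
    CREATE TABLE superstore_sales (
        category TEXT,
        sales    REAL
    );
    INSERT INTO superstore_sales VALUES
        ('Furniture', 120.0),
        ('Office Supplies', 80.0),
        ('Technology', 200.0),
        ('Technology', 50.0);
""")

by_category = conn.execute("""
    SELECT category, SUM(sales) AS total_sales
    FROM superstore_sales
    GROUP BY category
    ORDER BY total_sales DESC
""").fetchall()
print(by_category)
```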
This document describes data flow diagrams and Jackson Structured Programming. It provides details on how to construct DFDs, including leveled DFDs for large systems. It explains how DFDs differ from flowcharts by focusing on data flow rather than control flow. The document also provides an example DFD for a payroll system. It then describes Jackson Structured Programming and how to develop the data structure diagram, program structure diagram, and list operations and conditions. An example JSP is provided for an accounting processing system.
The random forest model generated 182 decision trees from the training data to classify whether users will continue their session or not, with an out-of-bag error rate of 34.17%. Important features were identified using the Gini index. The random forest model was able to successfully build a rule-based classification model with over 70% accuracy on the test data to identify if a user will continue or leave a session based on their behavior metrics.
This document describes a data warehouse and business intelligence project for analyzing Starbucks store data. It discusses extracting data from various structured, semi-structured, and unstructured sources, transforming the data using SQL and R, and loading it into a star schema data warehouse with fact and dimension tables. The data warehouse is then used for business queries and analysis in Tableau, with case studies examining city revenue, visitor and beverage sales by city, and city ratings based on food and beverage counts. The analysis finds that New York City generally has the highest revenue, visitor counts, and ratings.
1 – Implementing the Decorator Design Pattern (with St.docxhoney725342
1
– Implementing the Decorator Design Pattern
(with Strategy Pattern and Factory Class)
)
PROBLEM
You are to design and implement code based on the Decorator pattern for generating an
appropriate receipt for a customer buying items at a particular Best Buy store. The general format of a
receipt is as follows:
Basic Receipt
Store Header (store street address, state abbreviation, phone number, store number)
Date of Sale
Itemized Purchases
Total Sale (without sales tax)
Amount Due (with added sales tax)
Dynamically-Added Items
Tax Computation object (based on state residing in)
Optional Secondary Headers (“Greetings”) to be printed at the very top of the receipt,
e.g.,
- “Happy Holidays from Best Buy”
- “Summer Sales are Hot at Best Buy”
Relevant Rebate Forms (to be printed at the end of the receipt)
Promotional Coupons (to be printed at the end of the receipt)
e.g., “10% off next purchase of $100 or more” coupon
APPROACH
We will assume that the code is written as part of the software used by all Best Buy stores around the
country. Therefore, the information in the Store Header will vary depending on the particular store's
location. In addition, the amount of sales tax (if any) is determined by the state that the store resides in,
and included in a receipt by use of the Strategy pattern. The added items to be displayed on each receipt
(i.e.,. greeting secondary header, rebate form or coupon) will be handled by use of the Decorator pattern.
Finally, the receipt will be constructed by use of the Factory class pattern.
Basic Receipt
The information for the basic receipt should be stored in a BasicReceipt object (see below). The instance
variables of a BasicReceipt should contain the date of sale, a PurchasedItems object, the total sale
(without tax) and amount due (with added tax), each of type float. In addition, following the Strategy
design pattern, there should be an instance variable of (interface) type TaxComputation that can
be assigned the appropriate tax computation object (e.g., NJTaxComputation) for the state that the
store resides in.
2
Determining Sales Tax
We will implement tax computation classes so that receipts can be generated for one of five possible
states: Alabama (4%), Delaware (no sales tax), Georgia (4%), Maryland (6%) and Missouri (4.225%). Note
that a number of states have a “sales tax holidays” in which certain items are not taxed (e.g., clothers,
computers) in preparation of the new school year. These holidays normally last for 2-4 days over at
weekend at the end of summer (e.g., the first weekend in August). Use the information provided in
Wikipedia about states with a tax holiday (here) to implement a state computation object that would
properly compute the sales tax of a given state, for a certain set of purchased items, for the current or
any future year. (Note that we will assume that anything purchased in Best Buy is a computer-related
item.)
Adding ...
This document provides tips for managing spreadsheets and extracting information from data. It recommends using Google Sheets to collaborate on spreadsheets with others. It also outlines various spreadsheet functions for summarizing data, extracting text, concatenating strings, and looking up values. Conditional formatting is suggested to highlight important information. Pivot tables are presented as a way to summarize tables with filters and aggregations.
Similar to Final Project SQL - Elyada Wigati Pramaresti.pptx (20)
Data analyzed in this task is collected from Tokopedia (it is not the original data). The dataset consists of four tables:

order_detail:
1. id → the unique number of an order (id_order)
2. customer_id → the unique number of a customer
3. order_date → the date when the transaction was carried out
4. sku_id → the unique number of a product (SKU stands for stock keeping unit)
5. price → the amount of money paid for an item
6. qty_ordered → the number of items purchased by the customer
7. before_discount → the total price of the products before the discount
8. discount_amount → the discount applied to the total
9. after_discount → the total price after the discount is applied
10. is_gross → indicates that the customer has not yet paid the order
11. is_valid → indicates that the customer has paid the order
12. is_net → indicates that the transaction is finished
13. payment_id → the unique number of the payment method

sku_detail:
1. id → the unique number of a product (it can be used as a join key)
2. sku_name → the name of the product
3. base_price → the price shown on the price tag
4. cogs → the cost of selling one product
5. category → the product category

customer_detail:
1. id → the unique number of a customer
2. registered_date → the date when the customer signed up as a member

payment_detail:
1. id → the unique number of a payment method
2. payment_method → the method of payment used in the transaction
Business Questions
1. Among the transactions carried out in 2021, in which month did the total transaction value (after_discount) reach its peak? Use is_valid = 1 to filter the data. Source table: order_detail.
2. Among the transactions that occurred in 2022, which category generated the largest transaction value? Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.
3. Compare the transaction value of each category in 2021 with that in 2022. Identify which categories increased and which decreased in transaction value. Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.
4. Identify the top 5 most popular payment methods used in 2022, based on the number of unique orders. Use is_valid = 1 to filter the data. Source tables: order_detail, payment_detail.
5. Rank these five brands by transaction value: Samsung, Apple, Sony, Huawei, Lenovo. Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.
1. Among the transactions carried out in 2021, in which month did the total transaction value (after_discount) reach its peak? Use is_valid = 1 to filter the data. Source table: order_detail.

SELECT
    to_char(order_date, 'Month') AS month_transaction,
    -- to_char formats the date as a month name; AS gives the column an alias
    SUM(after_discount) AS total_transaction
    -- the total of after_discount, aliased as total_transaction
FROM
    order_detail
    -- the table from which the data is taken
WHERE
    is_valid = 1
    -- keep only paid orders
    AND EXTRACT(YEAR FROM order_date) = 2021
    -- keep only orders placed in 2021
GROUP BY 1
    -- group by the first column in the SELECT list (month_transaction)
ORDER BY 2 DESC
    -- sort by the second column (total_transaction) from highest to lowest
LIMIT 1
    -- keep only the first row, which is the peak month
Result: August was the month with the highest total transaction value in 2021, with a total of 227,862,744.
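The peak-month query can be tried on a toy dataset with Python's sqlite3. SQLite has no to_char or EXTRACT, so strftime stands in for both, and the sample rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_detail (order_date TEXT, after_discount REAL, is_valid INTEGER)")
conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?)", [
    ("2021-07-10", 100.0, 1),
    ("2021-08-05", 300.0, 1),
    ("2021-08-20", 250.0, 1),
    ("2021-08-25", 400.0, 0),   # unpaid order, removed by is_valid = 1
    ("2022-08-01", 900.0, 1),   # wrong year, removed by the year filter
])
# strftime('%m', ...) plays the role of to_char(..., 'Month') here
row = conn.execute("""
    SELECT strftime('%m', order_date) AS month_transaction,
           SUM(after_discount)        AS total_transaction
    FROM order_detail
    WHERE is_valid = 1
      AND strftime('%Y', order_date) = '2021'
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 1
""").fetchone()
print(row)  # the peak month ('08') and its total (550.0)
```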
2. Among the transactions that occurred in 2022, which category generated the largest transaction value? Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.

SELECT
    category,
    -- the product category from sku_detail
    SUM(after_discount) AS total_transaction
    -- the total of after_discount, aliased as total_transaction
FROM
    sku_detail AS sd
    -- AS defines sd as the alias of sku_detail
    LEFT JOIN order_detail AS od ON sd.id = od.sku_id
    -- LEFT JOIN returns all records from sku_detail and the matching records
    -- from order_detail; od is the alias of order_detail
WHERE
    is_valid = 1
    AND EXTRACT(YEAR FROM order_date) = 2022
    -- keep only paid orders placed in 2022
GROUP BY 1
    -- group by category
ORDER BY 2 DESC
    -- sort by total_transaction from highest to lowest
LIMIT 1
    -- keep only the first row, which is the top category
Result: Mobiles & Tablets was the category with the highest total transaction value in 2022, with a total of 918,451,576.
3. Compare the transaction value of each category in 2021 with that in 2022. Identify which categories increased and which decreased in transaction value. Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.

-- The comparison of transactions from each category in 2021 and 2022
SELECT
    category,
    SUM(CASE WHEN to_char(order_date, 'yyyy-mm-dd')
             BETWEEN '2021-01-01' AND '2021-12-31'
             THEN od.after_discount END) AS total_sales_2021,
    -- CASE WHEN works like IF-ELSE: rows that fall within the date range
    -- contribute their after_discount value; rows outside it contribute NULL,
    -- which SUM ignores. END closes the CASE expression.
    SUM(CASE WHEN to_char(order_date, 'yyyy-mm-dd')
             BETWEEN '2022-01-01' AND '2022-12-31'
             THEN od.after_discount END) AS total_sales_2022
FROM
    order_detail AS od
    LEFT JOIN sku_detail AS sd ON sd.id = od.sku_id
    -- LEFT JOIN returns all records from order_detail and the matching records
    -- from sku_detail
WHERE
    is_valid = 1
    -- keep only paid orders
GROUP BY 1
    -- group by category
ORDER BY 2 DESC
    -- sort by total_sales_2021 from highest to lowest
-- Categories that show growth and categories that show a slump
WITH full_transaction AS (
    -- WITH defines a named subquery (a common table expression) up front,
    -- so the main query below can read from it by name
    SELECT
        category,
        SUM(CASE WHEN to_char(order_date, 'yyyy-mm-dd')
                 BETWEEN '2021-01-01' AND '2021-12-31'
                 THEN od.after_discount END) AS total_sales_2021,
        SUM(CASE WHEN to_char(order_date, 'yyyy-mm-dd')
                 BETWEEN '2022-01-01' AND '2022-12-31'
                 THEN od.after_discount END) AS total_sales_2022
        -- CASE WHEN works like IF-ELSE: rows within each date range contribute
        -- their after_discount value; rows outside it contribute NULL
    FROM
        order_detail AS od
        LEFT JOIN sku_detail AS sd ON sd.id = od.sku_id
        -- LEFT JOIN returns all records from order_detail and the matching
        -- records from sku_detail
    WHERE
        is_valid = 1
        -- keep only paid orders
    GROUP BY 1
    ORDER BY 2 DESC
)
SELECT
    full_transaction.*,
    -- the star symbol retrieves every column of full_transaction
    total_sales_2022 - total_sales_2021 AS growth_value
    -- a new column: the 2022 total minus the 2021 total
FROM
    full_transaction
ORDER BY 4 DESC
    -- sort by the fourth column (growth_value) from highest to lowest
Result: categories with a positive growth_value increased from 2021 to 2022, while a negative growth_value indicates a decrease.
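The conditional-aggregation pattern behind total_sales_2021 and total_sales_2022 can be exercised on invented data with Python's sqlite3; ISO date strings compare correctly as text, so BETWEEN behaves the same way as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (order_date TEXT, after_discount REAL,
                           is_valid INTEGER, sku_id INTEGER);
CREATE TABLE sku_detail (id INTEGER, category TEXT);
""")
conn.executemany("INSERT INTO sku_detail VALUES (?, ?)",
                 [(1, "Mobiles & Tablets"), (2, "Others")])
conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?, ?)", [
    ("2021-03-01", 100.0, 1, 1),
    ("2022-03-01", 180.0, 1, 1),   # Mobiles & Tablets grew year over year
    ("2021-06-01", 200.0, 1, 2),
    ("2022-06-01", 150.0, 1, 2),   # Others shrank year over year
])
rows = conn.execute("""
    WITH full_transaction AS (
        SELECT sd.category,
               SUM(CASE WHEN od.order_date BETWEEN '2021-01-01' AND '2021-12-31'
                        THEN od.after_discount END) AS total_sales_2021,
               SUM(CASE WHEN od.order_date BETWEEN '2022-01-01' AND '2022-12-31'
                        THEN od.after_discount END) AS total_sales_2022
        FROM order_detail AS od
        LEFT JOIN sku_detail AS sd ON sd.id = od.sku_id
        WHERE od.is_valid = 1
        GROUP BY 1
    )
    SELECT full_transaction.*,
           total_sales_2022 - total_sales_2021 AS growth_value
    FROM full_transaction
    ORDER BY 4 DESC
""").fetchall()
print(rows)  # growth_value is positive for growth, negative for a slump
```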
4. Identify the top 5 most popular payment methods used in 2022, based on the number of unique orders. Use is_valid = 1 to filter the data. Source tables: order_detail, payment_detail.

SELECT
    payment_method,
    COUNT(DISTINCT od.id) AS total_payment
    -- COUNT(DISTINCT ...) counts unique order ids: even if one order appears
    -- in several records, it is counted as a single transaction
FROM
    order_detail AS od
    LEFT JOIN payment_detail AS pd ON pd.id = od.payment_id
    -- LEFT JOIN returns all records from order_detail and the matching records
    -- from payment_detail
WHERE
    EXTRACT(YEAR FROM order_date) = 2022
    AND is_valid = 1
    -- keep only paid orders placed in 2022
GROUP BY 1
    -- group by payment method
ORDER BY 2 DESC
    -- sort by total_payment from highest to lowest
LIMIT 5
    -- keep the five most used payment methods
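The effect of COUNT(DISTINCT ...) can be seen on a toy dataset with Python's sqlite3 (strftime stands in for EXTRACT; the rows are invented, with one order deliberately split across two records):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (id INTEGER, order_date TEXT,
                           is_valid INTEGER, payment_id INTEGER);
CREATE TABLE payment_detail (id INTEGER, payment_method TEXT);
""")
conn.executemany("INSERT INTO payment_detail VALUES (?, ?)",
                 [(1, "cod"), (2, "ewallet")])
# order id 10 appears twice (two lines of one order); DISTINCT counts it once
conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?, ?)", [
    (10, "2022-01-05", 1, 1),
    (10, "2022-01-05", 1, 1),
    (11, "2022-02-09", 1, 1),
    (12, "2022-03-02", 1, 2),
])
rows = conn.execute("""
    SELECT pd.payment_method,
           COUNT(DISTINCT od.id) AS total_payment
    FROM order_detail AS od
    LEFT JOIN payment_detail AS pd ON pd.id = od.payment_id
    WHERE strftime('%Y', od.order_date) = '2022'
      AND od.is_valid = 1
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 5
""").fetchall()
print(rows)  # cod counts 2 unique orders, not 3 records
```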
5. Rank these five brands by transaction value: Samsung, Apple, Sony, Huawei, Lenovo. Use is_valid = 1 to filter the data. Source tables: order_detail, sku_detail.

WITH full_transaction AS (
    SELECT
        CASE
            WHEN sku_name LIKE '%samsung%' THEN 'Samsung'
            WHEN sku_name LIKE '%apple%' THEN 'Apple'
            WHEN sku_name LIKE '%iphone%' THEN 'Apple'
            WHEN sku_name LIKE '%imac%' THEN 'Apple'
            WHEN sku_name LIKE '%macbook%' THEN 'Apple'
            WHEN sku_name LIKE '%sony%' THEN 'Sony'
            WHEN sku_name LIKE '%huawei%' THEN 'Huawei'
            WHEN sku_name LIKE '%lenovo%' THEN 'Lenovo'
        END AS product_name,
        SUM(after_discount) AS total_sales
    FROM
        order_detail AS od
        LEFT JOIN sku_detail AS sd ON sd.id = od.sku_id
    WHERE
        to_char(order_date, 'yyyy-mm-dd') BETWEEN '2022-01-01' AND '2022-12-31'
        AND is_valid = 1
    GROUP BY 1
)
SELECT
    full_transaction.*
FROM
    full_transaction
WHERE
    product_name IS NOT NULL
ORDER BY 2 DESC
Syntax Explanation

In the CTE, the CASE expression classifies each row by brand. LIKE means "matches the pattern": the % wildcard on both sides of samsung matches any sku_name that contains the word "samsung", and those rows are labelled 'Samsung'. Products that match none of the patterns receive a NULL product_name. SUM(after_discount) then totals the sales per brand, and the date filter keeps only the paid transactions of 2022.

The main query retrieves every column of full_transaction (the star symbol), drops the rows whose product_name is NULL, and sorts by the second column (total_sales) from highest to lowest.
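The brand-bucketing CASE ... LIKE pattern can be tried on invented rows with Python's sqlite3. One caveat: SQLite's LIKE is case-insensitive for ASCII by default, so '%samsung%' matches "Samsung ..." here, whereas a database with case-sensitive LIKE (e.g. PostgreSQL) would need ILIKE or lower(sku_name) for the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (sku_id INTEGER, after_discount REAL, is_valid INTEGER);
CREATE TABLE sku_detail (id INTEGER, sku_name TEXT);
""")
conn.executemany("INSERT INTO sku_detail VALUES (?, ?)", [
    (1, "Samsung Galaxy S21"),
    (2, "Apple iPhone 13"),
    (3, "MacBook Air M1"),
    (4, "Sony WH-1000XM4"),
    (5, "Generic USB cable"),   # matches no brand, so product_name stays NULL
])
conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?)", [
    (1, 500.0, 1), (2, 800.0, 1), (3, 900.0, 1), (4, 300.0, 1), (5, 5.0, 1),
])
rows = conn.execute("""
    WITH full_transaction AS (
        SELECT CASE
                 WHEN sku_name LIKE '%samsung%' THEN 'Samsung'
                 WHEN sku_name LIKE '%apple%'   THEN 'Apple'
                 WHEN sku_name LIKE '%iphone%'  THEN 'Apple'
                 WHEN sku_name LIKE '%imac%'    THEN 'Apple'
                 WHEN sku_name LIKE '%macbook%' THEN 'Apple'
                 WHEN sku_name LIKE '%sony%'    THEN 'Sony'
                 WHEN sku_name LIKE '%huawei%'  THEN 'Huawei'
                 WHEN sku_name LIKE '%lenovo%'  THEN 'Lenovo'
               END AS product_name,
               SUM(after_discount) AS total_sales
        FROM order_detail AS od
        LEFT JOIN sku_detail AS sd ON sd.id = od.sku_id
        WHERE od.is_valid = 1
        GROUP BY 1
    )
    SELECT * FROM full_transaction
    WHERE product_name IS NOT NULL
    ORDER BY 2 DESC
""").fetchall()
print(rows)  # iPhone and MacBook sales are folded into the Apple bucket
```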