This document analyzes three years of sales data for an online retail company using tools such as Microsoft Azure, Hadoop, Hive, and Spark. It finds that Spark queries ran faster than Hive queries. It also applies the Apriori algorithm to derive association rules among product subcategories, containers, and shipping modes. Graphs show sales trends by province and customer segment. However, no correlation was found between holidays and increased sales. The analysis aims to help the company increase yearly revenue by predicting demand and optimizing pricing.
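
To make the association-rule step concrete, the following is a minimal pure-Python sketch of Apriori. It illustrates the general technique and is not the implementation used in the analysis. The function names (`apriori`, `association_rules`), the item labels, the toy orders, and the thresholds are all hypothetical and chosen only for illustration; none of them come from the company's data.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every individual item seen in the data.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # Keep only candidates whose support meets the threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for basket in baskets if cand <= basket) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: build (k+1)-itemsets from pairs of frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            cand for cand in joined
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                lhs = frozenset(antecedent)
                # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical orders: each set mixes a subcategory, a container, and a ship mode.
orders = [
    {"Paper", "Small Box", "Regular Air"},
    {"Paper", "Small Box", "Regular Air"},
    {"Paper", "Wrap Bag", "Regular Air"},
    {"Tables", "Jumbo Box", "Delivery Truck"},
    {"Tables", "Jumbo Box", "Delivery Truck"},
]

frequent_itemsets = apriori(orders, min_support=0.4)
for lhs, rhs, support, confidence in association_rules(frequent_itemsets, 0.8):
    print(sorted(lhs), "->", sorted(rhs), f"support={support:.2f}", f"confidence={confidence:.2f}")
```

Treating each order as a set of labels lets one routine relate subcategories, containers, and shipping modes at the same time. On data of the size described here, a distributed or library implementation would likely be preferable to this sketch.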