1. This course is prepared under the Erasmus+ KA-210-YOU Project titled
«Skilling Youth for the Next Generation Air Transport Management»
Machine Learning
Applications in Aviation
Association Analysis
Asst. Prof. Dr. Emircan Özdemir
Eskişehir Technical University
2. • In this chapter, we will discover a powerful method for uncovering meaningful
relationships among variables within aviation data.
• Association Analysis is a pivotal technique designed to reveal hidden patterns and
connections within datasets. It enables us to identify associations or correlations among
variables, contributing to a deeper understanding of complex relationships.
• Association analysis plays a crucial role in aviation by facilitating the discovery of valuable
patterns. Whether it's understanding passenger behavior, optimizing maintenance
processes, or enhancing in-flight services, association analysis provides the tools to
uncover intricate connections.
Introduction
3. • Apriori Algorithm
This classic algorithm identifies frequent itemsets in a dataset. It employs a "bottom-up"
approach, starting with individual items and progressively extending to larger itemsets.
Apriori uses support, confidence, and lift metrics to filter meaningful associations. For
instance, in aviation, it might unveil patterns like frequent co-occurrences of specific
maintenance tasks or services co-purchased by passengers.
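The bottom-up search described above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function name apriori, the sample in-flight purchase transactions, and the 0.6 minimum support are all assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Share of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: start with the individual items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]): support(frozenset([i])) for i in items}
    current = {c: s for c, s in current.items() if s >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c: support(c) for c in candidates}
        current = {c: s for c, s in current.items() if s >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical in-flight purchases, one set per passenger transaction.
transactions = [
    {"coffee", "sandwich"},
    {"coffee", "sandwich", "water"},
    {"coffee", "water"},
    {"sandwich", "water"},
    {"coffee", "sandwich"},
]
frequent_itemsets = apriori(transactions, min_support=0.6)
for itemset, s in sorted(frequent_itemsets.items(), key=lambda p: -p[1]):
    print(sorted(itemset), s)
```

On this toy data the three single items and the pair coffee + sandwich pass the threshold, while the pairs that include water do not.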
Types of Association Rules
Source: https://towardsdatascience.com/underrated-machine-learning-algorithms-apriori-1b1d7a8b7bc
Source: https://www.researchgate.net/publication/8052048_Knowledge_discovery_in_databases_of_biomechanical_variables_Application_to_the_sit_to_stand_motor_task/figures?lo=1
4. • Apriori Algorithm
For this algorithm, the dataset should be in a horizontal format (one row per
transaction), as in the example on the right.
Support, Confidence, and Lift measures help assess the
significance and reliability of association rules discovered by the
Apriori algorithm. Higher support indicates more frequent
itemsets, higher confidence indicates stronger rule associations,
and lift measures the strength of the relationship beyond what
would be expected by chance.
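As a rough illustration of the transaction-style input that Apriori expects, the sketch below stores one row per transaction and then derives a true/false table with one column per item. The maintenance task names are hypothetical and only serve as sample data.

```python
# Hypothetical maintenance work orders; each inner list is one transaction.
transactions = [
    ["brake_inspection", "tire_change"],
    ["brake_inspection", "tire_change", "oil_check"],
    ["oil_check"],
]

# Collect the item vocabulary in a stable (alphabetical) order.
items = sorted({item for t in transactions for item in t})

# One row per transaction, one True/False column per item.
one_hot = [{item: (item in t) for item in items} for t in transactions]

for row in one_hot:
    print(row)
```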
5. • Apriori Algorithm
Support measures the frequency of occurrence of an itemset in the dataset. It is calculated
as the number of transactions containing the itemset divided by the total number of
transactions.
Confidence quantifies how often a rule is found to be true. It is calculated as the number of
transactions containing both the antecedent and consequent of the rule divided by the
number of transactions containing the antecedent.
Lift measures how much more likely the consequent is to be true given that the antecedent
is true, compared to when the antecedent and consequent are independent. It is calculated
as the confidence of the rule divided by the support of the consequent.
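The three definitions above translate directly into a few lines of Python. The sketch below is illustrative only: the helper names and the sample purchase transactions are assumptions, and the rule examined is coffee -> sandwich.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions)

def support(transactions, itemset):
    # Transactions containing the itemset / total number of transactions.
    return count(transactions, itemset) / len(transactions)

def confidence(transactions, antecedent, consequent):
    # Transactions containing both sides / transactions containing the antecedent.
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    # Confidence of the rule / support of the consequent.
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))

# Hypothetical in-flight purchases, one set per passenger transaction.
transactions = [
    {"coffee", "sandwich"},
    {"coffee", "sandwich", "water"},
    {"coffee", "water"},
    {"sandwich", "water"},
    {"coffee", "sandwich"},
]

rule_support = support(transactions, {"coffee", "sandwich"})         # 3 of 5 transactions
rule_confidence = confidence(transactions, {"coffee"}, {"sandwich"})  # 3 of 4 coffee transactions
rule_lift = lift(transactions, {"coffee"}, {"sandwich"})
print(rule_support, rule_confidence, rule_lift)
```

Here the lift comes out slightly below 1, meaning that in this toy data a sandwich is bought a little less often alongside coffee than independence would suggest, even though the confidence looks high.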
6. • Eclat Algorithm
Eclat stands for "Equivalence Class Clustering and Bottom-Up Lattice Traversal." It focuses
on finding frequent itemsets by utilizing a depth-first search strategy. Eclat efficiently
discovers associations without generating candidate itemsets explicitly, making it valuable in
scenarios where memory efficiency is critical. In aviation, Eclat could uncover associations
in flight data, such as recurring patterns of in-flight incidents.
This algorithm uses a vertical data format instead of the horizontal data format used by
the Apriori algorithm. The dataset should therefore be formatted with the items in the
first column, and each item is associated with the transactions that contain it.
Eclat provides only support measures, not lift or confidence.
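The vertical format and the depth-first search can be sketched as follows. This is a simplified illustration, not a reference implementation: the function names, the hypothetical in-flight event records, and the 0.6 minimum support are assumptions. Each item is mapped to the set of transaction ids that contain it, and the support of a larger itemset is obtained by intersecting those sets.

```python
def to_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) containing it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical

def eclat(transactions, min_support):
    """Return {tuple(itemset): support} using tidset intersections."""
    n = len(transactions)
    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that may extend the current prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids) / n
            # Depth-first step: intersect with the tidsets of the later items.
            deeper = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) / n >= min_support:
                    deeper.append((other, shared))
            extend(itemset, deeper)

    vertical = to_vertical(transactions)
    start = sorted(((item, tids) for item, tids in vertical.items()
                    if len(tids) / n >= min_support), key=lambda p: p[0])
    extend((), start)
    return frequent

# Hypothetical in-flight event records, one set per flight.
flights = [
    {"turbulence", "service_pause"},
    {"turbulence", "service_pause", "diversion"},
    {"turbulence", "diversion"},
    {"service_pause", "diversion"},
    {"turbulence", "service_pause"},
]
vertical = to_vertical(flights)
frequent = eclat(flights, min_support=0.6)
print(vertical["diversion"])
print(frequent)
```

Only support values are produced, which matches the remark above that Eclat does not report lift or confidence.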
7. • FP-Growth Algorithm
FP-Growth, or Frequent Pattern Growth, is a tree-based algorithm that avoids the explicit
generation of candidate itemsets. It builds a compact data structure, the FP-tree, to
efficiently mine frequent itemsets. This approach is particularly useful in aviation for
handling large datasets, such as identifying recurrent patterns in aircraft performance or
passenger behavior.
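To make the FP-tree idea more concrete, the sketch below builds the tree and reads off the prefix paths of one item. It covers only the tree construction step, not the full recursive mining, and the class name, function names, sample transactions, and minimum count of 3 are assumptions made for illustration.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Pass 1: count the items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_count}

    root = FPNode(None, None)
    header = {item: [] for item in frequent}  # item -> nodes holding that item

    # Pass 2: insert each transaction, items ordered by descending frequency,
    # so that transactions with common items share a prefix in the tree.
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header

def prefix_paths(header, item):
    """Ancestor path and count for every tree node that holds the item."""
    paths = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        paths.append((path[::-1], node.count))
    return paths

# Hypothetical in-flight purchases, one set per passenger transaction.
transactions = [
    {"coffee", "sandwich"},
    {"coffee", "sandwich", "water"},
    {"coffee", "water"},
    {"sandwich", "water"},
    {"coffee", "sandwich"},
]
root, header = build_fp_tree(transactions, min_count=3)
print(prefix_paths(header, "water"))
```

The dataset is scanned only twice, and no candidate itemsets are generated; a full FP-Growth implementation would go on to mine such prefix paths recursively.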
Source: https://www.javatpoint.com/fp-growth-algorithm-in-data-mining
8. • Maintenance Item Co-Occurrences
Association analysis is applied to aviation maintenance data to uncover co-occurrences of
maintenance items. By identifying patterns of items that tend to be addressed together,
airlines can optimize maintenance processes, streamline inventory management, and
improve overall operational efficiency.
• In-Flight Purchasing Patterns
Association rules play a vital role in revealing patterns of in-flight purchasing behavior.
Analyzing transactions and correlating the items passengers tend to purchase together
enables airlines to tailor in-flight services, enhance the passenger experience, and optimize
onboard sales strategies.
Association Analysis Use Cases in Aviation
9. • Baggage Handling Optimization
Association analysis is employed in baggage handling processes to identify associations
and patterns related to baggage items. This optimization helps airlines enhance baggage
handling efficiency, reduce errors, and improve the overall reliability of baggage services,
contributing to a smoother travel experience for passengers.
• Customer Loyalty Programs Optimization
Association analysis is applied to customer data, particularly related to loyalty program
interactions. By identifying associations between various aspects of customer behavior,
airlines can optimize loyalty programs, tailor rewards, and improve customer retention.
10. • Personalized Marketing Campaigns
Association rules help uncover associations between passenger profiles, preferences, and
response to marketing campaigns. Airlines can use this information to create personalized
marketing strategies, leading to more targeted and effective promotional efforts.
• Flight Bundling Strategies
Association analysis assists in understanding patterns of passenger choices regarding
bundled services or flight options. Airlines can leverage this insight to design strategic
bundling offers, enhancing the attractiveness of specific services or routes and optimizing
revenue streams.
11. • Support and Confidence Thresholds
Setting appropriate support and confidence thresholds is crucial in association analysis.
Support indicates the frequency of an itemset in the dataset, and confidence measures the
reliability of the association rule. Striking the right balance ensures that discovered
associations are meaningful and actionable.
Thresholds that are too low may produce numerous, less meaningful associations, while
thresholds that are too high may cause valuable patterns to be overlooked.
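The effect of the support threshold can be seen on a small example. The sketch below counts how many itemsets survive at several minimum support values; the brute-force enumeration, the function name, and the sample transactions are assumptions chosen for brevity and would not scale to real aviation data.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all itemsets meeting the support threshold."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = sum(set(combo) <= t for t in transactions) / n
            if s >= min_support:
                result[combo] = s
    return result

# Hypothetical in-flight purchases, one set per passenger transaction.
transactions = [
    {"coffee", "sandwich"},
    {"coffee", "sandwich", "water"},
    {"coffee", "water"},
    {"sandwich", "water"},
    {"coffee", "sandwich"},
]

# Number of frequent itemsets found at each minimum support threshold.
counts = {threshold: len(frequent_itemsets(transactions, threshold))
          for threshold in (0.2, 0.4, 0.6, 0.8)}
print(counts)
```

Even with three items the number of reported itemsets drops from seven to two as the threshold rises from 0.2 to 0.8, which is the trade-off described above.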
Considerations and Challenges in Association
Analysis in Aviation
12. • Handling Large Datasets
The aviation industry often deals with vast datasets containing diverse information.
Efficiently handling large datasets is a challenge in association analysis. Algorithms need to
scale effectively to process and mine associations from extensive data.
This consideration emphasizes the need for robust algorithms and computational resources
to derive meaningful insights from substantial aviation datasets.
13. • Data Preprocessing
Effective data preprocessing is paramount in association analysis. Cleaning and preparing
data involve handling missing values, removing duplicates, and ensuring data quality. This
ensures that association rules are derived from reliable and accurate datasets, leading to
more meaningful insights in aviation analytics.
• Interpretation of Rules
Interpreting and validating association rules is a critical best practice. Understanding the
implications of discovered rules and ensuring they align with domain knowledge is
essential. This involves considering the context of the aviation industry and validating
whether the identified associations make sense. Clear interpretation ensures actionable
insights and informed decision-making based on association analysis results.
Best Practices for Association Analysis in Aviation
14. • In RapidMiner, using the Repository window, follow
the path Training Resources > Unsupervised >
Associations and open the Hotel App Association
solution process.
• In this example, co-occurrences are analyzed
to define association rules related to
customer loyalty.
• There is no label attribute; several attributes
related to customer loyalty are taken into
account to create the association rules.
• Therefore, association analysis is chosen to reach
this goal.
RapidMiner Example on Association Analysis
15. • The process window contains a data import (ETL) operator, data preprocessing
operators (discretize and binominal transformation), the FP-Growth model operator, and
the Create Association Rules operator. The ETL operator contains several suboperators
that prepare the main dataset. The minimum support parameter was set to 0.6 in the
parameters window of the model operator.
16. • For association analysis, the dataset was transformed into a binominal structure. As
you can see below, all attribute values were recoded as true/false (1/0).
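Outside RapidMiner, the same two preprocessing steps (discretize, then recode as true/false) can be approximated in plain Python. The sketch below is only an analogue of the operators mentioned above: the attribute names, bin edges, and labels are hypothetical and are not taken from the Hotel App dataset.

```python
# Hypothetical customer records; attribute names and values are illustrative.
customers = [
    {"stays_per_year": 1, "uses_app": True, "spend": 120.0},
    {"stays_per_year": 6, "uses_app": True, "spend": 900.0},
    {"stays_per_year": 3, "uses_app": False, "spend": 450.0},
]

LEVELS = ("low", "medium", "high")

def discretize(value, edges):
    """Return the label of the first bin whose upper edge is >= value."""
    for edge, label in zip(edges, LEVELS):
        if value <= edge:
            return label
    return LEVELS[-1]  # above the last edge

def to_binominal(record):
    """Recode one record as true/false columns, one per attribute=value pair."""
    stays = discretize(record["stays_per_year"], edges=[2, 4])
    spend = discretize(record["spend"], edges=[200.0, 600.0])
    row = {"uses_app": record["uses_app"]}
    for level in LEVELS:
        row[f"stays={level}"] = (stays == level)
        row[f"spend={level}"] = (spend == level)
    return row

binominal_rows = [to_binominal(c) for c in customers]
for row in binominal_rows:
    print(row)
```

Each resulting row can then be read as a transaction whose items are the columns set to true.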
17. • After you run the model, you can see the frequent itemsets. This list shows the most
frequent co-occurrences in your dataset, together with the support value of each itemset.
18. • In the results, you can also
find the association rules.
• For each rule, you can find the
support, confidence, and lift
values.
• You can also adjust the
minimum confidence
threshold to filter the
association rules list.
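The step from frequent itemsets to rules, and the effect of the minimum confidence setting, can be sketched in Python as follows. This is an illustration under assumptions and not RapidMiner's implementation: the function name, the item names, and the itemset counts are hypothetical.

```python
from itertools import combinations

def association_rules(itemset_counts, n_transactions, min_confidence):
    """Derive rules A -> B from a {frozenset: count} table of frequent itemsets.

    Assumes every subset of a listed itemset is also present in the table,
    which holds for the output of a frequent itemset miner.
    """
    rules = []
    for itemset, both in itemset_counts.items():
        if len(itemset) < 2:
            continue
        for k in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), k)):
                consequent = itemset - antecedent
                conf = both / itemset_counts[antecedent]
                if conf >= min_confidence:
                    sup = both / n_transactions
                    lift = conf / (itemset_counts[consequent] / n_transactions)
                    rules.append({"antecedent": set(antecedent),
                                  "consequent": set(consequent),
                                  "support": sup,
                                  "confidence": conf,
                                  "lift": lift})
    # Highest confidence first, as in a sorted results table.
    return sorted(rules, key=lambda r: r["confidence"], reverse=True)

# Hypothetical counts from 5 customer records.
itemset_counts = {
    frozenset({"app_user"}): 4,
    frozenset({"loyalty_member"}): 3,
    frozenset({"app_user", "loyalty_member"}): 3,
}

loose = association_rules(itemset_counts, 5, min_confidence=0.7)
strict = association_rules(itemset_counts, 5, min_confidence=0.8)
print(len(loose), len(strict))
```

Raising the minimum confidence from 0.7 to 0.8 removes the rule app_user -> loyalty_member (confidence 3/4) and keeps loyalty_member -> app_user (confidence 3/3).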
19. For the best interpretation of results:
• Prioritize association rules with a combination of high support, lift, and confidence. This
ensures that the identified patterns are both prevalent in the dataset and have a strong
influence or predictability.
• Use a threshold approach: Set minimum values for support, lift, and confidence based on
the specific goals of your analysis and the characteristics of your dataset.
• Consider domain knowledge: While quantitative metrics are essential, qualitative insights
from domain experts can enhance the interpretation of association rules and guide
practical applications.
Note: Remember that the interpretation and application of support, lift, and confidence
depend on the specific goals and context of your analysis.
Conclusion
20. • Association Analysis in aviation helps to uncover relationships among variables for
pattern discovery.
• Although association analysis is not traditionally linked to aviation, its principles
apply across many industries, aviation included, to extract valuable insights.
• Association analysis in aviation can contribute to improved safety, operational efficiency,
customer satisfaction, and overall optimization across various aspects of the industry.
• Applying data mining techniques to aviation data can uncover valuable insights that
support informed decision-making and continuous improvement.