Airline passenger profiling based on fuzzy deep machine learning

Airline Passenger Profiling
Based on
Fuzzy Deep Machine Learning
by Ayman Qaddumi

Profiling History
• Passenger profiling plays a vital part of commercial
aviation security
• In 1996, Northwest Airlines developed the first
computer airline passenger profiling system, named
which is a rule-based screening system utilizing
about 39 pieces of pre-boarding data to identify
high-risk passengers

Profiling History
• After the 9/11 attacks, the Transportation Security
Administration (TSA) proposed CAPPS II
• CAPPS II utilized an additional set of attributes
(e.g., income and demography) retrieved from
governmental databases to calculate a risk score
for each passenger.
• Based on their risk scores, passengers are labeled
as “green”, “yellow”, or “red” security risks and then
subject to correspondingly intrusive scrutiny

Profiling History
• In 2004, the TSA replaced CAPPS II with a next-
generation system “Secure Flight”, which improves the
selection process through a terrorist screening database
• The new system shifted responsibility of comparing
passenger identification with sensitive government data
from the privatized airlines to the federal government
• Since 2010, the TSA has required airlines to share pre-
boarding data with the government to perform centralized
watch list matching.

Problems with
Current Profiling Systems
• Given that normal passengers constitute a much larger fraction,
a small imperfection in classifying them will result in a large
number of false positives due to the base rate fallacy
• In a profiling system with 99% accuracy, the case of identifying
the 11 hijackers of 9/11 in a population of 300 million, would
result in 3 million of those identified as False Positives!.
• Manipulating recorded data by malicious intent can throw off the
profiler as well
• So in general, classical passenger profiling methods are
inefficient in handling the rapidly increasing amounts of
electronic records

Recommended Solution
• Emerging deep learning models combined with highly parallel
computing have exhibited promising performance for feature
exaction and abstraction, but their applications in aviation security
management have rarely been reported.
• Although ‘Fuzzy set theory’ provides a powerful tool for handling
imperfect data and has been applied in risk assessment, profiling
methods based on ‘fuzzy profiling’ are few.
• This paper proposes a deep learning approach to passenger
profiling.
• The center of the approach is a Pythagorean fuzzy deep
Boltzmann machine (PFDBM),

PFDBM
• For training the PFDBM the authors propose a
hybrid algorithm that combines a gradient-based
method and an evolutionary algorithm
• Based on the novel learning model, they develop a
deep neural network (DNN) for classifying normal
passengers and potential attackers.
• They claim their approach provides much higher
learning ability and classification accuracy than
existing profilers

Boltzmann Machines
• Introduced in 1983 as a basic connectionist model
using stochastic networks
• Units have one of two states(binary)
• Bidirectional connections (No arrows)
• Probabilistic state transition

Pythagorean fuzzy set
• Any two items are usually compared by a user as either a ‘match’ or a
‘mismatch’
• In fuzzy logic, the range between a perfect match & a perfect mismatch is
extended. For instance, [0,1] set can be subdivided into [0,0.25,0.5,0.75,1]
• Membership into an extended subset can be expressed in terms of a r(x),
strength of membership and d(x), direction of commitment. (Intuitionistic
Fuzzy Set)
• PFS which is more general than IFS in that the sum of squares of the
membership degree and the nonmembership degree is less than or equal to
1.
• Fuzzified neural networks can handle inputs with fuzzy (labeled) and/or
incomplete features, which are inevitable in passenger profilers

PFDBM Training
• For training the PFDBM to meet the high accuracy
requirements of tasks, such as passenger profiling,
the authors propose a hybrid algorithm integrating
the greedy layer wise training method with
biogeography-based optimization (BBO)

FUZZY DNN FOR
PASSENGER PROFILING
PFDBM-Based DNN for identifying the risk of each individual
passenger. The input to the DNN includes the following.
1) The passenger name record, including the identity
information of the passenger and booking information of the
flight.
2) Travel statistics of the passenger’s flight history from the
Aviation Administration (collected from different airlines).
3) Travel statistics from other transportation methods, such as
railway and marine.

DNN Inputs (continued)
• 4) Travel statistics from the Tourism Administration (collected from travel agencies).
• 5) Criminal records from the Public Security Department.
• 6) Educational records from the Education Department.
• 7) Tax records from the Tax Department.
• 8) Reported information from banks and the Housing Administration.
• 9) Consumption records or patterns from large retailers (the detailed records have
typically been preprocessed, summarized, and/or mined by retailers’ systems).
• 10) (Preprocessed) telecommunication behavior records or patterns from telecom
operators.
• 11) (Authenticated and preprocessed) Internet behavior records or patterns from
Internet operators.

Input Classes Divisions
• Based on their data types, the input features can be divided into the following
four classes.
1. Static binary-valued features, which can be input to the PFDBM directly.
2. Static real-valued features, which are first transformed into binary values
using a Gaussian-Bernoulli RBM (GRBM) and then input to the PFDBM.
GRBM is typically effective for image pixel binary classification
3. Static labeled features, which are first transformed into real values using
membership functions of fuzzy sets defined on the domain of the labels, and
then transformed into binary values using GRBM.
4. Dynamic (temporal) features, which are first transformed into real values
using an additional temporal filter, and then transformed into binary values
using GRBM.

Structure of the
PFDBM-based DNN

PFDBM-based DNN
• The GMM is trained with the expectation maximization algorithm. The higher the output value, the higher the
likelihood of the passenger being an attacker is. The classification is made using a predefined threshold τ . The
claim that the passenger is a (potential) attacker is accepted if z ≥ τ and rejected otherwise. Here, τ can be
tuned according to the inspection capability and the current pressure of terrorism, e.g., it can be turned down after
the airport receives a threatening phone call.
• Thus, the DNN has two preprocessing layers, the PFDBM, and the GMM, as shown in Fig. 6. In our current
implementation, the PFDBM has four layers, and the dimension of the input to its first layer is about 7000 (which
is expected to increase when more external information is available).

Conclusion
• The authors believe that our PFDBM model can be
adapted and applied to many other deep learning tasks
in computer science and management science
• Currently, they are developing a multigrade DNN that
integrates a low-grade DNN that only uses nonsensitive
features. If that passes a threshold, the high-grade model
is used
• This paper does not study the impacts of the passenger
profiling on privacy, which we believe should be seriously
addressed by relevant laws and regulations.

References
• Yu-Jun Zheng, Wei-Guo Sheng, Xing-Ming Sun, and Sheng-Yong Chen,
“Airline Passenger Profiling Based on Fuzzy Deep Machine
Learning”,IEEE TRANSACTIONS ON NEURAL NETWORKS AND
LEARNING SYSTEMS
• H. Cavusoglu, Y. Kwark, B. Mai, and S. Raghunathan, “Passenger profiling
and screening for aviation security in the presence of strategic attackers,”
Decision Anal., vol. 10, no. 1, pp. 63–81, 2013.

Airline passenger profiling based on fuzzy deep machine learning

More Related Content

What's hot

Viewers also liked

Similar to Airline passenger profiling based on fuzzy deep machine learning

Recently uploaded

Airline passenger profiling based on fuzzy deep machine learning