Credit Card Fraudulent Transactions
Amy Wong, Daisy Tan, Alejandro Ordaz-Perez, Daniel Thach, Christopher Serrano
CIS 4200-02: Business Intelligence and Data Warehouse
Dr. Lusi Li
May 16, 2024
Introduction
In the digital age, the prevalence of credit card fraud has emerged as a significant challenge for
consumers, businesses, and financial institutions worldwide. With the proliferation of online
shopping and electronic transactions, the risk of fraudulent activities has increased, necessitating
robust detection and prevention strategies. This study aims to explore the patterns and trends
associated with credit card fraudulent transactions, leveraging data analytics to uncover insights
that can inform better security measures.
The data set analyzed in this study encompasses a diverse range of transactions across various
categories, including shopping, groceries, entertainment, and gas, spanning the years 2019 to
2020 in the United States. By examining the distribution of fraudulent activities across these
categories, we seek to identify which types of transactions are most susceptible to fraud.
Additionally, we investigate temporal patterns, such as specific days of the week or times of the
day when fraud is more likely to occur, to understand the behavior of fraudsters and the timing of
their activities.
Furthermore, this study delves into the clustering of fraudulent transactions based on consumer
names and other identifying information, providing a granular view of the characteristics and
behaviors associated with high-risk individuals. Seasonal variations in fraudulent activities are
also examined to ascertain if certain periods, such as holidays or major events, see a surge in
fraud incidents.
The findings from this research aim to enhance the understanding of credit card fraud dynamics,
contributing to the development of more effective fraud detection systems and preventive
measures. By leveraging advanced data analysis techniques, this study provides valuable insights
that can aid stakeholders in mitigating the impact of credit card fraud and safeguarding financial
transactions.
Analysis Questions
1. What is the distribution of fraudulent transactions across different categories, such as
shopping, groceries, entertainment, and gas?
Across the categories, in-person grocery transactions took the lead, only 1% above general online
shopping, which accounted for 22% of all credit card transactions detected as fraudulent.
However, the fraudulent transactions total only 595, a small fraction of the total number of
transactions in the dataset.
Online shopping likely ranks high because entering card data on various sites with questionable
security carries more risk. In the grocery shopping case, several causes are possible, such as
skimmers and phishing interception attempts while customers are making purchases.
● Dimensions: Categories; Fraud transactions; filter: 2019-2020 year, in America
● Measures: Sum of Fraud for credit card transactions in America by category
When put into a visual representation, online groceries and travel have some of the fewest fraud
cases detected. Surprisingly, entertainment has relatively little fraudulent activity, while gas,
with about 9% of the total fraudulent activity, might be concerning. There is a large gap between
the top 2 categories for detected fraud and the rest. It would be interesting to look at other
dimensions to draw correlations, since the distribution alone does not explain why online
shopping and in-person groceries have such large leads.
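The category distribution described above can be sketched in a few lines of Python. This is a minimal sketch rather than the method used in the report: the field names `category` and `is_fraud` are assumed, and the rows are toy values, not figures from the dataset.

```python
from collections import Counter


def fraud_share_by_category(rows):
    """Return {category: percent of all fraudulent rows}, largest share first."""
    counts = Counter(r["category"] for r in rows if r["is_fraud"] == 1)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}


# Toy rows for illustration only; these are not values from the report's dataset.
sample = [
    {"category": "grocery_pos", "is_fraud": 1},
    {"category": "grocery_pos", "is_fraud": 1},
    {"category": "shopping_net", "is_fraud": 1},
    {"category": "entertainment", "is_fraud": 0},
    {"category": "entertainment", "is_fraud": 1},
]
shares = fraud_share_by_category(sample)
print(shares)
```

Dividing by the number of fraudulent rows gives each category's share of all fraud; dividing by all rows in a category instead would give a per-category fraud rate, which answers a different question.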
2. Are there any specific patterns or trends in the timing of fraudulent transactions, such as
specific days of the week or times during the day when they are more likely to occur?
To look for a pattern in the dataset, we converted each transaction date to the day of the week on
which the fraud was detected and then created a pivot table. The pivot table alone does not give a
good visualization, so we plotted the frequencies as a bar chart, which shows that fraudulent
activity was evenly distributed across the days of the week. This gives us two assumptions to work
from: people conducting fraudulent activity have no preference for the day on which they run their
scams, and their effectiveness is not hindered on any particular day, if these cases are performed
by a group of people with a similar mindset. We would therefore have to look at other dimensions
to deduce any particular trends, as the days of the week give us nothing more than conjecture.
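The day-of-week conversion and pivot can be approximated as follows. This is a sketch under assumptions: the field names `timestamp` and `is_fraud` and the timestamp format are hypothetical, and the rows are toy values.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def fraud_by_weekday(rows, fmt="%Y-%m-%d %H:%M:%S"):
    """Count fraudulent rows per weekday, keeping days with zero frauds."""
    counts = Counter()
    for r in rows:
        if r["is_fraud"] == 1:
            day_index = datetime.strptime(r["timestamp"], fmt).weekday()  # Monday == 0
            counts[WEEKDAYS[day_index]] += 1
    return {day: counts.get(day, 0) for day in WEEKDAYS}


# Toy rows for illustration only; not taken from the report's dataset.
sample = [
    {"timestamp": "2019-01-07 10:15:00", "is_fraud": 1},  # a Monday
    {"timestamp": "2019-01-07 23:40:00", "is_fraud": 0},
    {"timestamp": "2019-01-12 18:05:00", "is_fraud": 1},  # a Saturday
    {"timestamp": "2019-01-14 02:30:00", "is_fraud": 1},  # a Monday
]
weekday_counts = fraud_by_weekday(sample)
print(weekday_counts)
```

Keeping the zero-count days in the result makes an even or uneven spread across the week easier to see in a bar chart.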
Another method is to organize the frauds by the time of day at which they were detected. Each
transaction time can be assigned to an interval for easier comprehension, so we created a VLOOKUP
table for the intervals. The cut-off between night and morning was hard to discern, but after
some research we assumed that morning is when people are generally active and eat “breakfast.”
We then created a table to calculate the number of frauds during each interval; surprisingly, it
gives us more to work with. The result is best displayed as a line graph.
We can see that fraudulent activity starts rising around noon, peaks in the evening, and falls
back down at night. This shows that the targeted hours are often when people are at work or going
to sleep. This could mean that the culprits either live in similar time zones or live in a time
zone that lets them take advantage of this.
● Dimensions: Time of day based on the V-lookup table in the previous image, frequency
of fraudulent transaction
● Measures: Fraudulent transaction throughout the 24-hour cycle by sum of time and
instance
Of course, without descriptive context on how these fraudulent transactions occurred, it is
difficult to determine whether these are miscellaneous attempts by the cardholders themselves or,
giving them the benefit of the doubt, transactions that were returned as fraud due to invalid
funds or credit.
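The interval lookup for time of day works like an approximate-match VLOOKUP and can be sketched with a sorted list of lower bounds. The cut-off hours and interval labels below are assumptions chosen for illustration; the report does not state its exact boundaries.

```python
from bisect import bisect_right
from collections import Counter

# Lower bound (hour) of each interval. These cut-offs are assumed, not the report's.
BOUNDS = [0, 6, 12, 18]
LABELS = ["night", "morning", "afternoon", "evening"]


def time_bucket(hour):
    """Map an hour (0-23) to its interval, like VLOOKUP with approximate match."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return LABELS[bisect_right(BOUNDS, hour) - 1]


def fraud_by_bucket(fraud_hours):
    """Count fraudulent transactions per interval from a list of hours."""
    counts = Counter(time_bucket(h) for h in fraud_hours)
    return {label: counts.get(label, 0) for label in LABELS}


# Toy hours for illustration only; not taken from the report's dataset.
bucket_counts = fraud_by_bucket([1, 13, 14, 19, 22, 23])
print(bucket_counts)
```

Moving a boundary in `BOUNDS` changes which interval an hour falls into, so the shape of the resulting line graph depends on where the night and morning cut-off is placed.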
3. Is there any correlation between the location (coordinates) of the merchant and the
frequency of fraudulent transactions?
Dimensions: Filtered by isFraud (0 and 1 value), Represented by the number of transactions
(cc_num)
Measures: AVG of the Merchant’s Coordinates (Longitude and Latitude)
Based on the map visualization, many fraudulent transactions occurred across the country. With
fraudulent transactions shown in purple, we noticed that they are more clustered toward the East
Coast, meaning that more fraudulent transactions occurred in that area. There are also a handful
of fraudulent transactions toward the West Coast, but they are more spread out, which means fraud
happens less often there than on the East Coast. Therefore, we believe that there is a correlation
between the merchants’ location and the frequency of fraudulent transactions.
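One way to check the map reading is to compare fraud rates, not raw counts, on either side of a dividing longitude, since a region with more transactions overall will also show more fraud points. The sketch below is an illustration only: the field name `merch_long` and the dividing longitude of -100 are assumptions, and the rows are toy values.

```python
def fraud_rate_by_region(rows, split_longitude=-100.0):
    """Return the share of fraudulent rows east and west of split_longitude."""
    totals = {"east": 0, "west": 0}
    frauds = {"east": 0, "west": 0}
    for r in rows:
        # Longitudes in the United States are negative; larger values lie further east.
        region = "east" if r["merch_long"] >= split_longitude else "west"
        totals[region] += 1
        frauds[region] += r["is_fraud"]
    return {
        region: (frauds[region] / totals[region] if totals[region] else None)
        for region in totals
    }


# Toy rows for illustration only; not taken from the report's dataset.
sample = [
    {"merch_long": -74.0, "is_fraud": 1},
    {"merch_long": -80.2, "is_fraud": 0},
    {"merch_long": -87.6, "is_fraud": 0},
    {"merch_long": -75.1, "is_fraud": 0},
    {"merch_long": -118.2, "is_fraud": 1},
    {"merch_long": -122.4, "is_fraud": 0},
]
rates = fraud_rate_by_region(sample)
print(rates)
```

If the rates on both sides turn out similar, the denser cluster on the map may reflect where transactions happen in general and not where fraud is more likely.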
4. What is the gender distribution among consumers involved in fraudulent transactions?
To find the gender distribution among the consumers involved in fraudulent transactions, we used
a double bar chart. The x-axis displays the values that represent whether a transaction is
fraudulent or legitimate. The y-axis displays the number of transactions, using the credit card
number variable. The data was split into the two genders, male and female. To get the number of
transactions, we used the distinct count measure and labeled the counts on the bar chart.
From this bar chart, we can conclude that women faced fraudulent transactions more often than
men. Highlighted in purple, 213 transactions were marked as fraudulent for women and 201 for men.
Dimensions: Represented by isFraud (0 and 1 values), Separated by Gender
Measures: Distinct count of cc_num
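The distinct count measure can be sketched as below. The field names `is_fraud` and `gender` are assumed (`cc_num` appears in the report), and the rows are toy values.

```python
from collections import defaultdict


def distinct_cards(rows):
    """Return {(is_fraud, gender): number of distinct cc_num values}."""
    seen = defaultdict(set)
    for r in rows:
        seen[(r["is_fraud"], r["gender"])].add(r["cc_num"])
    return {key: len(cards) for key, cards in seen.items()}


# Toy rows for illustration only; not taken from the report's dataset.
sample = [
    {"cc_num": "c1", "gender": "F", "is_fraud": 1},
    {"cc_num": "c1", "gender": "F", "is_fraud": 1},  # same card, second fraud
    {"cc_num": "c2", "gender": "M", "is_fraud": 1},
    {"cc_num": "c1", "gender": "F", "is_fraud": 0},
    {"cc_num": "c3", "gender": "M", "is_fraud": 0},
]
card_counts = distinct_cards(sample)
print(card_counts)
```

Note that a distinct count of `cc_num` counts unique card numbers, so several fraudulent transactions on the same card are counted once; a plain row count would give the number of transactions.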
5. Are there any particular age groups (based on date of birth) more commonly associated
with fraudulent transactions?
According to the data table, customers’ years of birth ranged from 1925 to 2004, covering a span
of 80 years. A histogram was used to show the transactions marked as fraud, arranged into
birth-year groups of 10 years each, totalling 8 groups.
Through this visualization, it is evident that a majority of fraud occurred for customers born
between 1955 and 1974, shown as groups 4 and 5. Our analysis also used a timeline to filter the
differences between years. The fraud distribution is negatively skewed, as generations born from
1955 onward have more access to card payments and online transactions, opening up more chances
for data breaches and fraud. In this case, age is also a determining factor in whether a credit
card user may be susceptible to fraud. Much older generations do not often purchase or shop
online, while younger generations are more aware of common scams and risks. This leaves the age
group in the middle of our data set with the short end of the stick compared with the generations
on either side: they are not familiar with common fraud tactics and are eager to shop and browse
online. This is a fairly big risk for credit card companies to assume and cover, especially with
credit card agreements that include fraud protection, where the bank assumes the cost of the
fraud and returns some of the lost amount to the cardholder. To limit additional risk, credit
card companies should create an accidental fraud clause that specifies how much risk the company
is willing to cover and at what point the cardholder is solely responsible for the lost money.
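The birth-year grouping behind the histogram can be sketched as a simple bin calculation. It assumes equal-width groups of 10 years starting at 1925, which is what 8 groups over the 1925 to 2004 range implies and which places 1955 to 1974 in groups 4 and 5; the function names are hypothetical.

```python
def birth_year_group(year, first_year=1925, width=10, n_groups=8):
    """Return the 1-based group number for a birth year."""
    if not first_year <= year < first_year + width * n_groups:
        raise ValueError(f"birth year {year} is outside the grouped range")
    return (year - first_year) // width + 1


def group_range(group, first_year=1925, width=10):
    """Return the (first, last) birth year covered by a 1-based group number."""
    start = first_year + (group - 1) * width
    return start, start + width - 1


for g in (4, 5):
    print(g, group_range(g))
```

With these defaults, `birth_year_group(1955)` and `birth_year_group(1974)` land in groups 4 and 5, matching the groups named in the analysis.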
6. Can we identify any recurring patterns or anomalies in the transaction amounts for
fraudulent transactions within each category?
Each category was filtered for fraud and included information such as the maximum and
minimum amounts that were flagged as fraud, as well as the average fraudulent transaction
amount, as shown below.
● Measures: Average of Amount per Category
● Dimension: Category, Is Fraud
The above table shows the average amount spent per transaction, for transactions flagged and not
flagged as fraud. The columns show the Category, and the rows show the Average of Amount per
category, separated by whether the transactions were marked as fraud. The transactions flagged as
fraud differed widely from normal transactions. In the entertainment category, for example, the
average transaction cost around $63, while the average fraudulent transaction was $510. Other
categories show the same disparity between average transaction amounts, with some fraud averages
more than 10x the normal amount. Clearly, the bar set to determine whether a transaction is fraud
lies in the transaction amount: any transaction that exceeded a normal, reasonable amount for one
purchase was flagged as fraud.
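The average-amount comparison can be sketched as a grouped mean plus a ratio of the two averages. The field names `category`, `is_fraud`, and `amt` are assumed, and the rows are toy values chosen only to echo the scale of the entertainment example.

```python
from collections import defaultdict
from statistics import mean


def average_amount(rows):
    """Return {(category, is_fraud): mean transaction amount}."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["category"], r["is_fraud"])].append(r["amt"])
    return {key: round(mean(values), 2) for key, values in groups.items()}


def fraud_to_normal_ratio(averages, category):
    """Return how many times larger the fraud average is than the normal one."""
    return averages[(category, 1)] / averages[(category, 0)]


# Toy rows for illustration only; not taken from the report's dataset.
sample = [
    {"category": "entertainment", "is_fraud": 0, "amt": 60.0},
    {"category": "entertainment", "is_fraud": 0, "amt": 66.0},
    {"category": "entertainment", "is_fraud": 1, "amt": 500.0},
    {"category": "entertainment", "is_fraud": 1, "amt": 520.0},
]
averages = average_amount(sample)
print(averages)
print(round(fraud_to_normal_ratio(averages, "entertainment"), 1))
```

The ratio puts every category on the same scale, which makes it easier to spot the categories where fraud averages exceed 10x the normal amount.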
7. Are there any specific merchants or chains that appear more frequently in
fraudulent transactions, and if so, what characteristics do they share?
● Measures: Sum is fraud
● Dimensions: Category, Merchant
The two side-by-side bar charts above show the specific merchants that appear more frequently in
fraudulent transactions. To create this visualization, we first determined our columns and rows:
Category and Sum of Is Fraud as the columns, and Merchant as the rows. Because the CreditCard
dataset contains many merchants and several shopping categories in which fraudulent transactions
occur, we narrowed down the merchants involved in fraudulent transactions by filtering on
shopping category and sum of fraud. From our analysis in question 1, we know that the two
shopping categories in which fraud occurs most are grocery_pos and shopping_net. For the sum of
fraud, we found that although there are 595 instances of fraud in total across the categories,
the summed Is Fraud measure ranges from 0 to 7 per merchant, with 0 indicating no fraud and 7
indicating the highest occurrence of fraud. With these filters, we created the visualization
above, which shows that specific merchants appear more frequently in fraudulent transactions.
These merchants are Osinki Ledner and Leuschke; Rau and Sons; Moen Reinger and Murphy; and Barton
Inc for the grocery_pos category, and Kerluke-Abshire, Gleason-Macejkovic, Fisher-Schowalter, and
Boyer-Reichert for the shopping_net category. A common characteristic of these merchants is that
they all fall in the two categories with the highest number of fraudulent transactions. Another
is that they all fall within the range of 5 through 7, meaning they have the highest numbers of
fraudulent transactions.
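The two filters used here, shopping category and sum of fraud, can be sketched as a count per merchant followed by a range filter. The field names `category`, `merchant`, and `is_fraud` are assumed, and the merchant names and counts are hypothetical placeholders.

```python
from collections import Counter


def fraud_counts_by_merchant(rows, categories):
    """Count fraudulent rows per (category, merchant) within the given categories."""
    return Counter(
        (r["category"], r["merchant"])
        for r in rows
        if r["is_fraud"] == 1 and r["category"] in categories
    )


def merchants_in_range(counts, low, high):
    """Return (category, merchant, count) tuples whose count lies in [low, high]."""
    return sorted(
        (cat, merchant, n)
        for (cat, merchant), n in counts.items()
        if low <= n <= high
    )


# Toy rows with placeholder merchants; not taken from the report's dataset.
sample = (
    [{"category": "grocery_pos", "merchant": "Merchant A", "is_fraud": 1}] * 2
    + [{"category": "shopping_net", "merchant": "Merchant B", "is_fraud": 1}]
    + [{"category": "entertainment", "merchant": "Merchant C", "is_fraud": 1}] * 3
)
counts = fraud_counts_by_merchant(sample, {"grocery_pos", "shopping_net"})
top = merchants_in_range(counts, low=2, high=7)
print(top)
```

On the full dataset, a range of 5 through 7 would correspond to the merchants highlighted in the charts.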
8. Can we identify any clusters or groups of fraudulent transactions based on consumer
names or other identifying information?
Dimensions: Customer_first_name, transaction_date, category
Measures Filter: Total_amount_sum
Cluster 1 highlights individuals with lower transaction frequencies but varied categories,
including grocery purchases, healthcare services, and minimal travel expenses.
Cluster 2 includes individuals with higher transaction frequencies who engage significantly in
shopping and grocery purchases. This cluster accounts for a substantial share of fraudulent
transactions by amount and includes higher-value fraud transactions.
Each column shows a yearly breakdown for each individual, providing insights into the
changing patterns of fraudulent behavior or consistent trends over time. The table format allows
for a direct comparison across years and categories, illustrating shifts or consistencies in fraud
strategies. This analysis is crucial for developing targeted measures to effectively prevent and
counteract such fraudulent activities.
9. Are there any seasonal variations in the frequency or nature of fraudulent transactions,
such as increased activity around holidays or special events?
Dimensions: none
Measures: transaction_date_month, Total_Sum
Yes. This line chart depicts the amount of fraudulent transactions by month from January 2019 to
May 2020, showing notable fluctuations in activity that suggest a pattern aligned with seasonal
trends or specific events. The peaks in the graph, particularly those in March 2019 and May 2020,
indicate significant surges in fraudulent activity, which could potentially correlate with
holidays, seasons, tax filing periods, or major shopping events, known to be opportune times for
fraudsters.
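The monthly series behind the line chart can be sketched by summing fraudulent amounts per year and month. The field names `is_fraud` and `amt` and the ISO date format are assumed (`transaction_date` appears in the report), and the rows are toy values.

```python
from collections import defaultdict


def monthly_fraud_totals(rows):
    """Return {"YYYY-MM": total fraudulent amount}, sorted by month."""
    totals = defaultdict(float)
    for r in rows:
        if r["is_fraud"] == 1:
            month = r["transaction_date"][:7]  # "YYYY-MM" prefix of an ISO date
            totals[month] += r["amt"]
    return dict(sorted(totals.items()))


# Toy rows for illustration only; not taken from the report's dataset.
sample = [
    {"transaction_date": "2019-03-02", "is_fraud": 1, "amt": 100.0},
    {"transaction_date": "2019-03-20", "is_fraud": 1, "amt": 50.5},
    {"transaction_date": "2019-04-01", "is_fraud": 1, "amt": 20.0},
    {"transaction_date": "2019-04-05", "is_fraud": 0, "amt": 999.0},
]
monthly = monthly_fraud_totals(sample)
print(monthly)
```

Because the data covers January 2019 to May 2020, most calendar months appear only once, so a single peak shows a surge in that month and not necessarily a repeating seasonal pattern.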

  • 3. The findings from this research aim to enhance the understanding of credit card fraud dynamics, contributing to the development of more effective fraud detection systems and preventive measures. By leveraging advanced data analysis techniques, this study provides valuable insights that can aid stakeholders in mitigating the impact of credit card fraud and safeguarding financial transactions.
  • 4. Analysis Questions 1. What is the distribution of fraudulent transactions across different categories, such as shopping, groceries, entertainment, and gas? Across the categories, in-person grocery transactions took the lead, only 1% above general online shopping, which accounted for 22% of all credit card transactions detected as fraudulent. However, the fraudulent transactions total only 595, a small fraction of all transactions in the dataset. Online shopping likely ranks high because of the risks associated with entering card data on various sites with questionable security. In the grocery case, several causes are possible, such as card skimmers and phishing or interception attempts while customers are making purchases.
  • 5. ● Dimensions: Categories; Fraud transactions; filter: 2019-2020 year, in America ● Measures: Sum of Fraud for credit card transactions in America by category When put into a visual representation, online groceries and travel have some of the fewest fraud cases detected. Surprisingly, entertainment has relatively little fraudulent activity, while gas, at about 9% of the total fraudulent activity, might be concerning. There is a large gap between the top two categories for detected fraud and the rest. It would be worth looking at other dimensions to draw correlations, since the distribution alone does not explain why online shopping and in-person groceries have such large leads.
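The category breakdown above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual workflow: the field names `category` and `is_fraud` are assumed from the measures named in the report, and the sample rows are hypothetical.

```python
from collections import Counter


def fraud_share_by_category(transactions):
    """Return {category: percent of all fraudulent transactions}."""
    # Count only the rows flagged as fraud (is_fraud == 1).
    counts = Counter(t["category"] for t in transactions if t["is_fraud"] == 1)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}


# Hypothetical sample rows, not taken from the report's dataset.
sample = [
    {"category": "grocery_pos", "is_fraud": 1},
    {"category": "grocery_pos", "is_fraud": 1},
    {"category": "shopping_net", "is_fraud": 1},
    {"category": "gas_transport", "is_fraud": 1},
    {"category": "entertainment", "is_fraud": 0},
]

shares = fraud_share_by_category(sample)
print(shares)
```

Expressing each category as a share of fraudulent transactions only, rather than of all transactions, matches how the percentages in question 1 are stated.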
  • 6. 2. Are there any specific patterns or trends in the timing of fraudulent transactions, such as specific days of the week or times during the day when they are more likely to occur? To look for a pattern, we converted each transaction date to the day of the week on which the fraud was detected and then created a pivot table. The pivot table alone does not give a good visualization, so we graphed the frequency in a bar chart, which shows that fraudulent activity was evenly distributed throughout the days of the week. This gives us two assumptions to work from: people conducting fraudulent activity have no preference for the day on which they run their scams, and, if these cases are performed by a group of people with a similar mindset, the day does not hinder their effectiveness. So we have to look at other dimensions to deduce any particular trends, as the days of the week give us nothing more than conjecture. Another method is to organize the transactions by the time of day at which the fraud was detected; each transaction can be assigned to an interval for easier comprehension, so we created a VLOOKUP table for the intervals. The cutoff between night and morning was hard to discern, but after some research we assumed that morning is when people are generally active and eat breakfast. We then create a table to calculate the
  • 7. number of fraudulent transactions in each interval; surprisingly, this gives us more to work with. It is best displayed as a line graph. We can see that fraudulent activity starts rising around noon, peaks in the evening, and falls back down at night. This shows that the targeted hours are often when people are at work or going to sleep. This could mean that the culprits either live in similar time zones or live in a time zone that lets them take advantage of these hours. ● Dimensions: Time of day based on the VLOOKUP table in the previous image, frequency of fraudulent transactions ● Measures: Fraudulent transactions throughout the 24-hour cycle by sum of time and instance
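The day-of-week pivot and the time-of-day lookup described above could be reproduced roughly as follows. The interval boundaries in `INTERVALS` are assumptions made for illustration, since the report does not list the cutoffs used in its VLOOKUP table, and the timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Assumed cutoffs (start hour, label), sorted by start hour.
# The report's actual VLOOKUP boundaries are not given.
INTERVALS = [(0, "night"), (6, "morning"), (12, "afternoon"), (18, "evening")]


def interval_for(hour):
    """Approximate-match lookup: the last interval whose start is <= hour."""
    label = INTERVALS[0][1]
    for start, name in INTERVALS:
        if hour >= start:
            label = name
    return label


def fraud_timing(timestamps):
    """Count fraudulent transactions by weekday and by time-of-day interval."""
    parsed = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps]
    by_day = Counter(DAY_NAMES[dt.weekday()] for dt in parsed)
    by_interval = Counter(interval_for(dt.hour) for dt in parsed)
    return by_day, by_interval


# Hypothetical timestamps of transactions already filtered to fraud.
fraud_times = [
    "2019-01-01 13:30:00",
    "2019-01-05 22:15:00",
    "2020-03-02 03:05:00",
]

by_day, by_interval = fraud_timing(fraud_times)
print(by_day)
print(by_interval)
```

The lookup mirrors an approximate-match VLOOKUP: each hour falls into the last interval whose starting hour it reaches, so shifting a cutoff only requires editing the `INTERVALS` table.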
  • 8. Of course, without descriptive context on how these fraudulent transactions occurred, it is difficult to determine whether some of them were attempts by the cardholders themselves that were returned as fraud due to invalid funds or credit. 3. Is there any correlation between the location (coordinates) of the merchant and the frequency of fraudulent transactions? Dimensions: Filtered by isFraud (0 and 1 value), Represented by the number of transactions (cc_num) Measures: AVG of the Merchant’s Coordinates (Longitude and Latitude) Based on the map visualization, many fraudulent transactions happened across the country. With the fraudulent transactions shown in purple, we noticed that they are more clustered towards the east coast, meaning that there are more fraudulent transactions that
  • 9. occurred in the area. There are also a handful of fraudulent transactions towards the west coast, but they are more spread out, meaning that fraud occurs less often there than on the east coast. Therefore, we believe that there is a correlation between the merchants’ location and the frequency of fraudulent transactions. 4. What is the gender distribution among consumers involved in fraudulent transactions? To find the gender distribution among the consumers involved in fraudulent transactions, we used a double bar chart. The x-axis displays the values
  • 10. that indicate whether a transaction is fraudulent or legitimate. The y-axis displays the number of transactions, counted using the credit card number variable. The data was split into the two genders, male and female. To get the number of transactions, we used the distinct count measure and labeled the counts on the bar chart. From this visualization, we can conclude that women faced fraudulent transactions more often than men. Highlighted in purple, there are 213 transactions marked as fraudulent for women and 201 for men. Dimensions: Represented by isFraud (0 and 1 values), Separated by Gender Measures: Distinct count of cc_num
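One point worth checking when reproducing this chart is that a distinct count of cc_num counts affected cards, which can differ from the number of fraudulent transactions when one card is hit more than once. The sketch below computes both side by side; the field names `gender`, `cc_num`, and `is_fraud` are assumed, and the rows are hypothetical.

```python
from collections import defaultdict


def fraud_by_gender(transactions):
    """Return per-gender distinct card count and raw fraud transaction count."""
    cards = defaultdict(set)
    rows = defaultdict(int)
    for t in transactions:
        if t["is_fraud"] == 1:
            cards[t["gender"]].add(t["cc_num"])  # distinct count of cc_num
            rows[t["gender"]] += 1               # plain row count
    return {
        g: {"distinct_cards": len(cards[g]), "transactions": rows[g]}
        for g in cards
    }


# Hypothetical sample rows, not taken from the report's dataset.
sample = [
    {"gender": "F", "cc_num": "111", "is_fraud": 1},
    {"gender": "F", "cc_num": "111", "is_fraud": 1},
    {"gender": "F", "cc_num": "222", "is_fraud": 1},
    {"gender": "M", "cc_num": "333", "is_fraud": 1},
    {"gender": "M", "cc_num": "333", "is_fraud": 0},
]

summary = fraud_by_gender(sample)
print(summary)
```

In this toy data the female group has three fraudulent transactions spread over two cards, so the two measures diverge; comparing them on the real dataset would show which one the chart labels represent.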
  • 11. 5. Are there any particular age groups (based on date of birth) more commonly associated with fraudulent transactions? According to the data table, customers’ years of birth ranged from 1925 to 2004, a span of 80 years. A histogram was used to show the transactions marked as fraud, arranged into age groups of 10 years each, totalling 8 groups. Through this visualization, it is evident that the majority of fraud occurred for customers born between 1955 and 1974, shown as groups 4 and 5. Our analysis also used a timeline to filter the differences between years. The fraud distribution is negatively skewed, as generations born from 1955 onward have more access to card payments and online transactions, opening up more chances for data breaches and fraud. In this case, age is also a determining factor in whether a credit card user may be susceptible to fraud. Much older generations do not often purchase or shop online, while younger generations are more aware of common scams and risks. This leaves the age group in the middle of our data set with the short end of the stick compared with the generations on either side: they are not familiar with common fraud tactics and are eager to shop and browse
  • 12. online. This is a fairly big risk for credit card companies to assume and cover, especially under credit card agreements with fraud protection, where the bank assumes the cost of the fraud and returns some of the lost amount to the cardholder. To limit additional risk, credit card companies should create an accidental fraud clause that specifies how much risk the company is willing to cover and at what point the cardholder is solely responsible for the lost money. 6. Can we identify any recurring patterns or anomalies in the transaction amounts for fraudulent transactions within each category? Each category was filtered for fraud and included information such as the maximum and minimum amounts that were flagged as fraud, as well as the average fraudulent transaction amount, as shown below.
  • 13. ● Measures: Average of Amount per Category ● Dimension: Category, Is Fraud The above table illustrates the average amount spent per transaction, both flagged and not flagged for fraud. The columns show the Category and the rows show the Average of Amount spent per category, separated by whether the transactions were marked as Fraud. The transactions flagged for fraud differed
  • 14. from normal transactions by a wide margin. In the entertainment category, for example, the average transaction was around $63, while the average fraudulent transaction was $510. Other categories show the same disparity between average transaction amounts, with some fraud averages more than 10x the normal amount. Clearly, the transaction amount is a key factor in determining whether a transaction is fraud: transactions that far exceeded a normal, reasonable amount for one purchase were flagged as fraud. 7. Are there any specific merchants or chains that appear more frequently in fraudulent transactions, and if so, what characteristics do they share? ● Measures: Sum is fraud ● Dimensions: Category, Merchant The two side-by-side bar charts above show the specific merchants that appear most frequently in fraudulent transactions. To create this visualization, we first determined our columns and rows, which as seen above are Category and Sum is fraud for the columns
  • 15. and Merchants as the rows. Because the CreditCard dataset contains many merchants and several shopping categories in which fraudulent transactions occur, we decided to narrow down the merchants involved in fraudulent transactions by filtering on shopping category and sum of fraud. From our analysis in question 1, we know that the two categories in which fraud occurs most are grocery_pos and shopping_net. For the sum of fraud, we discovered that although there are a total of 595 instances of fraud across the categories, the Sum of Is Fraud per merchant ranges from 0 to 7, with 0 indicating no fraud and 7 indicating the highest occurrence of fraud. With these filters we created the visualization above, which shows that specific merchants appear more frequently in fraudulent transactions. These merchants are Osinki Ledner and Leuschke; Rau and Sons; Moen Reinger and Murphy; and Barton Inc for the category of grocery_pos, and Kerluke-Abshire, Gleason-Macejkovic, Fisher-Schowalter, and Boyer-Reichert for the category of shopping_net. A common characteristic of these merchants is that they all fall in the two categories with the highest number of fraudulent transactions. Another is that their fraud counts all fall in the range of 5 through 7, meaning they have the highest instances of fraudulent transactions.
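The merchant filter described in question 7 amounts to summing the fraud flag per merchant within the chosen categories and keeping the merchants above a threshold. A minimal sketch follows, assuming the field names `category`, `merchant`, and `is_fraud`; the merchant names in the sample are placeholders, not merchants from the dataset.

```python
from collections import Counter


def top_fraud_merchants(transactions, categories, min_count=1):
    """Rank merchants by fraud count within the given categories."""
    counts = Counter(
        (t["category"], t["merchant"])
        for t in transactions
        if t["is_fraud"] == 1 and t["category"] in categories
    )
    # Sort by category, then by descending fraud count, then by merchant name.
    return sorted(
        ((cat, merchant, n) for (cat, merchant), n in counts.items()
         if n >= min_count),
        key=lambda row: (row[0], -row[2], row[1]),
    )


# Hypothetical sample rows with placeholder merchant names.
sample = [
    {"category": "grocery_pos", "merchant": "Merchant A", "is_fraud": 1},
    {"category": "grocery_pos", "merchant": "Merchant A", "is_fraud": 1},
    {"category": "grocery_pos", "merchant": "Merchant B", "is_fraud": 1},
    {"category": "grocery_pos", "merchant": "Merchant B", "is_fraud": 0},
    {"category": "shopping_net", "merchant": "Merchant C", "is_fraud": 1},
    {"category": "gas_transport", "merchant": "Merchant D", "is_fraud": 1},
]

ranked = top_fraud_merchants(sample, {"grocery_pos", "shopping_net"})
frequent = top_fraud_merchants(sample, {"grocery_pos", "shopping_net"}, min_count=2)
print(ranked)
print(frequent)
```

On the full dataset, calling the function with `min_count=5` would correspond to the 5 through 7 range that the report uses to single out the most frequently affected merchants.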
  • 16. 8. Can we identify any clusters or groups of fraudulent transactions based on consumer names or other identifying information? Dimensions: Customer_first_name, transaction_date, category Measures Filter: Total_amount_sum Cluster 1 highlights individuals with lower transaction frequencies but varied categories, including grocery purchases, healthcare services, and minimal travel expenses. Cluster 2 includes individuals with higher transaction frequencies who engage heavily in shopping and grocery purchases. This cluster accounts for a substantial share of fraudulent transactions by amount and includes the higher-value fraud transactions.
  • 17. Each column shows a yearly breakdown for each individual, providing insights into the changing patterns of fraudulent behavior or consistent trends over time. The table format allows for a direct comparison across years and categories, illustrating shifts or consistencies in fraud strategies. This analysis is crucial for developing targeted measures to effectively prevent and counteract such fraudulent activities. 9. Are there any seasonal variations in the frequency or nature of fraudulent transactions, such as increased activity around holidays or special events? Dimensions: none Measures: transaction_date_month, Total_Sum Yes. This line chart depicts the amount of fraudulent transactions by month from January 2019 to May 2020, showcasing notable fluctuations in activity that suggest a pattern aligned with seasonal trends or specific events. The peaks in the graph, particularly those in March 2019 and May 2020, indicate significant surges in fraudulent activity, which could potentially correlate
  • 18. with holidays, seasons, tax filing periods, or major shopping events, which are known to be opportune times for fraudsters.
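The monthly series behind the line chart in question 9 can be approximated by grouping fraudulent transactions on the year and month of the transaction date and summing their amounts. This is a sketch under assumed field names (`transaction_date`, `amount`, `is_fraud`) with hypothetical rows, not the report's actual data.

```python
from collections import defaultdict


def monthly_fraud_totals(transactions):
    """Sum fraudulent transaction amounts per calendar month (YYYY-MM)."""
    totals = defaultdict(float)
    for t in transactions:
        if t["is_fraud"] == 1:
            month = t["transaction_date"][:7]  # "YYYY-MM-DD" -> "YYYY-MM"
            totals[month] += t["amount"]
    # Sort chronologically so the result can be plotted as a line chart.
    return dict(sorted(totals.items()))


# Hypothetical sample rows, not taken from the report's dataset.
sample = [
    {"transaction_date": "2019-03-04", "amount": 100.0, "is_fraud": 1},
    {"transaction_date": "2019-03-19", "amount": 50.5, "is_fraud": 1},
    {"transaction_date": "2019-03-20", "amount": 12.0, "is_fraud": 0},
    {"transaction_date": "2020-05-02", "amount": 300.0, "is_fraud": 1},
]

monthly = monthly_fraud_totals(sample)
print(monthly)
```

Keeping the year in the grouping key matters here: grouping on month alone would merge, for example, March 2019 with March 2020 and could hide or exaggerate the peaks the report points to.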