[FastCampus] Insurance Fraud Prediction


승수 이

FastCampus Data Science SCHOOL, 4th cohort project: insurance fraud prediction. Predicting whether insurance fraud occurred using an insurer's data.

Data & Analytics

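Data Science SCHOOL
Abstract
Goal (project objective)
- Predict whether a customer commits insurance fraud, based on the insurer's personal and policy data for that customer
Project overview
Data collection and analysis methodology
- Data collection
  Data from the 00 Life Insurance big data competition
  (features = 10 years of customer data; target = insurance fraud label)
- Methodology
  Naïve Bayes Classification (multinomial naive Bayes, Gaussian naive Bayes)
  - The features include both categorical and numeric data
  - Categorical data -> multinomial naive Bayes; numeric data -> Gaussian naive Bayes
  - The two models are combined into a single prediction model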
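One way to combine the two models, sketched here under assumptions: the slide does not say how the models were merged, so this version adds the categorical and Gaussian log-likelihoods under a single class prior. The categorical part uses Laplace-smoothed per-feature level frequencies (a categorical naive Bayes, swapped in for the multinomial model named on the slide). The toy data and the names `fit_mixed_nb` and `predict_proba` are hypothetical.

```python
import math
from collections import Counter

def fit_mixed_nb(X_cat, X_num, y):
    """Estimate class priors, per-class category counts, and Gaussian parameters."""
    n_cat, n_num = len(X_cat[0]), len(X_num[0])
    model = {
        "classes": sorted(set(y)),
        # Distinct levels per categorical feature, used for Laplace smoothing.
        "levels": [len({row[j] for row in X_cat}) for j in range(n_cat)],
        "prior": {}, "count": {}, "cat": {}, "num": {},
    }
    for c in model["classes"]:
        idx = [i for i, label in enumerate(y) if label == c]
        model["count"][c] = len(idx)
        model["prior"][c] = len(idx) / len(y)
        model["cat"][c] = [Counter(X_cat[i][j] for i in idx) for j in range(n_cat)]
        stats = []
        for j in range(n_num):
            vals = [X_num[i][j] for i in idx]
            mean = sum(vals) / len(vals)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-9
            stats.append((mean, var))
        model["num"][c] = stats
    return model

def predict_proba(model, x_cat, x_num, alpha=1.0):
    """Posterior per class: log prior + categorical log-lik + Gaussian log-lik."""
    log_post = {}
    for c in model["classes"]:
        lp = math.log(model["prior"][c])
        n_c = model["count"][c]
        for j, v in enumerate(x_cat):
            k = model["levels"][j]
            lp += math.log((model["cat"][c][j][v] + alpha) / (n_c + alpha * k))
        for j, v in enumerate(x_num):
            mean, var = model["num"][c][j]
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        log_post[c] = lp
    # Normalize in log space to avoid underflow.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {c: math.exp(v - m) / z for c, v in log_post.items()}

# Hypothetical toy data: two categorical features, one numeric feature.
X_cat = [["A", "yes"], ["A", "no"], ["B", "yes"], ["B", "no"], ["A", "yes"], ["B", "no"]]
X_num = [[1.0], [1.2], [5.0], [5.5], [0.9], [5.2]]
y = [0, 0, 1, 1, 0, 1]

model = fit_mixed_nb(X_cat, X_num, y)
print(predict_proba(model, ["A", "yes"], [1.1]))
```

If two separately fitted classifiers are combined instead, for example scikit-learn's `CategoricalNB` and `GaussianNB`, multiplying their posteriors counts the class prior twice, so one copy of the prior would need to be divided out before normalizing.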
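- Variable selection
  - Team members discussed and shortlisted the variables they expected to influence insurance fraud
  - Categorical data: split the records into fraud and normal groups, then used plots to select variables whose distributions differ between the groups
    (naive Bayes assumption)
  - Numeric data: drew a heatmap and removed highly correlated variables
  - Used a decision tree to select variables with high importance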
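The correlation step for numeric variables can be sketched as a greedy filter: compute pairwise Pearson correlations (the numbers a heatmap would display) and keep a variable only if it is not strongly correlated with one already kept. The 0.9 threshold, the column names, and the function names are assumptions for illustration; the slide does not give the cutoff the team used.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists (non-constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical numeric features; "premium_x2" duplicates "premium" up to scale.
columns = {
    "premium": [10.0, 20.0, 30.0, 40.0, 50.0],
    "premium_x2": [20.0, 40.0, 60.0, 80.0, 100.0],
    "claims": [3.0, 1.0, 4.0, 1.0, 5.0],
}
print(drop_correlated(columns))
```

The result depends on column order, because the first of two correlated variables is the one that survives. The decision tree importance ranking is not shown here; in scikit-learn it is available as `feature_importances_` on a fitted `DecisionTreeClassifier`.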
- Evaluation & improvement
  - Evaluated the model with the ROC curve
  - Added and removed variables that met the conditions to choose the best model
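A minimal sketch of ROC-based evaluation, assuming binary labels (1 = fraud) and one predicted score per customer. `roc_points` sweeps the decision threshold to get (false positive rate, true positive rate) pairs, and `roc_auc` computes the area under the curve as the probability that a fraud case scores above a normal case. The labels, scores, and function names are made up for illustration.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) at each distinct threshold, from strictest to loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_points(y_true, scores))
print(roc_auc(y_true, scores))  # 0.75
```

Comparing this AUC across candidate variable sets is one way to carry out the add-and-remove search described above. scikit-learn provides the same quantities as `roc_curve` and `roc_auc_score` in `sklearn.metrics`.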
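- Final performance evaluation
  - The prediction rate was highest when many lower-importance variables were included, rather than when only the high-importance variables were selected
  - Limitation: the categorical data did not satisfy the naive Bayes assumption -> more data is needed
Insurance fraud prediction analysis
Team project / 2017.02 ~ 2016.04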
