xgboost를 이해하기 위해서 찾아보다가 내가 궁금한 내용을 따로 정리하였으나, 역시 구체적인 수식은 아직 모르겠다.
요즘 Kaggle에서 유명한 Xgboost가 뭘까?
Ensemble중 하나인 Boosting기법?
Ensemble 유형인 Bagging과 Boosting 차이는?
왜 Ensemble이 low bias, high variance 모델인가?
Bias 와 Variance 관계는?
Boosting 기법은 어떤게 있나?
Xgboost에서 사용하는 CART 알고리즘은?
KGC 2014 가볍고 유연하게 데이터 분석하기 : 쿠키런 사례 중심 , 데브시스터즈Minwoo Kim
1년 7개월 장수 모바일 게임 쿠키런. 많은 유저, 하루에도 쏟아지는 많은 로그. Time To Market 단축이 핵심 역량 중 하나가 되는 모바일 게임 시장. 자주 빠르게 변경되는 스팩, 로그도 마찬가지로 자주 빠르게 변경되는 스키마. 이런 현실속에서 게임 개발과 운영, 데이터 분석까지 병행 하기 위해서 가볍고 유연한 아키텍처로 적당히 빠르게 데이터 분석을 하는 쿠키런 서버팀 사례를 소개합니다.
xgboost를 이해하기 위해서 찾아보다가 내가 궁금한 내용을 따로 정리하였으나, 역시 구체적인 수식은 아직 모르겠다.
요즘 Kaggle에서 유명한 Xgboost가 뭘까?
Ensemble중 하나인 Boosting기법?
Ensemble 유형인 Bagging과 Boosting 차이는?
왜 Ensemble이 low bias, high variance 모델인가?
Bias 와 Variance 관계는?
Boosting 기법은 어떤게 있나?
Xgboost에서 사용하는 CART 알고리즘은?
KGC 2014 가볍고 유연하게 데이터 분석하기 : 쿠키런 사례 중심 , 데브시스터즈Minwoo Kim
1년 7개월 장수 모바일 게임 쿠키런. 많은 유저, 하루에도 쏟아지는 많은 로그. Time To Market 단축이 핵심 역량 중 하나가 되는 모바일 게임 시장. 자주 빠르게 변경되는 스팩, 로그도 마찬가지로 자주 빠르게 변경되는 스키마. 이런 현실속에서 게임 개발과 운영, 데이터 분석까지 병행 하기 위해서 가볍고 유연한 아키텍처로 적당히 빠르게 데이터 분석을 하는 쿠키런 서버팀 사례를 소개합니다.
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hakky St
This is the documentation of the study-meeting in lab.
Tha book title is "Hands-On Machine Learning with Scikit-Learn and TensorFlow" and this is the chapter 8.
Feature Engineering for ML - Dmitry Larko, H2O.aiSri Ambati
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/wcFdmQSX6hM
Description:
In this talk, Dmitry shares his approach to feature engineering which he used successfully in various Kaggle competitions. He covers common techniques used to convert your features into numeric representation used by ML algorithms.
Speaker's Bio:
Dmitry has more than 10 years of experience in IT. Starting with data warehousing and BI, now in big data and data science. He has a lot of experience in predictive analytics software development for different domains and tasks. He is also a Kaggle Grandmaster who loves to use his machine learning and data science skills on Kaggle competitions.
DoWhy Python library for causal inference: An End-to-End toolAmit Sharma
As computing systems are more frequently and more actively intervening in societally critical domains such as healthcare, education, and governance, it is critical to correctly predict and understand the causal effects of these interventions. Without an A/B test, conventional machine learning methods, built on pattern recognition and correlational analyses, are insufficient for causal reasoning.
Much like machine learning libraries have done for prediction, "DoWhy" is a Python library that aims to spark causal thinking and analysis. DoWhy provides a unified interface for causal inference methods and automatically tests many assumptions, thus making inference accessible to non-experts.
For a quick introduction to causal inference, check out amit-sharma/causal-inference-tutorial. We also gave a more comprehensive tutorial at the ACM Knowledge Discovery and Data Mining (KDD 2018) conference: causalinference.gitlab.io/kdd-tutorial.
이번 슬라이드는 Graph mining의 기초에 대한 것이다.
고전 문제인 Graph cut에 대한 개념과 수학적인 배경을 설명하고, 이 개념이 clustering (클러스터링)에서 어떻게 사용되는지를 설명한다.
Graph mining, cut, clustring의 기초를 알기에 매우 적합한 자료이다.
Linear Regression
Simple Linear Regression
Multiple Linear Regression
Polynomial Regression
Non-Linear Regression
Support Vector Regression (SVR)
Decision Tree Regression
Random Forest Regression
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Edureka!
** Python Data Science Training : https://www.edureka.co/python **
This Edureka Video on Logistic Regression in Python will give you basic understanding of Logistic Regression Machine Learning Algorithm with examples. In this video, you will also get to see demo on Logistic Regression using Python. Below are the topics covered in this tutorial:
1. What is Regression?
2. What is Logistic Regression?
3. Why use Logistic Regression?
4. Linear vs Logistic Regression
5. Logistic Regression Use Cases
6. Logistic Regression Example Demo in Python
Subscribe to our channel to get video updates. Hit the subscribe button above.
Machine Learning Tutorial Playlist: https://goo.gl/UxjTxm
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hakky St
This is the documentation of the study-meeting in lab.
Tha book title is "Hands-On Machine Learning with Scikit-Learn and TensorFlow" and this is the chapter 8.
Feature Engineering for ML - Dmitry Larko, H2O.aiSri Ambati
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/wcFdmQSX6hM
Description:
In this talk, Dmitry shares his approach to feature engineering which he used successfully in various Kaggle competitions. He covers common techniques used to convert your features into numeric representation used by ML algorithms.
Speaker's Bio:
Dmitry has more than 10 years of experience in IT. Starting with data warehousing and BI, now in big data and data science. He has a lot of experience in predictive analytics software development for different domains and tasks. He is also a Kaggle Grandmaster who loves to use his machine learning and data science skills on Kaggle competitions.
DoWhy Python library for causal inference: An End-to-End toolAmit Sharma
As computing systems are more frequently and more actively intervening in societally critical domains such as healthcare, education, and governance, it is critical to correctly predict and understand the causal effects of these interventions. Without an A/B test, conventional machine learning methods, built on pattern recognition and correlational analyses, are insufficient for causal reasoning.
Much like machine learning libraries have done for prediction, "DoWhy" is a Python library that aims to spark causal thinking and analysis. DoWhy provides a unified interface for causal inference methods and automatically tests many assumptions, thus making inference accessible to non-experts.
For a quick introduction to causal inference, check out amit-sharma/causal-inference-tutorial. We also gave a more comprehensive tutorial at the ACM Knowledge Discovery and Data Mining (KDD 2018) conference: causalinference.gitlab.io/kdd-tutorial.
이번 슬라이드는 Graph mining의 기초에 대한 것이다.
고전 문제인 Graph cut에 대한 개념과 수학적인 배경을 설명하고, 이 개념이 clustering (클러스터링)에서 어떻게 사용되는지를 설명한다.
Graph mining, cut, clustring의 기초를 알기에 매우 적합한 자료이다.
Linear Regression
Simple Linear Regression
Multiple Linear Regression
Polynomial Regression
Non-Linear Regression
Support Vector Regression (SVR)
Decision Tree Regression
Random Forest Regression
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Edureka!
** Python Data Science Training : https://www.edureka.co/python **
This Edureka Video on Logistic Regression in Python will give you basic understanding of Logistic Regression Machine Learning Algorithm with examples. In this video, you will also get to see demo on Logistic Regression using Python. Below are the topics covered in this tutorial:
1. What is Regression?
2. What is Logistic Regression?
3. Why use Logistic Regression?
4. Linear vs Logistic Regression
5. Logistic Regression Use Cases
6. Logistic Regression Example Demo in Python
Subscribe to our channel to get video updates. Hit the subscribe button above.
Machine Learning Tutorial Playlist: https://goo.gl/UxjTxm
55. 모집단의 성질에 대해 표본을 통해 추정할 수 있는 것처럼,
표본의 성질에 대해서도 재표본을 통해 추정할 수 있다는 것이다.
즉 주어진 표본(샘플)에 대해서, 그 샘플에서 또 다시 샘플(재표본)을
여러번(1,000~10,000번, 혹은 그 이상)추출하여
표본의 평균이나 분산 등이 어떤 분포를 가지는가를 알아낼 수 있다.
Bootstrap
63. Averaging a set of observations reduces variance
Bootstrap
A slight change in sample,
Still a slight change in split
시험지 확인하러 가서 내 점수를 올려도
반 평균에는 큰 영향을 미치지 않는다
73. Tree
& advantages
1. 이해하기 쉽다: 씹고 뜯고 맛보고 즐기고 [White box]
2. 데이터 정제가 크게 필요하지 않다: 바로 넣자
3. numerical, categorical 가리지 않는다: 그냥 넣자
4. 데이터가 어떤 패턴인지 볼 때 편하다: 넣어봐