2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data / ML / AIoT / AI columnist
6. The most recommended dataset
• Where the independent variables are numerical and the dependent
variable is categorical
• The advantage of such a dataset also lies in its ease of clustering
• The preferred data type for the dependent variable is binary,
meaning it is either 'YES' or 'NO'
• When the number of independent variables exceeds two, the
accuracy may decrease
• The most commonly used algorithm is logistic regression
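
To make the points above concrete, here is a minimal sketch of logistic regression on a dataset with two numerical independent variables and a binary 'YES'/'NO' dependent variable. The toy data, the helper names (`standardize`, `fit_logistic`, `predict`), the learning rate, and the epoch count are all assumptions made for illustration. The model is trained with plain batch gradient descent in the Python standard library, not with any particular ML framework.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(X):
    # Rescale each column to zero mean and unit variance.
    n = len(X)
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def fit_logistic(X, y, lr=0.1, epochs=2000):
    # Batch gradient descent on the mean log-loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Map the predicted probability back to the binary label.
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "YES" if p >= 0.5 else "NO"

# Hypothetical toy data: two numerical independent variables per row.
X_raw = [[1.0, 1.5], [1.5, 1.0], [2.0, 1.0], [1.0, 2.0],
         [4.0, 4.5], [4.5, 4.0], [5.0, 5.0], [4.0, 5.0]]
labels = ["NO", "NO", "NO", "NO", "YES", "YES", "YES", "YES"]

X = standardize(X_raw)
y = [1 if lab == "YES" else 0 for lab in labels]  # encode YES/NO as 1/0

w, b = fit_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print("weights:", w, "bias:", b, "training accuracy:", accuracy)
```

The features are standardized before training so that gradient descent converges quickly with a fixed learning rate. On real data the accuracy should be measured on a held-out set, not on the training rows as done here.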