How To Research


Find Ideas, Do Research, Publish Papers, Earn Degree! Charles Ling (凌晓峰), Professor.

Published in: Education, Technology

  1. 1. Find Ideas, Do Research, Publish Papers, Earn Degree! Charles Ling (凌晓峰), Professor, PhD (U of Pennsylvania), University of Western Ontario, Canada (加拿大西安大略大学) [email_address]
  2. 2. How to Find Research Topics and Do Good Research. Charles Ling (凌晓峰), Professor, PhD (U of Pennsylvania), University of Western Ontario, Canada (加拿大西安大略大学) [email_address]
  3. 3. <ul><li>Beginner: re-implement a recent, published work </li></ul><ul><li>How to expand research… the area itself may not have a breakthrough </li></ul>
  4. 4. To Become a Researcher <ul><li>Research career </li></ul><ul><li>Scientific discovery </li></ul><ul><ul><li>Not commercialization </li></ul></ul><ul><ul><li>Different from engineering </li></ul></ul><ul><li>Life as a researcher: are you suited to it? </li></ul><ul><ul><li>Creative & critical, focused, diligent, persistent, … </li></ul></ul><ul><li>Graduate student → PhD → researcher (large companies) or professor (universities) </li></ul>
  5. 5. First Step: MSc/PhD Candidates <ul><li>You are becoming a researcher (with support) </li></ul><ul><ul><li>Finding research topics and ideas </li></ul></ul><ul><ul><li>Doing good research </li></ul></ul><ul><ul><li>Writing and publishing papers (peer-reviewed) </li></ul></ul><ul><ul><li>Completing an MSc/PhD thesis; applying for grants </li></ul></ul><ul><li>Two things help young researchers get started </li></ul><ul><ul><li>The Internet </li></ul></ul><ul><ul><li>Anonymous authorship in paper submission </li></ul></ul><ul><li>Probably best to obtain a PhD in North America </li></ul>
  6. 6. 1. Finding Research Ideas <ul><li>(Theoretical) Research: discover new knowledge </li></ul><ul><ul><li>New, sound, significant and better, high impact </li></ul></ul><ul><li>(Applied research: killer applications) </li></ul><ul><li>Read the latest proceedings and journals </li></ul><ul><li>The Internet makes things much easier </li></ul><ul><ul><li>Google Scholar, authors’ homepages, emailing authors </li></ul></ul><ul><li>Attend the latest conferences, chat with people </li></ul><ul><li>Relax: walk on the beach, drink some beer… </li></ul>
  7. 7. Finding Research Ideas <ul><li>The “attitude” when you read papers </li></ul><ul><ul><li>Reading reverently and merely following others (拜读, 跟踪): NO!!! </li></ul></ul><ul><li>30% understanding, 70% critical/creative thinking </li></ul><ul><ul><li>What is wrong? [Example] </li></ul></ul><ul><ul><li>What else? Why not? How to do better? </li></ul></ul><ul><ul><ul><li>Many possibilities (creativity: no unique correct answer) </li></ul></ul></ul><ul><ul><ul><li>From small to major extensions </li></ul></ul></ul><ul><ul><ul><li>“Surprising effect” (simple but creative) [Example] </li></ul></ul></ul><ul><ul><ul><li>High impact: affecting future research & applications </li></ul></ul></ul>
  8. 8. Proper Model Selection with Significance Test (ECML’08) <ul><li>Abstract: … A common practice is to use the same evaluation metric as the goal one. However, in several recent studies, it is claimed that … In this paper, we point out a potential problem in the experiment design of those studies, and propose an improved method to test the claim. Our extensive experiments show convincingly that only the goal metric itself can most reliably select the correct models. </li></ul><ul><li>Introduction: … We hope that with the proper model selection method proposed here, we can settle this controversial issue once for all. </li></ul>
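The slide above quotes only the abstract, so the paper's actual test procedure is not shown here. As a generic illustration of what "comparing two models with a significance test" can look like, here is a minimal sketch of an exact paired sign-flip (permutation) test on per-fold scores. The function name and the fold scores are hypothetical, and this is not claimed to be the method proposed in the ECML’08 paper.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_p_value(metric_a: Sequence[float], metric_b: Sequence[float]) -> float:
    """Two-sided exact sign-flip test on paired per-fold metric values.

    Under the null hypothesis that both models are equally good, the sign of
    each per-fold difference is arbitrary, so every sign pattern is enumerated.
    """
    diffs = [a - b for a, b in zip(metric_a, metric_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point noise does not drop exact ties.
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical goal-metric values of two models on the same 8 folds.
fold_scores_a = [0.82, 0.85, 0.81, 0.88, 0.84, 0.86, 0.83, 0.80]
fold_scores_b = [0.80, 0.82, 0.80, 0.84, 0.82, 0.83, 0.81, 0.79]

p_value = paired_sign_flip_p_value(fold_scores_a, fold_scores_b)
print(f"p-value: {p_value:.4f}")
```

With these toy numbers, model A wins on every fold, so only the all-plus and all-minus sign patterns are as extreme as the observed result and the p-value is 2/256. Enumeration is exponential in the number of folds, so this sketch only suits small fold counts.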
  9. 9. When does Co-Training Work in Real Data? (PAKDD’09) <ul><li>Abstract : … However, little work has been done on how these assumptions can be empirically verified given datasets. In this paper, we first propose novel approaches to verify empirically … We then propose simple heuristic to split a single view of attributes into two views, and discover regularity on … Our empirical results not only coincide well with the previous theoretical findings, but also provide a practical guideline to decide when co-training should work well based on datasets. </li></ul>
  10. 10. Finding Research Ideas <ul><li>Reading group: read the abstract, then brainstorm </li></ul><ul><ul><li>What is wrong? </li></ul></ul><ul><ul><li>What would I do? </li></ul></ul><ul><ul><li>How would I do better? </li></ul></ul><ul><li>Different levels of discussion </li></ul><ul><li>Take notes on new ideas; do not criticize them </li></ul>
  11. 11. From Ideas to Thesis Topics <ul><li>Start with some small and interesting ideas </li></ul><ul><li>Expand to a broad topic with related problems </li></ul><ul><li>Both supervisor and student play a role </li></ul><ul><li>Pay attention to “hot topics”: invited talks at conferences, special issues of journals </li></ul><ul><li>Make a plan (like a thesis outline) </li></ul><ul><ul><li>Be flexible with the plan </li></ul></ul><ul><ul><li>Stay focused on the plan </li></ul></ul>
  12. 12. An Example: Measures for ML <ul><li>Review: previous work on measures </li></ul><ul><li>AUC: a single-number measure </li></ul><ul><ul><li>Criteria for comparing measures </li></ul></ul><ul><ul><li>Theorems for AUC and accuracy </li></ul></ul><ul><ul><li>Experimental verification </li></ul></ul><ul><li>Optimizing learning algorithms with AUC </li></ul><ul><ul><li>AUC-oriented decision trees </li></ul></ul><ul><li>Comparing AUC with other measures </li></ul><ul><li>Constructing new measures </li></ul><ul><li>Discussions and Conclusions </li></ul>
  13. 13. An Example: Measures for ML <ul><li>Review: previous measures </li></ul><ul><ul><li>Accuracy, AUC, ROC </li></ul></ul><ul><ul><li>Measures for orders </li></ul></ul><ul><ul><li>Profit, cost-sensitive learning </li></ul></ul><ul><ul><li>Recall, precision, combination </li></ul></ul><ul><ul><li>Many others in NL retrieval, engineering </li></ul></ul><ul><li>AUC: a single-number measure </li></ul><ul><ul><li>Criteria for comparing measures </li></ul></ul><ul><ul><li>Prove theorems for AUC and accuracy (IJCAI’03) </li></ul></ul><ul><ul><li>Relation between C and D (5 % measures) **** </li></ul></ul><ul><ul><li>Experimental verification (CAI’03) </li></ul></ul><ul><ul><li>Comparing algorithms with AUC (ICDM’03) </li></ul></ul><ul><ul><li>AUC vs profit: **** (TKDE’05) </li></ul></ul>
  14. 14. <ul><li>Optimizing AUC *** </li></ul><ul><ul><li>AUC-oriented decision trees </li></ul></ul><ul><ul><li>AUC-oriented SVM </li></ul></ul><ul><ul><li>AUC-oriented neural networks (submitted) </li></ul></ul><ul><ul><li>General theorems about optimizing AUC </li></ul></ul><ul><li>Other measures and comparison **** </li></ul><ul><ul><li>(New) Super AUC measure (sub ’06) </li></ul></ul><ul><ul><li>Compare and connect other measures </li></ul></ul><ul><ul><li>Rank measures (ECML’05) </li></ul></ul><ul><ul><li>Proof or experiments with other measures (NL, …) </li></ul></ul><ul><li>Discussions and Conclusions </li></ul>
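To make the "AUC vs. accuracy" comparison in the example outline above concrete, here is a minimal sketch using the standard pairwise definition of AUC (the fraction of positive/negative pairs that are ranked correctly). The toy labels and scores are hypothetical and are not taken from the cited papers.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of examples whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two classifiers scoring the same six examples.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
scores_b = [0.9, 0.8, 0.1, 0.95, 0.2, 0.15]

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, "accuracy:", accuracy(labels, scores), "AUC:", auc(labels, scores))
```

Both classifiers make the same two mistakes at threshold 0.5, so their accuracy is identical (4/6), yet their AUC differs (8/9 for A, 4/9 for B) because B ranks its errors far worse. This is the kind of distinction a single-number ranking measure can capture and accuracy cannot.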
  15. 15. 2. Doing Good Research <ul><li>Not an easy task: must make new contributions </li></ul><ul><ul><li>Know the state of the art well (write a survey paper) </li></ul></ul><ul><ul><li>Many people may have tried your (new) ideas but failed (failures are not usually published) </li></ul></ul><ul><ul><li>Usually takes great effort to get it to work </li></ul></ul><ul><ul><li>Must be very diligent and thorough </li></ul></ul><ul><ul><ul><li>Implement your own and others’ algorithms </li></ul></ul></ul><ul><ul><ul><li>Develop new theory and run many experiments </li></ul></ul></ul><ul><ul><ul><li>Verify from different aspects: how would others challenge me? </li></ul></ul></ul>
  16. 16. 3. Writing and Publishing Papers <ul><li>Conference papers: yearly, so fast, with submission deadlines </li></ul><ul><li>Journal papers: archival, so more complete </li></ul><ul><ul><li>Dealing with rejection and major revision </li></ul></ul><ul><li>Paper writing (another talk) </li></ul><ul><ul><li>English is not our worst enemy </li></ul></ul><ul><ul><li>Convincing results, clear presentation </li></ul></ul><ul><ul><li>Work with your supervisor… </li></ul></ul>
  17. 17. Convincing: Logic of a Paper <ul><ul><li>Problem X is important </li></ul></ul><ul><ul><li>Previous work A, B, … has been done, but it has certain weaknesses </li></ul></ul><ul><ul><li>We propose a new method Z… </li></ul></ul><ul><ul><li>We conduct experiments comparing Z to A and B, and show Z is better </li></ul></ul><ul><ul><li>Why is Z better? Why didn’t C, D work? What are the strengths and weaknesses of Z? </li></ul></ul><ul><ul><li>Conclusions and future work of Z </li></ul></ul>
  18. 18. Clarity: Structure of a Paper <ul><li>Title: a 10-word paper </li></ul><ul><li>Abstract: a 200-word paper, very high level. Emphasize contributions and significance. Omit details and avoid technical terms </li></ul><ul><li>Introduction: a 2-page paper, high level. Emphasize background and motivation </li></ul><ul><li>(Review of Previous Work) </li></ul><ul><li>Our Work </li></ul><ul><li>Experiments and Comparisons </li></ul><ul><li>Relation to Previous Work </li></ul><ul><li>Discussions </li></ul><ul><li>Conclusions: a 200-word paper, summary & future work </li></ul><ul><li>References, Appendix </li></ul>The sections in between expand on various parts (also top-down structure)
  19. 19. 6 Typical Mistakes in Paper Writing <ul><li>My paper must be hard to understand </li></ul><ul><li>I must be formal </li></ul><ul><li>My paper is big and perfect </li></ul><ul><li>I am modest/the greatest </li></ul><ul><li>English is my only problem </li></ul><ul><li>Reviewers are evil </li></ul><ul><li>(A separate topic) </li></ul>
  20. 20. 4. Towards a PhD Thesis… <ul><li>Start early: survey the area, design a plan </li></ul><ul><li>Submit at least 2 conference papers and 1 journal paper each year </li></ul><ul><li>PhD thesis: a collection of these papers! </li></ul>
  21. 21. [Figure: PhD timeline from 2002.9 to 2006.6. A survey of IJCAI, JMLR, KDD, ECML, AAAI, and MLJ (95-02) leads to a survey paper; papers at CAI 03, PKDD 03, IJCAI 04, ICML 04, KDD 04, JMLR 04, ICML 05, IJCAI 05, and IEEE TKDE, plus an on-line system, lead to the PhD thesis]
  23. 23. Applying for a PhD in North America <ul><li>Carefully review the requirements on each university's website </li></ul><ul><li>Most Canadian universities need: </li></ul><ul><ul><li>Good marks on transcripts (background) </li></ul></ul><ul><ul><li>Good English test score (English) </li></ul></ul><ul><ul><li>Strong recommendation letters (research) </li></ul></ul><ul><ul><li>Usually no GRE is needed </li></ul></ul><ul><li>Don’t be afraid to ask questions </li></ul><ul><li>Make your application as strong as possible </li></ul><ul><ul><li>Show your research abilities </li></ul></ul>
  24. 24. Applied Research/Building Systems <ul><li>Find a killer application!!! </li></ul><ul><li>Keep in mind the grand vision and end users </li></ul><ul><li>Divide and conquer: sub-goals and milestones </li></ul><ul><li>Application-driven research: new algorithms, experiments, scaling up, user studies, … </li></ul><ul><li>Still publish, and allow others to use your system </li></ul><ul><ul><li>Should you worry about being copied? In general, no. </li></ul></ul><ul><ul><ul><li>You are always the best expert, one step ahead </li></ul></ul></ul><ul><ul><ul><li>For research purposes only, getting feedback </li></ul></ul></ul><ul><ul><ul><li>You gain reputation and recognition </li></ul></ul></ul><ul><li>You may want to start a new company! </li></ul>
  25. 25. Summary <ul><li>Research: new, significant, high-impact </li></ul><ul><li>MSc and PhD candidates: supervised researchers </li></ul><ul><li>Think critically and creatively </li></ul><ul><li>Start with a small idea, survey the area, build it up </li></ul><ul><li>Thesis: have a plan early and focus on it </li></ul><ul><li>Publish yearly in the best conferences and journals </li></ul><ul><li>Applied research: killer applications </li></ul><ul><li>The Internet and anonymous authorship make it fair </li></ul><ul><li>Being a researcher or professor is a great career </li></ul>
  26. 26. Wish you success in research!! Thanks!
  27. 27. Advanced Data Mining and Applications <ul><li>Prof. Charles X. Ling </li></ul><ul><li>PhD (U of Pennsylvania) </li></ul><ul><li>University of Western Ontario, Canada </li></ul><ul><li>(西安大略大学) </li></ul><ul><li>http://cling/ </li></ul>
  28. 28. Course Plan <ul><li>Wed: Charles Ling, WEKA algorithms and software </li></ul><ul><li>Thurs: Qiang Yang, DM algorithms and applications </li></ul><ul><li>Fri: Qiang Yang, DM algorithms and applications </li></ul><ul><li>Monday: Charles Ling/Qiang Yang: Applications </li></ul><ul><li>Tues: Wrap up and exam! </li></ul>
  29. 29. WEKA <ul><li>Textbook: Data Mining: Practical Machine Learning Tools and Techniques (Second Edition), by Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2005. </li></ul><ul><li>Software (v4.11): </li></ul><ul><li>Lecture notes </li></ul>
  30. 30. Keys in DM Applications <ul><li>Many simple algorithms work quite well </li></ul><ul><li>Crucial: converting real-world problems into DM problems </li></ul><ul><ul><li>Use WEKA directly </li></ul></ul><ul><ul><li>Modify/improve WEKA </li></ul></ul><ul><li>DM: a lot of science and a bit of art </li></ul><ul><ul><li>Must understand how the algorithms work </li></ul></ul>
  31. 32. Research Methodologies <ul><li>Prof. Charles X. Ling </li></ul><ul><li>PhD (U of Pennsylvania) </li></ul><ul><li>University of Western Ontario, Canada </li></ul><ul><li>(西安大略大学) </li></ul><ul><li>[email_address] </li></ul>
  32. 33. Becoming a Researcher & Research Methodologies <ul><li>Professor Charles X. Ling </li></ul><ul><li>PhD (U of Pennsylvania) </li></ul><ul><li>University of Western Ontario, Canada </li></ul><ul><li>[email_address] </li></ul>
  33. 34. What Do Researchers/Scientists Do? <ul><li>Create new ideas, invent, discover… </li></ul><ul><ul><li>Potentially useful in the long run </li></ul></ul><ul><ul><li>Not merely engineering or applications </li></ul></ul><ul><ul><li>Not for short-term profit (not products) </li></ul></ul>
  34. 35. Research Support <ul><li>Mostly government grants and funding (NSF, NSERC, NASA, etc.) </li></ul><ul><li>Based purely on merit of your research </li></ul>
  35. 36. Research Environment <ul><li>Universities and research institutes </li></ul><ul><li>Teaching is light </li></ul><ul><ul><li>Graduate teaching is useful for research </li></ul></ul><ul><li>Academic freedom: you are your own “boss” </li></ul><ul><li>Supported mostly by government funding </li></ul><ul><li>Tenure system: a lot of free time to explore </li></ul><ul><li>Supervise graduate students </li></ul><ul><li>Publish papers </li></ul>
  36. 37. To Become A Researcher… <ul><li>If you are creative and want to try new ideas… </li></ul><ul><li>Must be able to focus and think deeply </li></ul><ul><li>Creativity is important </li></ul><ul><li>Get an MSc/PhD at a good university </li></ul><ul><li>Become a researcher/scientist/professor </li></ul>