Introduction of Data Science


My class presentation at USC. It introduces data science, machine learning, applications, recommendation systems, and infrastructure.

Published in: Data & Analytics

  1. Data Science Introduction. Jason Geng
  2. Agenda: What is big data; What is data science; Data science applications; System infrastructure; Case study: recommendation system
  3. Data Scientist: Analytics, Artificial Intelligence, Statistics, Natural Language Processing, Feature Engineering, Scientific Method, Simulation, Data & Text Mining, Machine Learning, Predictive Modeling, Graph Analytics, Data Management, Data Warehousing, Mashups, Databases, Business Intelligence, Big Data, Information Retrieval, Art & Design, Business Mindset, Computer Science, Visualization, Communication, Data Product Design, Domain Knowledge, Ethics, Privacy & Security, Programming, Cloud Computing, Distributed Systems, Technology & Infrastructure, Growth Hacking, Social Network, Public Relations, Online Tools, Resource
  4. Data Science Applications: Recommendation System, Self-driving, Text Cognition, Spam Filtering
  5.
  6. Machine Learning Algorithm. Supervised learning: Regression; Classification; Neural network, deep learning. Unsupervised learning: Clustering
  7. Recommendation System: A subclass of information filtering systems that seek to predict the “rating” or “preference” that a user would give to an item (Wikipedia)
  8. Case Study
  9. Algorithms: Collaborative filtering, Content-based recommendation, Learning to rank, Context-aware recommendation, Social network recommendation
  10. Collaborative Filtering. Basic Assumptions: • Users with similar interests have common preferences • A sufficiently large number of user preferences is available. Main Approaches: • User-based • Item-based
  11. User-based Filtering: Use the user-item rating matrix; Make user-to-user correlations; Find highly correlated users; Recommend items to
  12. Item-based Filtering: Use the user-item ratings matrix; Make item-to-item correlations; Find items that are highly correlated; Recommend items with the highest correlation
  13. Steps in item-based CF: Predicted rating for item 2 for user 1
  14. Problems with Collaborative Filtering: New-user cold start problem; New-item cold start problem; Popularity bias: tends to recommend only popular items; Sparsity problem: if there are many items to recommend, the user/rating matrix is sparse and it is hard to find users who have rated the same item
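
The user-based filtering steps on slide 11 can be sketched roughly as follows. This is only an illustration, not the presenter's implementation: the toy ratings, the names `RATINGS`, `pearson` and `recommend_user_based`, the choice of Pearson correlation, and the neighbourhood size `k` are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 4, "i2": 2, "i3": 5, "i4": 5},
    "u3": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}


def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def recommend_user_based(ratings, user, k=1):
    """Rank items the user has not rated, using the k most correlated users."""
    neighbours = sorted(
        ((pearson(ratings[user], r), other)
         for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # ignore uncorrelated or negatively correlated users
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend_user_based(RATINGS, "u1"))
```

In this toy data `u1` correlates positively with `u2` and negatively with `u3`, so only `u2`'s ratings contribute to the ranking. A new user with no ratings would get an empty list, which is the cold start problem listed on slide 14.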
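
Slide 13 refers to a predicted rating for item 2 for user 1, but the formula itself did not survive in the transcript. One common way to compute such a prediction in item-based CF is a similarity-weighted average of the user's own ratings; the sketch below assumes that approach, with cosine similarity between item columns. The matrix `R` and the names `item_cosine` and `predict` are hypothetical.

```python
from math import sqrt

# Hypothetical toy matrix: rows are users, columns are items; 0 means "not rated".
R = [
    [5, 0, 4],  # user 1 has not rated item 2
    [4, 5, 5],
    [1, 2, 1],
    [5, 4, 4],
]


def item_cosine(R, i, j):
    """Cosine similarity between item columns i and j over users who rated both."""
    pairs = [(row[i], row[j]) for row in R if row[i] and row[j]]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_i = sqrt(sum(a * a for a, _ in pairs))
    norm_j = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_i * norm_j)


def predict(R, user, item):
    """Similarity-weighted average of the user's ratings on the other items."""
    num = den = 0.0
    for j, rating in enumerate(R[user]):
        if j == item or rating == 0:
            continue  # skip the target item and items the user has not rated
        sim = item_cosine(R, item, j)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


# Indices are 0-based, so user 1 is row 0 and item 2 is column 1.
print(predict(R, user=0, item=1))
```

Because the prediction is a weighted average of ratings the user already gave, it always falls between that user's lowest and highest rating on the items considered. When the user has rated nothing, `predict` returns `None`, which again reflects the cold start and sparsity problems from slide 14.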