Machine Learning at LINE

Haruka Kikuchi
LINE / Machine Learning Team

He will talk about how machine learning and recommender engine technologies have been planned and implemented for LINE services, with examples and an overall picture.

Data Labs is an independent division, separate from the business departments, whose mission is company-wide use of data. It supports LINE services in many ways: high-level analysis by data scientists, infrastructure for recommender engines and analysis, and publication of various reports.

This session will focus on initiatives related to machine learning, in particular:

- How they define the roles and responsibilities of each team to provide various types of machine learning technologies
- What tactics are used to provide technologies to diverse services and a large number of users
- How they use trending technologies such as deep learning

Machine Learning at LINE

  1. 1. MACHINE LEARNING AT LINE Haruka Kikuchi, Data Labs
  2. 2. Agenda • Who We Are • Infrastructure • ML Examples
  3. 3. WHO WE ARE
  4. 4. DATA LABS ● Approx. 80 people total ● Independent from service/dev depts. ● Aggregate various data ● Provide platforms, tools, BI/reports, and ML solutions, e.g. recommender engines [Diagram labels: Sticker, Data Labs, Ad, Manga, Music, Live, News]
  5. 5. MACHINE LEARNING TEAM Roles: Project Manager, Server-side / Infra Engineer, Machine Learning Engineer ● ML engineers (multi-skilled) ● Stats, Math ● Deep Learning, NLP, etc. ● Some members play multiple roles
  6. 6. Services Supports 100+ Trainings per day Runs 1000+ Predictions per day Runs 10+ DAILY OUTPUT By Machine Learning Team
  7. 7. INFRASTRUCTURE
  8. 8. SYSTEM OVERVIEW
  9. 9. SYSTEM OVERVIEW
  10. 10. SYSTEM OVERVIEW
  11. 11. SYSTEM OVERVIEW
  12. 12. SYSTEM OVERVIEW
  13. 13. To Build ML Engines DEVELOPMENT ENVIRONMENT
  14. 14. To Test ML Logics AB TEST TOOLSET
  15. 15. SYSTEM OVERVIEW
  16. 16. ML EXAMPLES
  17. 17. CONTENT RECOMMENDATION
  18. 18. STICKER RECOMMENDATIONS Item2Item, User2Item
  19. 19. #jobs Approx. 5M #sticker packages 100M+ #users per region < 10 STICKER RECOMMENDATIONS
  20. 20. COLLABORATIVE FILTERING For Sticker Recommendations. Item2item: Purchase History -> Similarity among Items -> Top-N Items for Each Item. User2item: User Activity -> Preference -> Top-M Items for Each User
  21. 21. ML COMPUTATION Preprocessing (ETL) Calc. item2item Calc. user2item
  22. 22. Generated Revenue from The User2item Recommendation (within The Top Page) 25%+ PURCHASE
  23. 23. OTHER CONTENT RECOMMENDATIONS Sticker, etc., Manga, NEWS, Live, Part-time, Fortune-telling, Music, Store
  24. 24. USER RECOMMENDATION
  25. 25. RECOMMEND USERS (“LOOK A LIKE” AUDIENCE) To Expand Customers: Existing Customers -> Potential Customers. 200M #total LINE active users
  26. 26. LOTS OF MODELS Customers (Seed Users) Are Very Different 200M #total LINE active users
  27. 27. Relatively small #seed users 10M z-features subset #features 300 Trained models #daily jobs 100 1M For Training “LOOK A LIKE” AUDIENCE
  28. 28. SPARSE DNN Input z-features Dim: 10M Score (0 - 1) Dim: 1 (scalar) Output To Infer Potential Customers
  29. 29. SPARSE DNN Input Z-features Dim: 10M Score (0 - 1) Dim: 1 (scalar) Output To Infer Potential Customers
  30. 30. ML COMPUTATION Training Preprocessing (ETL) Inference
  31. 31. UX IMPROVEMENT
  32. 32. Label Semantic Tags to Sticker Images STICKER AUTO-SUGGEST
  33. 33. MANUAL LABELING
  34. 34. TAG COLLOCATION
  35. 35. TRANSFER LEARNING Start from Well-Trained Model. Pretrained: ImageNet dataset (input) -> Xception Model (trained) -> ImageNet Categories (output). Tuned: Sticker Images (input) -> Xception Model (tuned) + Additional layers (dense) -> Sticker Tags (approx. 350) (output)
  36. 36. ML COMPUTATION Train a model Preprocessing (ETL) Inference
  37. 37. EXAMPLES True Positives: labelled and predicted correctly. False Positives: not labelled but predicted. False Negatives: labelled but not predicted
  38. 38. “ ” TP FP FN Not labeled by the creator, but correctly inferred ● Language agnostic
  39. 39. “ ” TP FP FN
  40. 40. False Positives Are Acceptable to Suggest Potential Sticker Availability RECALL > PRECISION
  41. 41. CONTENT RECOMMENDATION REVISITED
  42. 42. To Cope with Cold Start Problem IMAGE-BASED RECOMMENDATION
  43. 43. TWO SIMILARITIES Semantics: Expressed as Tags. Appearance: Depends on Sticker Creators
  44. 44. TWO MODELS Semantics: Sticker Images (input) -> Xception Model (tuned) + Additional layers (dense) -> Sticker Tags (approx. 350) (output). Appearance: Sticker Images (input) -> Xception Model (tuned) + Additional layers (dense) -> Sticker Creators (1000+) (output)
  45. 45. ONE REPRESENTATION Per Sticker Image. Semantics: Sticker Images (input) -> Xception Model (tuned) + Additional layers (dense) -> Sticker Tags (approx. 350) (output). Appearance: Sticker Images (input) -> Xception Model (tuned) + Additional layers (dense) -> Sticker Creators (1000+) (output). Representation of each sticker image (feature vector) = concat of one vector from each model
  46. 46. ML COMPUTATION Train Model(s) Preprocessing (ETL) Calc. representations
  47. 47. IMAGE SIMILARITIES Target Origin Similar Less Similar More Semantic Less Semantic
  48. 48. EX. #1
  49. 49. EX. #2
  50. 50. EX. #3
  51. 51. EX. #4
  52. 52. CONCLUSION
  53. 53. ● Work with great infrastructure and people ● Allows us to focus on ML ● Design ML to scale by default ● Z-features (reusable, extensible) ● Computationally efficient algorithms ● Language agnostic algorithms HOW WE SCALE ML PROJECTS
  54. 54. ● Who we are ● Infrastructures ● Datalake + ML cluster ● ML examples ● Sticker recommendations ● DNN examples (“look a like” audience, stickers) PRESENTED
  55. 55. ● AB test in detail (presented separately) ● Audio DNN (poster) ● Sparse DNN, Contextual Bandits (poster) ● DNN on mobile (in progress) NOT PRESENTED
  56. 56. ● Virtually accessible to all the LINE services/data. ● Great coworkers ● All the positions are open ● ML engineer, Server/infra engineer, PM WE’RE HIRING
  57. 57. THANKS!
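
Slide 20 names two collaborative filtering outputs: top-N items for each item (item2item) and top-M items for each user (user2item). The slides do not give the formulas, so the following is only a minimal sketch under stated assumptions: similarity is cosine similarity over the sets of buyers, and user2item scores are obtained by summing the similarities of items a user already owns. The function names `item2item_top_n` and `user2item_top_m` and the toy purchase data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item2item_top_n(purchases, n=2):
    """purchases: dict user -> set of item ids.

    Returns dict item -> list of (other_item, similarity), best first.
    Similarity is cosine over buyer sets (an assumption, not from the slides).
    """
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)

    top = {}
    for a in buyers:
        scored = []
        for b in buyers:
            if a == b:
                continue
            common = len(buyers[a] & buyers[b])
            if common:
                sim = common / sqrt(len(buyers[a]) * len(buyers[b]))
                scored.append((b, sim))
        # Sort by similarity (descending), then by name for a stable order.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        top[a] = scored[:n]
    return top


def user2item_top_m(purchases, item_top_n, m=2):
    """Score unowned items by summing similarities to the items a user owns."""
    recs = {}
    for user, owned in purchases.items():
        score = defaultdict(float)
        for item in owned:
            for other, sim in item_top_n.get(item, []):
                if other not in owned:
                    score[other] += sim
        ranked = sorted(score.items(), key=lambda pair: (-pair[1], pair[0]))
        recs[user] = [item for item, _ in ranked[:m]]
    return recs


# Toy purchase history (hypothetical sticker package names).
purchases = {
    "u1": {"cat", "dog"},
    "u2": {"cat", "dog", "bear"},
    "u3": {"cat"},
    "u4": {"bear", "rabbit"},
}
similar = item2item_top_n(purchases, n=2)
recommended = user2item_top_m(purchases, similar, m=2)
```

At the scale quoted in the talk, the all-pairs loop would not be practical as written; it is kept here only to make the two outputs concrete.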
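
Slides 28 and 29 describe a sparse DNN that maps a 10M-dimensional z-feature input to a single score between 0 and 1. The architecture is not given in the transcript, so this sketch assumes one hidden layer with ReLU and a sigmoid output, and stores the input as a dictionary of non-zero entries so that the forward pass only touches features that are present. The function `score` and the hand-set weights are illustrative only.

```python
import math


def score(model, features):
    """Forward pass over a sparse input.

    features: dict {feature_index: value} holding only non-zero entries.
    model["w1"]: dict {feature_index: list of hidden-layer weights}.
    The layer sizes and activations are assumptions for illustration.
    """
    hidden = list(model["b1"])
    for index, value in features.items():
        row = model["w1"].get(index)
        if row is None:
            # Feature with no learned weights contributes nothing.
            continue
        for j, weight in enumerate(row):
            hidden[j] += weight * value
    hidden = [max(0.0, h) for h in hidden]  # ReLU
    logit = model["b2"] + sum(w * h for w, h in zip(model["w2"], hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid keeps the score in (0, 1)


# Hand-set weights for two of the (up to 10M) possible feature indices.
model = {
    "w1": {3: [1.0, -1.0], 9_999_999: [0.5, 2.0]},
    "b1": [0.0, 0.0],
    "w2": [1.0, -1.0],
    "b2": 0.0,
}
user_features = {3: 1.0, 9_999_999: 1.0}
user_score = score(model, user_features)
```

The cost of scoring one user depends on the number of non-zero features, not on the nominal input dimension, which is the point of keeping the input sparse.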
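
Slides 37 to 40 define true positives, false positives, and false negatives for sticker tags and state that recall matters more than precision. As a small illustration, the sketch below counts the three cases for one sticker and shows how lowering a score threshold trades precision for recall. The threshold values, tag names, and scores are made up for the example, and thresholding itself is an assumption about how predictions are turned into tags.

```python
def predict_tags(scores, threshold):
    """Keep every tag whose predicted score reaches the threshold."""
    return {tag for tag, s in scores.items() if s >= threshold}


def tag_metrics(labelled, predicted):
    """Compare creator-labelled tags with predicted tags for one sticker."""
    tp = len(labelled & predicted)   # labelled and predicted
    fp = len(predicted - labelled)   # not labelled but predicted
    fn = len(labelled - predicted)   # labelled but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


labelled = {"hello", "thanks"}
scores = {"hello": 0.9, "thanks": 0.4, "happy": 0.5, "sorry": 0.1}

strict = tag_metrics(labelled, predict_tags(scores, threshold=0.6))
loose = tag_metrics(labelled, predict_tags(scores, threshold=0.3))
```

With the lower threshold, the extra tag "happy" counts as a false positive, but both labelled tags are recovered. Slide 38 notes that such unlabelled predictions can still be correct inferences.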
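
Slides 45 to 47 combine a semantics vector and an appearance vector into one representation per sticker image and then compare images by similarity, which helps with cold start because no purchase history is needed. The transcript does not say how the vectors are normalised or which similarity is used, so this sketch assumes L2 normalisation of each part before concatenation and cosine similarity afterwards. The vectors and sticker names are toy values.

```python
from math import sqrt


def l2_normalize(vector):
    norm = sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def representation(semantic, appearance):
    """Concatenate the two vectors; normalising each part is an assumption."""
    return l2_normalize(semantic) + l2_normalize(appearance)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(origin, catalog, k=2):
    """Rank every other sticker by cosine similarity to the origin sticker."""
    scored = [(name, cosine(catalog[origin], vec))
              for name, vec in catalog.items() if name != origin]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy 2-d semantic and 2-d appearance vectors (hypothetical values).
catalog = {
    "new_cat": representation([1.0, 0.0], [1.0, 0.0]),
    "old_cat": representation([1.0, 0.0], [0.9, 0.1]),
    "old_dog": representation([0.0, 1.0], [1.0, 0.0]),
    "old_robot": representation([0.0, 1.0], [0.0, 1.0]),
}
neighbours = most_similar("new_cat", catalog, k=3)
```

Here "old_dog" shares only the appearance part with "new_cat", so it ranks between the sticker that matches on both parts and the one that matches on neither.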