The talk covers:
- Why do companies need research?
- What do researchers do?
- Research at Yandex
- How did I get there?
- Our group's research: Distributed deep learning over the Internet
2. 2
Who am I?
Alexander Borzunov
• Researcher at Yandex
• NEERC ICPC 2017 prize winner
• Bachelor’s at Ural FU
• Master’s at HSE University + Yandex School of Data Analysis
3. 3
Plan
• Why do companies need research?
• What do researchers do?
• How to get there?
4. 4
Why do companies need research?
Product development:
• Developers address user feedback/business needs
• No time to dive deeply into a problem (e.g. to invent a new algorithm)
Research:
• Experts work on problems from a particular area full-time
• Necessary for innovation in the long term
5. 5
How is it different from universities?
Research in companies:
• More funding
• Access to more compute
• Interaction with product teams
9. 9
What do researchers do?
• Follow the latest findings and results (e.g. on Twitter)
• Choose promising research directions
• Collaborate with each other
• Conduct experiments (you need to write code quickly to evaluate many ideas)
• Design rigorous proofs
13. 13
What do researchers do?
If the method works:
• Write a paper for an (international) conference
• Defend it in a discussion with reviewers
• If accepted:
• Travel to a conference ✈️
• Tell the world about it on Twitter, Reddit, Habr, etc. 🌎
• Your results may be adopted by product teams
14. 14
Yandex Research
• Focus: machine learning and related algorithms
• Computer vision, image generation
• Language processing
• Program synthesis with neural nets (e.g. trained on Codeforces solutions)
• Systems for distributed training
• Theory, e.g. continuous optimization
• Publications in top venues such as NeurIPS, ICML, CVPR, ACL
17. 17
How did I get there?
2014 – 2018 Bachelor’s at Ural FU, participated in ICPC
▎ “What’s next?”
▎ “Machine learning – a growing field”
18. 18
Machine learning on “Cats vs. Dogs”
• 2007: No known method reaches 60% accuracy (random guessing gives 50%)
• 2014: Solved with 98% accuracy
• 2019: Neural nets can draw cats and dogs themselves (“this cat does not exist”)
21. 21
Machine learning in 2021
• 2019: Neural nets can draw cats and dogs themselves
• 2021: Neural nets draw pictures matching any text description
22. 22
How did I get there?
2018 – 2020 Master’s at HSE University + Yandex School of Data Analysis
▎ “Self-driving – a product that may change everyday life”
2019 – 2021 Research Engineer at Yandex Self-Driving
▎ “Research – a place where people invent new things”
2021 – Now Yandex Research
23. 23
What do I do?
• The compute needed to train the latest neural nets grows quickly
• Popular training methods are designed for high-performance clusters
• A cluster to train GPT-3 costs over $250 million
• Hard to get if you are at a university or a startup
• Solution: distributed training over the Internet (like BitTorrent)
24. 24
First use case: Language models
• Training one large neural net makes it possible to solve many tasks:
• Understanding intents, tone, logical relations from a sentence
• Answering questions
• Extracting entities (locations, persons, etc.)
• Once trained, it is easy to use for your business/research
25. First use case: Language model for Bengali
• Top-6 language by number of native speakers
• No good model yet
26. First use case: Language model for Bengali
• We invited people to train one together!
Together with:
• Got a competitive model, state-of-the-art on some tasks
27. Roadblock to scaling: Security
• To train a neural net, you need to average computations performed by peers on different data samples
• A troll or competitor may destroy the model by sending wrong values once
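To illustrate why a single bad contribution is enough, here is a minimal sketch of plain averaging. The function name `average_gradients` and all the numbers are made up for this example; real systems average tensors with millions of entries, but the arithmetic is the same.

```python
from statistics import fmean


def average_gradients(peer_gradients):
    """Element-wise mean of equally sized gradient vectors, one per peer."""
    return [fmean(column) for column in zip(*peer_gradients)]


# Three honest peers report similar (toy) gradients.
honest = [[0.10, -0.20], [0.12, -0.18], [0.08, -0.22]]
clean_update = average_gradients(honest)

# One malicious peer sends huge values a single time.
poisoned = honest + [[1e9, -1e9]]
poisoned_update = average_gradients(poisoned)

print("clean:", clean_update)
print("poisoned:", poisoned_update)
```

With only honest peers the average stays close to 0.1 and -0.2. With one attacker among four peers, it jumps to the order of 10^8, so a single training step with this update would likely wreck the model weights.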
29. 29
Secure distributed training
Idea #2:
• Peers broadcast hashes of their calculations.
• Then, the system selects “policemen” to validate results of some peers.
• If a policeman accuses someone, we can learn who is right from the hashes.
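The hash-and-validate idea can be sketched roughly as follows. This is a simplified illustration, not the actual protocol: it assumes every peer's data batch is known to the validator and that the computation is deterministic, and for brevity it checks every peer rather than a selected few. `compute_gradient` is a hypothetical stand-in for a real gradient computation, and the peer names are invented.

```python
import hashlib
import json


def gradient_hash(gradient):
    """Hash a gradient deterministically (values rounded before hashing)."""
    payload = json.dumps([round(g, 6) for g in gradient]).encode()
    return hashlib.sha256(payload).hexdigest()


def compute_gradient(batch):
    """Hypothetical toy computation standing in for a real gradient."""
    return [sum(batch) / len(batch), max(batch) - min(batch)]


def validate(batch, broadcast_hash):
    """A validator recomputes the result and compares it with the hash."""
    return gradient_hash(compute_gradient(batch)) == broadcast_hash


batches = {"alice": [1.0, 2.0, 3.0], "mallory": [4.0, 5.0, 6.0]}

# Each peer broadcasts a hash of what it claims to have computed.
broadcast = {
    "alice": gradient_hash(compute_gradient(batches["alice"])),
    "mallory": gradient_hash([1e9, -1e9]),  # bogus values, not her batch
}

# Recompute and compare: a mismatch exposes the dishonest peer.
verdicts = {peer: validate(batches[peer], h) for peer, h in broadcast.items()}
print(verdicts)
```

Because the hash is published before validation, a peer cannot change its answer after being challenged, and anyone can repeat the check to see whether the accuser or the accused is right.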
31. 31
Thank you!
Check out our publications and available positions on research.yandex.com
I am available for a chat or questions at the Yandex area on the 3rd floor terrace until 7 pm 🙂