
A.Levenchuk -- Machine learning engineering


Talk "Machine Learning Engineering" by Anatoly Levenchuk at DeepHack in PhyzTech, 2-Feb-2016

Published in: Technology


  1. 1. Machine Learning Engineering Anatoly Levenchuk Copyright © 2016 by Anatoly Levenchuk. Permission granted to DeepHack and INCOSE to publish and use.
  2. 2. What is machine learning as a human activity? • An ontological question (Aristotelian definition: via class-subclass specialization) • Why is it important? – How to pay for it? [grants, investments, charity] – How to teach it? [science, engineering, arts/crafts] – How to name and distinguish it in communication (hiring, participation in the division of labor)? What do you call yourself when talking to colleagues while hacking on a machine learning system?
  3. 3. Machine learning is a… • Science! MSc. in Machine Learning [BigData] • Research? • Engineering? • Art? Programming is a… • Science? Computer science, MSc. • Research? Computer science, MSc. • Engineering? Software engr., MSc. and MSE! • Art? Master of Arts in Mathematics! http://www.computer.org/web/education/professional-competency-certifications https://www.kaggle.com/competitions
  4. 4. Test: Why is my program not working? • You need to know why? – To repair the compiler? Software engineer (systems) – To advance theory? Computer scientist • You need the program to work properly? Software engineer (application)
  5. 5. Science: results in models, descriptions (theories), ontologies: • M0 – manufacturing (not science!). Programmers are engineers: software is a physical system! • M1 – design/applied research (Edison) • M2 – basic research (Einstein) • M3 – philosophical logic/mathematics • There are multiple meta-levels. • Scientists produce these meta-descriptions
  6. 6. Engineering • Engineering is the discipline, art, skill and profession of acquiring and applying scientific, mathematical, economic, social, and practical knowledge in order to design and build structures, machines, devices, systems, materials and processes that safely realize improvements to the lives of people. • Engineering is the application of mathematics, empirical evidence and scientific, economic, social, and practical knowledge in order to invent, innovate, design, build, maintain, research, and improve structures, machines, tools, systems, components, materials, and processes. https://en.wikipedia.org/wiki/Engineering https://en.wikipedia.org/wiki/Outline_of_engineering
  7. 7. Data scientists – ML Engineers [Diagram labels: Model/Theory [metamodel]; Engineering/Applied Research; Reality/Data/Model; Science/Basic Research] Unless the question is about budgeting and social status, there is no need to distinguish science from engineering! Practice both of them!
  8. 8. Engineering for science http://blogs.nvidia.com/blog/2016/01/12/accelerating-ai-artificial-intelligence-gpus/ Scientists are mere owner-operators of instruments. Who built the Large Hadron Collider? Experiments are ordered by scientists, built and carried out by engineers, and interpreted by scientists.
  9. 9. The sunset of the professions, not jobs! Profession: • Life-long • Special education • No other professions in the mix Competency: • Several years long • Additional training • One competence in the mix Machine learning engineering is not a profession. It is a competency!
  10. 10. Machine learning (systems) engineering • Control (systems) engineering • Machine Learning (systems) engineering ? http://www.payscale.com/research/US/Job=Controls_Engineer/Salary • Systems Engineer (IT) • Cognitive/Machine Intelligence Systems Engineer ?
  11. 11. What about jobs? Algorithms + Data Structures = Programs (Niklaus Wirth). A scientist is not an engineer; data is not a system.
  12. 12. Kinds of engineering • Mechanical engineering • Agricultural engineering • Aerospace engineering – aircraft architecture • Systems engineering • System of systems engineering • … • Software engineering • Control [systems] engineering – control [system] architecture • Knowledge engineering – architecture • Machine learning [system] engineering • … • Neural engineering • Neural network engineering – neural [network] architecture • Feature engineering – ???
  13. 13. Systems, Software, and Machine Learning Engineering • Systems engineering [Bell Labs in the 1940s, boosted as a profession by NCOSE in 1990] • Software engineering [term appeared in 1965, boosted as a profession by NATO in 1968] • Machine learning engineering [term appeared in 2011] https://www.google.com/trends/explore#q=machine%20learning%20engineering&cmpt=q&tz=Etc%2FGMT-3
  14. 14. Convergence of engineering disciplines and disruption of engineering disciplines: Software Engineering + Machine Learning Engineering. Janos Sztipanovits, «Convergence: Model-Based Software, Systems And Control Engineering» http://www.infoq.com/presentations/Model-Based-Design-Janos-Sztipanovits Léon Bottou, «Machine Learning disrupts software engineering» http://leon.bottou.org/slides/2challenges/2challenges.pdf We can add: • Machine learning disrupts systems engineering • Machine learning disrupts control engineering • … • Machine learning disrupts contemporary engineering
  15. 15. Can we use systems and software engineering wisdom in MLE? Léon Bottou http://leon.bottou.org/slides/2challenges/2challenges.pdf • Models as modules: problematic due to weak contracts (models behave differently on different input data) • Learning algorithms as modules: problematic because the output depends on the training data, which itself depends on every other module. Engineering is not only about modularity and modular synthesis! What about other aspects?! • More attention to the left part of the V-diagram • Optimizations later • … • What else?
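
The "weak contracts" point can be made concrete with a toy sketch. Everything below is hypothetical and not from the talk: a one-dimensional threshold classifier (the function names and data are assumptions for illustration) is trained on two different datasets and then asked the same question. The module interface is identical in both cases, yet the answer differs, because the behavior is set by the training data rather than by the code.

```python
def train_threshold(samples):
    """Fit a toy 1-D classifier: the threshold is the midpoint of the two class means."""
    positives = [x for x, label in samples if label == 1]
    negatives = [x for x, label in samples if label == 0]
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2.0


def predict(threshold, x):
    """Same interface for every trained model: returns 1 if x is at or above the threshold."""
    return 1 if x >= threshold else 0


# Two hypothetical training sets for the "same" module.
data_a = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]  # class means 1.5 and 6.5
data_b = [(3.0, 0), (4.0, 0), (8.0, 1), (9.0, 1)]  # class means 3.5 and 8.5

model_a = train_threshold(data_a)
model_b = train_threshold(data_b)

query = 5.0
answer_a = predict(model_a, query)
answer_b = predict(model_b, query)
print(answer_a, answer_b)
```

A conventional software module with this signature would be expected to return the same result for the same input; here the contract only holds relative to a particular training set.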
  16. 16. Technical Debt. Machine Learning: The High-Interest Credit Card of Technical Debt http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf Hidden Technical Debt in Machine Learning Systems http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf • Hack now, pay later (with interest, of course!). • Based on heuristics from software engineering (the same approach as ours: applying software and systems engineering wisdom to machine learning engineering). • A set of domain-specific heuristics for machine learning
  17. 17. Bionics and machine learning systems engineering • In short: the brain is only an inspiration, not a model to reproduce! • There are other “learning systems engineerings”, e.g. neural engineering (https://en.wikipedia.org/wiki/Neural_engineering). • AGI (artificial general intelligence) is a distant goal, but a magnet for cranks of all sorts. Better not to mention it. • Biologically plausible machine learning is about science, not engineering.
  18. 18. Knowledge engineering • Ontology engineering (manual) • Solutions are programmed (manually). • Example: robot-«butterfly», https://youtu.be/kyvW5sOcZHU, https://youtu.be/V30e77x8BQA – Every type of movement has to be programmed anew – Cannot adapt to changes in the environment or the device – The best science available to date! – Perfect if a cyber-physical system (CPS) performs only one or two movements. Not for robots, definitely! • No learning!
  19. 19. Tribes Shallow Learning Big Data Deep Learning Neuroevolution Bayes Army Symbolic
  20. 20. Our definition of complexity. A complex system is one that does not fit in a single engineer’s head, so team collaboration and automation of knowledge work are mandatory. E.g.: • Aircraft • Programming-in-the-small vs. programming-in-the-large • VLSI – very large scale integration, more than 1000 transistors on a single chip (now the transistor count exceeds 20 billion – FPGA Virtex-Ultrascale XCVU440) • Artificial neural network – 16 billion parameters.
  21. 21. Complexity • Systems Engineering • Machine learning. Complex system: its development does not fit into one (or a hundred) heads. Stellarators, tokamaks, the LHC, aerospace and VLSI engineering. IBM Watson (up to 2011): a team of 40. Still not very complex from an engineering point of view. http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/ http://787updates.newairplane.com/787-Suppliers/World-Class-Supplier-Quality
  22. 22. CNN architecture/complexity growth [Figure: CNN architectures side by side. Dates: 1998, 2012, 9/2014, 2/2015, 12/2015, 9/2014. Labels: LeNet 28x28, LeNet 28x28, VGG 224x224, GoogLeNet 224x224, Inception V3 299x299, Inception BN 224x224] http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/
  23. 23. AutoML • Generative design/architecting of networks • Bayesian convergence • Neuroevolution • Dynamic neural description languages (e.g. Chainer). Automation of machine learning, CAMLE (computer-aided machine learning engineering), is the main trend of today and tomorrow!
  24. 24. Master Algorithm. Pedro Domingos [module/construction] • Symbolic • Evolution • Connectionist • Bayesian • Analogy [No free lunch!] Sarath Chandar [component/function] • multi-task learning • transfer learning • zero-shot/one-shot learning • multi-modal learning • reinforcement learning http://apsarath.github.io/2016/01/19/agi/ http://www.amazon.com/dp/0465065708/
  25. 25. The intellect-stack is about only one aspect of a whole intellect system. The intellect-stack is about platforms (modules) = «how to make it». Based on Fig. 3 of ISO 81346-1: - Modules, = Components, + Allocations. Modules and interfaces: platforms/layers. Stack
  26. 26. Platform • This is the module viewpoint («how to make») • A platform is a technology stack layer • A cohesive set of modules with a published API • Can be built on top of another platform
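
The platform idea above (a cohesive set of modules with a published API, built on top of another platform) can be sketched in a few lines. This is only an illustration under stated assumptions: the class names `ComputationalLibrary` and `LearningAlgorithmPlatform` borrow layer names from the intellect-stack, but the methods `dot` and `linear_score` are hypothetical and do not correspond to any real library.

```python
class ComputationalLibrary:
    """Lower-layer platform: publishes a tiny numeric API (hypothetical)."""

    def dot(self, a, b):
        # Inner product of two equal-length sequences of numbers.
        return sum(x * y for x, y in zip(a, b))


class LearningAlgorithmPlatform:
    """Upper-layer platform: uses only the published API of the layer below."""

    def __init__(self, library):
        self._library = library

    def linear_score(self, weights, features):
        # The upper layer does not know how dot() is implemented
        # (CPU, GPU, or anything else); it relies on the interface only.
        return self._library.dot(weights, features)


stack = LearningAlgorithmPlatform(ComputationalLibrary())
score = stack.linear_score([0.5, -1.0], [2.0, 1.0])
print(score)
```

Swapping the lower layer for another object with the same `dot` method would leave the upper layer untouched, which is the point of a published API between layers.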
  27. 27. Intelligence Platform Stack and machine learning engineering in it • Application (domain) Platform • Cognitive Architecture Platform • Learning Algorithm Platform • Computational library • General Computer Language • CPU • GPU/FPGA/Physical computation Drivers • GPU/FPGA/Physical computation Accelerator • Neurocompiler • Neuromorphic driver • Neuromorphic chip [Arrows: Disruption enablers; Disruption demand] Thanks to computer gamers, whose disruption demand gave us a disruption enabler such as the GPU!
  28. 28. Alternative deep learning stack (as viewed by GPU hardware people) http://www.nextplatform.com/2015/12/07/gpu-platforms-emerge-for-longer-deep-learning-reach/ • No cognitive and application levels • Languages are unimportant • Chassis, backplane, and blades are important (a separate layer) • No neuromorphic processing
  29. 29. Hardware Acceleration (except GPU). Is this machine learning engineering? No! But… • Algorithm-dependent • Needs compilation (drivers) • Speed rules • Power rules • Scale rules • GPU • FPGA • ASIC • Neuromorphic chips • Physical computing http://lighton.io/ • Approximating kernels at the speed of light http://arxiv.org/abs/1510.06664 An analog optical device that performs random projections literally at the speed of light without having to store any matrix in memory. This is achieved using the physical properties of multiple coherent scattering of coherent light in random media. • Towards Trainable Media: Using Waves for Neural Network-style Learning • Bitwise Neural Networks http://arxiv.org/abs/1601.06071 • Conversion of Artificial Recurrent Neural Networks to Spiking Neural Networks for Low-power Neuromorphic Hardware http://arxiv.org/abs/1601.04187 http://arxiv.org/abs/1510.03776
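
To give a feel for what "approximating kernels with random projections" means in software terms, here is a minimal sketch of a related, well-known technique: random Fourier features for the Gaussian (RBF) kernel. This is a swapped-in illustration, not the optical scheme of the cited paper, and the function names, the feature count of 2000, and the test vectors are all assumptions. In the optical device the random projection would be done by light scattering; here it is an explicitly stored random matrix.

```python
import math
import random


def rbf_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def make_random_features(dim, n_features, sigma=1.0, seed=0):
    """Build a feature map z such that z(x) . z(y) approximates rbf_kernel(x, y)."""
    rng = random.Random(seed)
    # Random projection directions drawn from N(0, 1 / sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phase offsets drawn uniformly from [0, 2 * pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return features


features = make_random_features(dim=3, n_features=2000)
x = [0.1, 0.2, 0.3]
y = [0.3, 0.1, 0.0]

approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y)
print(exact, approx)
```

The approximation error shrinks roughly as one over the square root of the number of random features, which is why a device that computes very many random projections cheaply is attractive.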
  30. 30. General Computer Language. Computer science + Software engineering • Important! A separate layer in the intellect-stack! • The two-language problem • experiment and production, as in deep learning frameworks (speed) • «Wrappers» in libraries (thresholds in understanding the full stack down to the hardware at the bottom) • My preference: Julia (http://julialang.org/) • Scientific computing is a design goal of Julia; MATLAB-like syntax • Two-language problem solved (speed of computation as in C, speed of writing code as in Python) • Extensive mathematical function library; Base library and external packages in native Julia • Parallel computing supported (GPU supported too) • Not object-oriented; uses multiple dispatch as a solution to the expression problem (good modularity) • Version 0.4.3 now (1.0 expected in about a year) • Caution: slightly more complex than Python, should not be your first computer language… • The MXNet deep learning framework has a Julia wrapper • A DSL for deep learning is not a General Computer Language • Probabilistic programming languages -- http://probabilistic-programming.org/wiki/Home • DNN description languages, as in CNTK -- https://github.com/Microsoft/CNTK
  31. 31. Computation libraries/frameworks/platforms. Not machine learning engineering! • Computation libraries → Drivers+Hardware (GPU, clusters) • Linear algebra, optimization, autodiff, symbolic computations, etc. • Can be a standalone platform, and thus differ from machine learning libraries (general algorithms for multiple purposes: bioinformatics, physics, astronomy, engineering, machine learning, etc.) • Deep learning frameworks often include such a library (Torch, Theano, …). • Scikit (NumPy, SciPy, and matplotlib) • Nd4j (n-dimensional arrays for Java) • Julia packages • … • Non-open-source: Mathematica, Maple… Machine learning is “yet another domain with modules and a DSL” for them!
  32. 32. Learning algorithm frameworks (not systems)! Machine learning engineering! • A gentleman's set of algorithms (CNN, RNN, …) • Updated at the pace of arxiv.org papers! • Network description language – a DSL for machine learning engineering • Experiments and production (scalable!) • Extensibility (based on a general computer language and a scientific computing library, i.e. on another platform layer of the intellect-stack) • Presented as The Machine Learning Platform (including all lower levels, assembled and tuned) • There are hundreds of them: no fewer than the «web frameworks» of the early web http://www.slideshare.net/yutakashino/ss-56291783 • Google • Facebook • Microsoft • Baidu • IBM • Samsung • … + standard datasets for comparisons and benchmarking + platforms of the other tribes
  33. 33. Construction (types of modules) in machine learning • Deep learning classics (DSL in deep learning frameworks) • Probabilistic languages http://probabilistic-programming.org/, https://probmods.org/ • Deep learning and Bayesian convergence -- http://www.nextplatform.com/2015/09/24/deep-learning-and-a-new-bayesian-golden-age/, http://blog.shakirm.com/2015/10/bayesian-reasoning-and-deep-learning/, http://arxiv.org/abs/1512.05287 • Differentiable languages and datatypes http://colah.github.io/posts/2015-09-NN-Types-FP/, http://www.blackboxworkshop.org/pdf/nips2015blackbox_zenna.pdf, http://arxiv.org/abs/1506.02516 • … • Blends and hybrids of many other learning architectures. Varieties of representation: in deep learning, abstraction is architecturally layered; in other approaches it is different!
  34. 34. Algorithm platform + Hardware platform = Algorithm platform (the hardware is invisible to a platform user, but it still matters!) http://blogs.microsoft.com/next/2016/01/25/microsoft-releases-cntk-its-open-source-deep-learning-toolkit-on-github/
  35. 35. Cognitive systems/architectures. Learning, communications, reasoning, planning • Cognitive = knowledge processing. Knowledge is information that is useful in a variety of situations. • A cognitive architecture/system is a platform for multiple application systems. • Ensembles of learning algorithms: this is close to cognitive systems engineering • Cognitive systems engineering is machine learning systems engineering plus something else • Something else: e.g. knowledge engineering, the manual coding (formalization) of knowledge. • Machine learning systems engineering is not cognitive systems engineering; it is smaller!
  36. 36. Machine Learning and Cognitive Level • «deep learning research is likely to continue its expansion from traditional pattern recognition jobs to full-scale AI tasks involving symbolic manipulation, memory, planning and reasoning. This will be important for reaching to full understanding of natural language and dialogue with humans (i.e., pass the Turing test). Similarly, we are seeing deep learning expanding into the territories of reinforcement learning, control and robotics and that is just the beginning» -- Yoshua Bengio https://www.quora.com/Where-is-deep-learning-research-headed If we can learn to reason, plan, model, act, then machine learning engineering will be cognitive systems engineering! Machine intelligence vs. artificial intelligence
  37. 37. Example: MANIC. A Minimal Architecture for General Cognition (http://arxiv.org/abs/1508.00019) • Keywords: action, planning, observation, decisions, knowledge, … • Are these keywords for learning systems engineering?
  38. 38. Application level of the intellect-stack • The killer application for learning systems is here! • Domain specificity and data are here! • End users and money are here! • Systems engineering is here! This chart is only about the enterprise AI systems market. https://www.tractica.com/newsroom/press-releases/artificial-intelligence-for-enterprise-applications-to-reach-11-1-billion-in-market-value-by-2024/ If you have no application of interest, there will be no data, no money, no development, no engineering. Most machine learning engineering is applied. Only a small part is machine learning platform development.
  39. 39. Application level: systems engineering • Strategizing and conceptual design • Requirements engineering • System architecture • V&V • Configuration management • A machine learning engineer is one of the many engineers participating in a cyber-physical system project team. [Diagram labels: Sensors, Consoles, Actuators, Monitors] http://www.nist.gov/el/nist-releases-draft-framework-cyber-physical-systems-developers.cfm
  40. 40. Life cycle stages dictionary (Machine learning = Systems engineering) • Conception and requirements = Conception and requirements • Architecture and Design = Architecture and Design • Training = Manufacturing • Transfer learning, ensembling = Integration • Validation and verification = Validation and verification • Inference = Operation
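
The life cycle dictionary from this slide can be written down literally as a lookup table. The stage names come from the slide; the variable name `ML_TO_SE_STAGE` and the helper `systems_engineering_stage` are hypothetical, introduced here only for illustration.

```python
# Machine learning stage -> corresponding systems engineering stage,
# exactly as paired on the slide.
ML_TO_SE_STAGE = {
    "Conception and requirements": "Conception and requirements",
    "Architecture and Design": "Architecture and Design",
    "Training": "Manufacturing",
    "Transfer learning, ensembling": "Integration",
    "Validation and verification": "Validation and verification",
    "Inference": "Operation",
}


def systems_engineering_stage(ml_stage):
    """Translate a machine learning life cycle stage into systems engineering terms."""
    return ML_TO_SE_STAGE[ml_stage]


for ml_stage, se_stage in ML_TO_SE_STAGE.items():
    print(f"{ml_stage} -> {se_stage}")
```

Read this way, training a model plays the role that manufacturing plays for a physical system, and inference corresponds to operation.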
  41. 41. Stakeholder concerns. Domain-specific concerns: • Expressivity • Computational efficiency • Trainability • Good generalization (no overfitting) Traditional concerns: • Composability – layering, ensembling • Compositionality – transfer learning • Resilience
  42. 42. Intellect-stack and machine learning (systems) engineering • Machine learning (systems) engineering now covers only a small part of the whole intellect-stack but interacts with all levels. • No single Googbookdu can develop all levels of the intellect-stack platforms (from hardware accelerators at the bottom up to applications at the top) by itself. Maybe except IBM, which can span from TrueNorth to IBM Watson applications ;-) • Interfaces of the supporting platforms will be stabilizing and… constantly updated (as with APIs in software engineering: everything changes once every 5 years). • Technology disruption starts at the low (enabling) levels of a stack, while demand comes from the upper levels, so nobody in the middle can ignore developments in the other platform layers.
  43. 43. Intellect-Stack • Application (domain) Platform • Cognitive Architecture Platform • Learning Algorithm Platform • Computational library • General Computer Language • CPU • GPU/FPGA/Physical computation Drivers • GPU/FPGA/Physical computation Accelerator • Neurocompiler • Neuromorphic driver • Neuromorphic chip [Arrows: Disruption enablers; Disruption demand] Where are you now? Where will you be tomorrow?
  44. 44. Thank you! Anatoly Levenchuk, TechInvestLab, president; INCOSE Russian chapter, research director https://ru.linkedin.com/in/ailev ailev@asmp.msk.su Blog in Russian: http://ailev.ru
