TensorFlow Extended (TFX) is a platform for deploying and managing machine learning models in production. It represents a machine learning pipeline as a sequence of components that ingest data, validate data quality, transform features, train and evaluate models, and deploy models to a serving system. TFX uses TensorFlow and is open-sourced by Google. It provides tools to track metadata and metrics throughout the pipeline and helps keep models organized as they are updated and deployed over time in a production environment.
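The component sequence described above can be sketched in miniature. This is not the real TFX API, just a standard-library Python toy: each stage consumes the artifact dictionary produced by the previous one, and the stage names mirror the steps listed in the paragraph.

```python
# Toy "pipeline as a sequence of components", in the spirit of TFX.
# The dict-based artifact store and all names here are illustrative only.

def ingest(_):
    return {"examples": [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.1}]}

def validate(art):
    # Data-quality gate: every example must have both fields.
    assert all("x" in e and "y" in e for e in art["examples"])
    return art

def transform(art):
    # Feature engineering: rescale the input feature.
    art["examples"] = [{"x": e["x"] / 2.0, "y": e["y"]} for e in art["examples"]]
    return art

def train(art):
    # "Train" a slope-only model y = w*x by least squares.
    num = sum(e["x"] * e["y"] for e in art["examples"])
    den = sum(e["x"] ** 2 for e in art["examples"])
    art["model"] = {"w": num / den}
    return art

def evaluate(art):
    w = art["model"]["w"]
    art["mse"] = sum((w * e["x"] - e["y"]) ** 2
                     for e in art["examples"]) / len(art["examples"])
    return art

def push(art):
    # Deploy only if the quality gate passes.
    art["deployed"] = art["mse"] < 1.0
    return art

pipeline = [ingest, validate, transform, train, evaluate, push]
artifacts = {}
for component in pipeline:
    artifacts = component(artifacts)
print(artifacts["deployed"])
```

In real TFX each stage is a component (for example ExampleGen, ExampleValidator, Transform, Trainer, Evaluator, Pusher) whose artifacts and metadata are recorded in a metadata store rather than a plain dict.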
Building a Cookie Run AI That Plays Better Than Me with Deep Learning and Reinforcement Learning, DEVIEW 2016 - Taehoon Kim
Talk video: https://goo.gl/jrKrvf
Demo video: https://youtu.be/exXD6wJLJ6s
The talk introduces techniques such as Deep Q-Network, Double Q-learning, and Dueling Network, and shares the engineering process of pushing performance with hyperparameter tuning, debugging, and ensembles.
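The Deep Q-Network covered in the talk builds on plain tabular Q-learning, which fits in a few lines. The sketch below is not code from the talk: it runs Q-learning on an invented four-state corridor, using the same update target (reward plus discounted max future value) that DQN approximates with a neural network. All hyperparameters are arbitrary.

```python
import random

# Tabular Q-learning on a tiny corridor: states 0..3, reaching state 3
# yields reward 1 and ends the episode. Actions: 0 = left, 1 = right.

def step(state, action):
    nxt = min(state + 1, 3) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == 3 else 0.0
    return nxt, reward, nxt == 3

random.seed(0)
alpha, gamma, eps = 0.5, 0.9, 0.3   # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(4)]  # Q[state][action]

for _ in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Q-learning target: r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy should always move right toward the goal.
print([max((0, 1), key=lambda a: Q[s][a]) for s in range(3)])
```

DQN replaces the table `Q` with a network, and Double Q-learning and Dueling Networks refine how that target is estimated.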
Keras is a high-level framework that runs on top of an AI library such as TensorFlow, Theano, or CNTK. Its key feature is that it lets you switch out the underlying library without any code changes. Keras contains commonly used neural-network building blocks such as layers, optimizers, and activation functions, and it supports convolutional and recurrent neural networks. In addition, Keras bundles datasets and some pre-trained deep learning applications that make it easier for beginners to learn. Essentially, Keras democratizes deep learning by lowering the barrier to entry.
Top 5 Python Libraries For Data Science | Python Libraries Explained | Python... - Simplilearn
Python is one of the most widely used programming languages today. When it comes to solving data science tasks and challenges, Python never ceases to surprise its audience. Most data scientists already leverage the power of Python programming every day. Python is an easy-to-learn, easy-to-debug, widely used, object-oriented, open-source, high-performance language, and there are many more benefits to using it. Python comes with extraordinary libraries that programmers use every day to solve problems. So, let us talk about the top 5 Python libraries for data science.
Below are the top 5 Python libraries for data science:
1. TensorFlow ( 00:29 )
2. NumPy ( 03:01 )
3. SciPy ( 06:38 )
4. Pandas ( 08:20 )
5. Matplotlib ( 11:41 )
This Data Science with Python course will establish your mastery of data science and analytics techniques using Python. With this Python for Data Science Course, you'll learn the essential concepts of Python programming and become an expert in data analytics, machine learning, data visualization, web scraping and natural language processing. Python is a required skill for many data science positions, so jump start your career with this interactive, hands-on course.
Why learn Data Science?
Data Scientists are being deployed in all kinds of industries, creating a huge demand for skilled professionals. A Data Scientist is the pinnacle rank in an analytics organization. Glassdoor has ranked data scientist first in the 25 Best Jobs for 2016, and good data scientists are scarce and in great demand. As a Data Scientist, you will be required to understand the business problem, design the analysis, collect and format the required data, apply algorithms or techniques using the correct tools, and finally make recommendations backed by data.
1. Gain an in-depth understanding of data science processes, data wrangling, data exploration, data visualization, hypothesis building, and testing. You will also learn the basics of statistics.
2. Install the required Python environment and other auxiliary tools and libraries
3. Understand the essential concepts of Python programming such as data types, tuples, lists, dicts, basic operators and functions
4. Perform high-level mathematical computing using the NumPy package and its large library of mathematical functions
5. Perform scientific and technical computing using the SciPy package and its sub-packages such as Integrate, Optimize, Statistics, IO and Weave
6. Perform data analysis and manipulation using data structures and tools provided in the Pandas package
7. Gain expertise in machine learning using the Scikit-Learn package
The Data Science with Python course is recommended for:
1. Analytics professionals who want to work with Python
2. Software professionals looking to get into the field of analytics
3. IT professionals interested in pursuing a career in analytics
Learn more at: https://www.simplilearn.com/
An analysis of TensorFlow
- What is TensorFlow?
- Background
- DistBelief
- Tutorial - Logistic regression
- TensorFlow - Internals
- Tutorial - CNN, RNN
- Benchmarks
- Other open-source frameworks
- If you are considering TensorFlow
- Installation
- References
This presentation discusses matters of AI and machine learning. This presentation was given during the ITU-T workshop on Machine Learning for 5G and beyond, held at ITU HQ in Geneva, Switzerland on 29 Jan 18. More information on the workshop can be found here: https://www.itu.int/en/ITU-T/Workshops-and-Seminars/20180129/Pages/default.aspx
Join our upcoming forums and workshops here: https://www.itu.int/en/ITU-T/Workshops-and-Seminars/Pages/default.aspx
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin... - Edureka!
( ** Deep Learning Training: https://www.edureka.co/ai-deep-learning-with-tensorflow ** )
This Edureka PyTorch Tutorial (Blog: https://goo.gl/4zxMfU) will help you understand various important basics of PyTorch. It also includes a use case in which we create an image classifier with PyTorch and evaluate its accuracy on an image dataset.
Below are the topics covered in this tutorial:
1. What is Deep Learning?
2. What are Neural Networks?
3. Libraries available in Python
4. What is PyTorch?
5. Use-Case of PyTorch
6. Summary
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Langchain Framework is an innovative approach to linguistic data processing, combining the principles of language sciences, blockchain technology, and artificial intelligence. This deck introduces the groundbreaking elements of the framework, detailing how it enhances security, transparency, and decentralization in language data management. It discusses its applications in various fields, including machine learning, translation services, content creation, and more. The deck also highlights its key features, such as immutability, peer-to-peer networks, and linguistic asset ownership, that could revolutionize how we handle linguistic data in the digital age.
Creating ML models is just the start of a long journey. In this presentation, given as a talk at e2e AI talks, I discuss the various challenges in the machine learning life cycle.
Artificial Intelligence PowerPoint Presentation Slide Template Complete Deck is a comprehensive virtual solution for technology experts. With the help of this PowerPoint theme, you can elucidate the differences between machine intelligence, machine learning, and deep learning. Employ our PPT presentation to cover merits, demerits, learning techniques, and types of supervised machine learning. You can also elucidate the benefits, limitations, and types of unsupervised machine learning. Similarly, cover important aspects related to reinforcement learning. Our AI PowerPoint slideshow also helps you in elaborating on backpropagation in neural networks. Walk your audience through the expert system in artificial intelligence. Cover examples, features, components, applications, benefits, limitations, and other aspects of the expert system. Consolidate the deep learning process, recurrent neural networks, and convolutional neural networks through this PPT template deck. Give a crisp introduction to artificial intelligence. Introduce types, algorithms, trends, and use cases of artificial intelligence. Hit the download icon and begin instant personalization. Our Artificial Intelligence PowerPoint Presentation Slide Template Complete Deck is explicit and effective. It combines clarity and concise expression. https://bit.ly/3nfgjaT
Policies around data, and the evolution of companies and technologies, are changing rapidly, and it all points toward companies leveraging diverse data to build competitiveness and drive AI-based innovation.
In this process, many domain experts and data scientists go through the cycle of validating AI models that can support innovation across a range of companies.
However, for these many AI models to actually be applied to the business, technology at the infrastructure and service level is essential.
MLOps is the set of technologies and trends that lets companies apply innovative ideas (AI models) to the business environment in a timely manner.
Main topics:
- Changes in the environment surrounding data
- The reality companies face when applying AI models
- Problems MLOps can solve
- Key MLOps technologies by area
- How does a company's AI environment change once MLOps is adopted?
- What does it mean to apply (deploy) an AI model to a business environment?
Compiled from the shareable portions of a talk given at Korea Data Business Trends in December 2021 (hosted by the Korea Data Agency).
Talk video: https://www.youtube.com/watch?v=lL-QtEzJ3WY
Introduction to Deep Learning, Keras, and TensorFlow - Sri Ambati
This meetup was recorded in San Francisco on Jan 9, 2019.
Video recording of the session can be viewed here: https://youtu.be/yG1UJEzpJ64
Description:
This fast-paced session starts with a simple yet complete neural network (no frameworks), followed by an overview of activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Next, we'll create a neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. For best results, familiarity with basic vectors and matrices, inner (aka "dot") products of vectors, and rudimentary Python is definitely helpful. If time permits, we'll look at the UAT, CLT, and the Fixed Point Theorem. (Bonus points if you know Zorn's Lemma, the Well-Ordering Theorem, and the Axiom of Choice.)
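The "simple yet complete neural network (no frameworks)" the session opens with can look roughly like this sketch: one hidden layer, sigmoid activations, a squared-error cost, and hand-written backpropagation on XOR. The layer size, seed, and learning rate are arbitrary choices, not the presenter's.

```python
import math
import random

random.seed(1)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

H = 4  # hidden units
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
lr = 1.0

def forward(x1, x2):
    h = [sig(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(H)]
    y = sig(sum(W2[j] * h[j] for j in range(H)) + b2)
    return h, y

def loss():
    # Squared-error cost summed over the four XOR patterns.
    return sum((forward(x1, x2)[1] - t) ** 2 / 2 for (x1, x2), t in data)

loss_before = loss()
for _ in range(5000):
    for (x1, x2), t in data:
        h, y = forward(x1, x2)
        dy = (y - t) * y * (1 - y)               # error signal at the output
        for j in range(H):
            dh = dy * W2[j] * h[j] * (1 - h[j])  # backprop through hidden unit j
            W2[j] -= lr * dy * h[j]
            W1[j][0] -= lr * dh * x1
            W1[j][1] -= lr * dh * x2
            b1[j] -= lr * dh
        b2 -= lr * dy
loss_after = loss()
print(loss_before, "->", loss_after)
```

Frameworks like Keras generate exactly this forward/backward machinery from a declarative model description.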
Oswald's Bio:
Oswald Campesato is an education junkie: a former Ph.D. Candidate in Mathematics (ABD), with multiple Master's and 2 Bachelor's degrees. In a previous career, he worked in South America, Italy, and the French Riviera, which enabled him to travel to 70 countries throughout the world.
He has worked in American and Japanese corporations and start-ups, in roles ranging from C/C++ and Java developer to CTO. He works in the web and mobile space, conducts training sessions in Android, Java, Angular 2, and ReactJS, and writes graphics code for fun. He's comfortable in four languages and aspires to become proficient in Japanese, ideally sometime in the next two decades. He enjoys collaborating with people who share his passion for learning the latest cool stuff, and he's currently working on his 15th book, which is about Angular 2.
Natural Language Processing (NLP) is often taught at the academic level from the perspective of computational linguists. However, as data scientists, we have a richer view of the world of natural language - unstructured data that by its very nature has important latent information for humans. NLP practitioners have benefitted from machine learning techniques to unlock meaning from large corpora, and in this class we’ll explore how to do that particularly with Python, the Natural Language Toolkit (NLTK), and to a lesser extent, the Gensim Library.
NLTK is an excellent library for machine learning-based NLP, written in Python by experts from both academia and industry. Python allows you to create rich data applications rapidly, iterating on hypotheses. Gensim provides vector-based topic modeling, which is currently absent in both NLTK and Scikit-Learn. The combination of Python + NLTK means that you can easily add language-aware data products to your larger analytical workflows and applications.
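To make "unlock meaning from large corpora" concrete, here is a standard-library-only sketch of the kind of steps NLTK streamlines: tokenization, term-frequency counts, and a bag-of-words cosine similarity between two invented documents. NLTK's own tokenizers and corpus tools are far more capable; this only shows the shape of the computation.

```python
import re
from collections import Counter

def tokenize(text):
    # Crude regex tokenizer; NLTK offers much better ones.
    return re.findall(r"[a-z']+", text.lower())

doc_a = "Natural language processing unlocks meaning from large corpora."
doc_b = "Machine learning helps unlock meaning from language corpora."

fa, fb = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))

def cosine(u, v):
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(u[w] * v[w] for w in u)
    norm = lambda c: sum(n * n for n in c.values()) ** 0.5
    return dot / (norm(u) * norm(v))

print(fa.most_common(3))
print(cosine(fa, fb))
```

Vector-space ideas like this are also the starting point for Gensim's topic modeling, which works over much richer document representations.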
Recurrent Neural Network: Multi-Class & Multi-Label Text Classification - Amit Agarwal
This is the content for the 4th International Conference on Data Management, Analytics and Innovation - ICDMAI 2020.
Here we discuss LSTMs and text classification for multi-class and multi-label problem statements.
Code : https://github.com/amitbcp/icdmai_2020
This PDF version of the slides explains the various optimization algorithms used in deep learning and compares them. It also briefly covers the ICML papers "Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers" and "Optimizer Benchmarking Needs to Account for Hyperparameter Tuning."
If you have any queries, you can reach out to me at @RakshithSathish on Twitter or rakshith-sathish on LinkedIn.
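As a taste of what such optimizer comparisons measure, here is a sketch of two of the update rules they benchmark, plain SGD and SGD with momentum, minimizing the toy objective f(x) = x² from the same starting point. The learning rate, momentum coefficient, and step count are invented for the example.

```python
def grad(x):
    # f(x) = x^2, so f'(x) = 2x
    return 2.0 * x

def sgd(x, lr=0.1, steps=100):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def sgd_momentum(x, lr=0.1, beta=0.9, steps=100):
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(x)   # velocity: running accumulation of gradients
        x -= lr * v
    return x

# Both should end close to the minimum at x = 0.
print(abs(sgd(5.0)), abs(sgd_momentum(5.0)))
```

On this well-conditioned 1-D problem the two behave differently (momentum overshoots and oscillates); the benchmarked papers study how such differences play out, with tuned hyperparameters, on real deep-learning objectives.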
Natural language processing techniques transition from machine learning to de... - Divya Gera
Natural Language processing, its need, business applications, NLP with machine learning, Text data preprocessing for machine learning, NLP with Deep Learning.
The release of TensorFlow 2.0 comes with a significant number of improvements over the 1.x version, all focused on ease of use and a better user experience. We will give an overview of what TensorFlow 2.0 is and discuss how to get started building models from scratch using TensorFlow 2.0's high-level API, Keras. We will walk through a step-by-step example in Python of how to build an image classifier. We will then showcase how to leverage transfer learning to make building a model even easier! With transfer learning, we can build on models pretrained on datasets such as ImageNet to drastically reduce the training time of our model. TensorFlow 2.0 makes this incredibly simple to do.
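The transfer-learning workflow described above (freeze a pretrained base, train only a new head) can be shown in miniature without TensorFlow. In this stdlib sketch the "frozen" feature extractor is a hand-made stand-in for a pretrained base model, and only the new linear head is fitted; the task and all numbers are invented.

```python
def frozen_features(x):
    # Stand-in for a pretrained base model: a fixed, non-trainable
    # nonlinear feature map. No gradients flow into it.
    return [x, x * x]

# Target task: y = 3*x^2 + 1, learnable as a linear map over the features.
data = [(x / 10.0, 3 * (x / 10.0) ** 2 + 1) for x in range(-10, 11)]

w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(2000):
    for x, t in data:
        f = frozen_features(x)          # base stays frozen
        y = w[0] * f[0] + w[1] * f[1] + b
        err = y - t
        # Gradient step on the head only.
        w[0] -= lr * err * f[0]
        w[1] -= lr * err * f[1]
        b -= lr * err

print(w, b)  # head recovers roughly [0, 3] and 1
```

In Keras the same pattern is `base.trainable = False` followed by adding and training a small classification head.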
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. Learn the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service, and see the DynamoDB console first-hand. See a walk-through demo of building a serverless web application using this high-performance key-value and JSON document store.
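The key-value and JSON-document model the session demonstrates can be sketched in a few lines. This toy in-memory table is illustrative only; it is not the boto3/DynamoDB API, and the table and key names are invented.

```python
import json

class KeyValueTable:
    """Toy key-value table: items are JSON documents addressed by a key."""

    def __init__(self):
        self._items = {}

    def put_item(self, key, item):
        # Store the item as a serialized JSON document.
        self._items[key] = json.dumps(item)

    def get_item(self, key):
        raw = self._items.get(key)
        return json.loads(raw) if raw is not None else None

users = KeyValueTable()
users.put_item("user#42", {"name": "Ada", "plan": "pro"})
print(users.get_item("user#42")["name"])
```

The contrast with SQL is the access pattern: you fetch whole documents by key rather than joining normalized tables, which is what lets a managed service like DynamoDB scale horizontally.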
TensorFlow Extended (TFX) and Apache Beam - markgrover
A talk on TFX and Beam by Robert Crowe, a developer advocate at Google focused on TensorFlow.
Learn how the TensorFlow Extended (TFX) project is utilizing Apache Beam to simplify pre- and post-processing for ML pipelines. TFX provides a framework for managing all of the necessary pieces of a real-world machine learning project beyond simply training and utilizing models. Robert will provide an overview of TFX and talk in a little more detail about the pieces of the framework (tf.Transform and tf.ModelAnalysis) that are powered by Apache Beam.
Title
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTorch + XGBoost + Airflow + MLflow + Spark + Jupyter + TPU
Video
https://youtu.be/vaB4IM6ySD0
Description
In this workshop, we build real-world machine learning pipelines using TensorFlow Extended (TFX), KubeFlow, and Airflow.
Described in the 2017 paper, TFX is used internally by thousands of Google data scientists and engineers across every major product line within Google.
KubeFlow is a modern, end-to-end pipeline orchestration framework that embraces the latest AI best practices including hyper-parameter tuning, distributed model training, and model tracking.
Airflow is the most widely used pipeline orchestration framework in machine learning.
Pre-requisites
Modern browser - and that's it!
Every attendee will receive a cloud instance
Nothing will be installed on your local laptop
Everything can be downloaded at the end of the workshop
Location
Online Workshop
Agenda
1. Create a Kubernetes cluster
2. Install KubeFlow, Airflow, TFX, and Jupyter
3. Setup ML Training Pipelines with KubeFlow and Airflow
4. Transform Data with TFX Transform
5. Validate Training Data with TFX Data Validation
6. Train Models with Jupyter, Keras/TensorFlow 2.0, PyTorch, XGBoost, and KubeFlow
7. Run a Notebook Directly on Kubernetes Cluster with KubeFlow
8. Analyze Models using TFX Model Analysis and Jupyter
9. Perform Hyper-Parameter Tuning with KubeFlow
10. Select the Best Model using KubeFlow Experiment Tracking
11. Reproduce Model Training with TFX Metadata Store and Pachyderm
12. Deploy the Model to Production with TensorFlow Serving and Istio
13. Save and Download your Workspace
Key Takeaways
Attendees will gain experience training, analyzing, and serving real-world Keras/TensorFlow 2.0 models in production using model frameworks and open-source tools.
Related Links
1. PipelineAI Home: https://pipeline.ai
2. PipelineAI Community Edition: http://community.pipeline.ai
3. PipelineAI GitHub: https://github.com/PipelineAI/pipeline
4. Advanced Spark and TensorFlow Meetup (SF-based, Global Reach): https://www.meetup.com/Advanced-Spark-and-TensorFlow-Meetup
5. YouTube Videos: https://youtube.pipeline.ai
6. SlideShare Presentations: https://slideshare.pipeline.ai
7. Slack Support: https://joinslack.pipeline.ai
8. Web Support and Knowledge Base: https://support.pipeline.ai
9. Email Support: support@pipeline.ai
This presentation discusses matters of AI and machine learning. This presentation was given during the ITU-T workshop on Machine Learning for 5G and beyond, held at ITU HQ in Geneva, Switzerland on 29 Jan 18. More information on the workshop can be found here: https://www.itu.int/en/ITU-T/Workshops-and-Seminars/20180129/Pages/default.aspx
Join our upcoming forums and workshops here: https://www.itu.int/en/ITU-T/Workshops-and-Seminars/Pages/default.aspx
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...Edureka!
( ** Deep Learning Training: https://www.edureka.co/ai-deep-learning-with-tensorflow ** )
This Edureka PyTorch Tutorial (Blog: https://goo.gl/4zxMfU) will help you in understanding various important basics of PyTorch. It also includes a use-case in which we will create an image classifier that will predict the accuracy of an image data-set using PyTorch.
Below are the topics covered in this tutorial:
1. What is Deep Learning?
2. What are Neural Networks?
3. Libraries available in Python
4. What is PyTorch?
5. Use-Case of PyTorch
6. Summary
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Langchain Framework is an innovative approach to linguistic data processing, combining the principles of language sciences, blockchain technology, and artificial intelligence. This deck introduces the groundbreaking elements of the framework, detailing how it enhances security, transparency, and decentralization in language data management. It discusses its applications in various fields, including machine learning, translation services, content creation, and more. The deck also highlights its key features, such as immutability, peer-to-peer networks, and linguistic asset ownership, that could revolutionize how we handle linguistic data in the digital age.
Creating ML models is just the starting of a long journey. In this presentation which was given as a talk on e2e AI talks, I talk about the various challenges in the machine learning life cycle
Artificial Intelligence PowerPoint Presentation Slide Template Complete Deck is a comprehensive virtual solution for technology experts. With the help of this PowerPoint theme, you can elucidate the differences between machine intelligence, machine learning, and deep learning. Employ our PPT presentation to cover merits, demerits, learning techniques, and types of supervised machine learning. You can also elucidate the benefits, limitations, and types of unsupervised machine learning. Similarly, cover important aspects related to reinforcement learning. Our AI PowerPoint slideshow also helps you in elaborating back propagation of neural networks. Walk your audience through the expert system in artificial intelligence. Cover examples, features, components, application, benefits, limitations, and other aspects of the expert system. Consolidate the deep learning process, recurrent neural networks, and convolutional neural networks through this PPT template deck. Give a crisp introduction to artificial intelligence. Introduce types, algorithms, trends, and use cases of artificial intelligence. Hit the download icon and begin instant personalization. Our Artificial Intelligence PowerPoint Presentation Slide Template Complete Deck are explicit and effective. They combine clarity and concise expression. https://bit.ly/3nfgjaT
데이터를 둘러싼 정책과, 기업과 기술의 진화는 빠르게 변화하고 있으며, 모든 지향점은 기업들이 다양한 데이터를 활용하여 경쟁력을 확보하고 이를 통해 AI기반의 혁신을 하고자 하는데 있다.
이 과정에서 수 많은 기업의 업무 전무가, 데이터 사이언티스트 등이 다양한 기업의 혁신을 지원할 수 있는 AI 모델을 검증하는 과정을 거치게 됩니다.
하지만, 이렇게 수 많은 AI 모델이 실제 비즈니스에 적용되기 위해서는 인프라, 및 서비스 관점의 기술이 반드시 필요하게 됩니다.
MLOps는 기업에 필요한 혁신적인 아이디어(AI Model)을 적시에 비즈니스 환경에 적용할 수 있도록 지원하는 기술 및 트렌드 입니다.
주요 내용은
- 데이터를 둘러싼 환경의 변화
- 기업의 AI Model 적용시 마주하는 현실
- MLOps가 해결 가능한 문제들
- MLOps의 영역별 주요 기술들
- MLOps 도입 시 기업의 AI 환경은 어떻게 변할까?
- AI 모델을 비즈니스 환경에 적용(배포)한다는 것은?
2021년 12월 코리아 데이터 비즈니스 트렌드(데이터산업진흥원 주최)에서 발표한 내용을 공유 가능한 부분만 정리함.
발표 영상 참고 : https://www.youtube.com/watch?v=lL-QtEzJ3WY
Introduction to Deep Learning, Keras, and TensorFlowSri Ambati
This meetup was recorded in San Francisco on Jan 9, 2019.
Video recording of the session can be viewed here: https://youtu.be/yG1UJEzpJ64
Description:
This fast-paced session starts with a simple yet complete neural network (no frameworks), followed by an overview of activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Next, we'll create a neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. For best results, familiarity with basic vectors and matrices, inner (aka "dot") products of vectors, and rudimentary Python is definitely helpful. If time permits, we'll look at the UAT, CLT, and the Fixed Point Theorem. (Bonus points if you know Zorn's Lemma, the Well-Ordering Theorem, and the Axiom of Choice.)
Oswald's Bio:
Oswald Campesato is an education junkie: a former Ph.D. Candidate in Mathematics (ABD), with multiple Master's and 2 Bachelor's degrees. In a previous career, he worked in South America, Italy, and the French Riviera, which enabled him to travel to 70 countries throughout the world.
He has worked in American and Japanese corporations and start-ups, as C/C++ and Java developer to CTO. He works in the web and mobile space, conducts training sessions in Android, Java, Angular 2, and ReactJS, and he writes graphics code for fun. He's comfortable in four languages and aspires to become proficient in Japanese, ideally sometime in the next two decades. He enjoys collaborating with people who share his passion for learning the latest cool stuff, and he's currently working on his 15th book, which is about Angular 2.
Natural Language Processing (NLP) is often taught at the academic level from the perspective of computational linguists. However, as data scientists, we have a richer view of the world of natural language - unstructured data that by its very nature has important latent information for humans. NLP practitioners have benefitted from machine learning techniques to unlock meaning from large corpora, and in this class we’ll explore how to do that particularly with Python, the Natural Language Toolkit (NLTK), and to a lesser extent, the Gensim Library.
NLTK is an excellent library for machine learning-based NLP, written in Python by experts from both academia and industry. Python allows you to create rich data applications rapidly, iterating on hypotheses. Gensim provides vector-based topic modeling, which is currently absent in both NLTK and Scikit-Learn. The combination of Python + NLTK means that you can easily add language-aware data products to your larger analytical workflows and applications.
Recurrent Neural Network : Multi-Class & Multi Label Text ClassificationAmit Agarwal
This is the content for the 4th International Conference on Data Management, Analytics and Innovation - ICDMAI 2020.
Here we talk & discuss LSTM and Text Classification for Multi-Class & Multi-Label problem statement.
Code : https://github.com/amitbcp/icdmai_2020
PDF version of slides explains the various optimization algorithms used in deep learning and a comparison between them. It also has a brief about the ICML papers "Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers" and "Optimizer Benchmarking Needs to Account for Hyperparameter Tuning."
If you have any queries, you can reach out to me at @RakshithSathish on Twitter or rakshith-sathish on LinkedIn.
Natural language processing techniques transition from machine learning to de...Divya Gera
Natural Language processing, its need, business applications, NLP with machine learning, Text data preprocessing for machine learning, NLP with Deep Learning.
The release of TensorFlow 2.0 comes with a significant number of improvements over its 1.x version, all with a focus on ease of usability and a better user experience. We will give an overview of what TensorFlow 2.0 is and discuss how to get started building models from scratch using TensorFlow 2.0’s high-level api, Keras. We will walk through an example step-by-step in Python of how to build an image classifier. We will then showcase how to leverage a transfer learning to make building a model even easier! With transfer learning, we can leverage other pretrained models such as ImageNet to drastically speed up the training time of our model. TensorFlow 2.0 makes this incredibly simple to do.
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. Learn the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service, and see the DynamoDB console first-hand. See a walk-through demo of building a serverless web application using this high-performance key-value and JSON document store.
TensorFlow Extended (TFX) and Apache Beammarkgrover
Talk on TFX and Beam by Robert Crowe, developer advocate at Google, focussed on TensorFlow.
Learn how the TensorFlow Extended (TFX) project is utilizing Apache Beam to simplify pre- and post-processing for ML pipelines. TFX provides a framework for managing all of the necessary pieces of a real-world machine learning project beyond simply training and utilizing models. Robert will provide an overview of TFX, and talk in a little more detail about the pieces of the framework (tf.Transform and tf.ModelAnalysis) which are powered by Apache Beam.
Title
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTorch + XGBoost + Airflow + MLflow + Spark + Jupyter + TPU
Video
https://youtu.be/vaB4IM6ySD0
Description
In this workshop, we build real-world machine learning pipelines using TensorFlow Extended (TFX), KubeFlow, and Airflow.
Described in the 2017 paper, TFX is used internally by thousands of Google data scientists and engineers across every major product line within Google.
KubeFlow is a modern, end-to-end pipeline orchestration framework that embraces the latest AI best practices including hyper-parameter tuning, distributed model training, and model tracking.
Airflow is the most widely used pipeline orchestration framework in machine learning.
Pre-requisites
Modern browser - and that's it!
Every attendee will receive a cloud instance
Nothing will be installed on your local laptop
Everything can be downloaded at the end of the workshop
Location
Online Workshop
Agenda
1. Create a Kubernetes cluster
2. Install KubeFlow, Airflow, TFX, and Jupyter
3. Setup ML Training Pipelines with KubeFlow and Airflow
4. Transform Data with TFX Transform
5. Validate Training Data with TFX Data Validation
6. Train Models with Jupyter, Keras/TensorFlow 2.0, PyTorch, XGBoost, and KubeFlow
7. Run a Notebook Directly on Kubernetes Cluster with KubeFlow
8. Analyze Models using TFX Model Analysis and Jupyter
9. Perform Hyper-Parameter Tuning with KubeFlow
10. Select the Best Model using KubeFlow Experiment Tracking
11. Reproduce Model Training with TFX Metadata Store and Pachyderm
12. Deploy the Model to Production with TensorFlow Serving and Istio
13. Save and Download your Workspace
Key Takeaways
Attendees will gain experience training, analyzing, and serving real-world Keras/TensorFlow 2.0 models in production using model frameworks and open-source tools.
Related Links
1. PipelineAI Home: https://pipeline.ai
2. PipelineAI Community Edition: http://community.pipeline.ai
3. PipelineAI GitHub: https://github.com/PipelineAI/pipeline
4. Advanced Spark and TensorFlow Meetup (SF-based, Global Reach): https://www.meetup.com/Advanced-Spark-and-TensorFlow-Meetup
5. YouTube Videos: https://youtube.pipeline.ai
6. SlideShare Presentations: https://slideshare.pipeline.ai
7. Slack Support: https://joinslack.pipeline.ai
8. Web Support and Knowledge Base: https://support.pipeline.ai
9. Email Support: support@pipeline.ai
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowDatabricks
As machine learning evolves from experimentation to serving production workloads, so does the need to effectively manage the end-to-end training and production workflow including model management, versioning, and serving. Clemens Mewald offers an overview of TensorFlow Extended (TFX), the end-to-end machine learning platform for TensorFlow that powers products across all of Alphabet. Many TFX components rely on the Beam SDK to define portable data processing workflows. This talk motivates the development of a Spark runner for Beam Python.
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...Fei Chen
ML platform meetups are quarterly meetups, where we discuss and share advanced technology on machine learning infrastructure. Companies involved include Airbnb, Databricks, Facebook, Google, LinkedIn, Netflix, Pinterest, Twitter, and Uber.
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...gdgsurrey
What We Will Discuss:
Reviewing progress in the machine learning certification journey
𝗦𝗽𝗲𝗰𝗶𝗮𝗹 𝗔𝗱𝗱𝗶𝘁𝗶𝗼𝗻 - Lightning talk on Training an AI Voice Conversion Model Using Google Colab by Adam Berg
Content Review by Vasudev Maduri
Data Preparation and Processing
Solution Architecture with TensorFlow Extended (TFX)
Data Ingestion Challenges and Solutions
Sample Question Review
Previewing next steps and topics, including course completions and material reviews.
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...Simplilearn
This presentation on TensorFlow will help you in understanding what exactly is TensorFlow and how it is used in Deep Learning. TensorFlow is a software library developed by Google for the purposes of conducting machine learning and deep neural network research. In this tutorial, you will learn the fundamentals of TensorFlow concepts, functions, and operations required to implement deep learning algorithms and leverage data like never before. This TensorFlow tutorial is ideal for beginners who want to pursue a career in Deep Learning. Now, let us deep dive into this TensorFlow tutorial and understand what TensorFlow actually is and how to use it.
Below topics are explained in this TensorFlow presentation:
1. What is Deep Learning?
2. Top Deep Learning Libraries
3. Why TensorFlow?
4. What is TensorFlow?
5. What are Tensors?
6. What is a Data Flow Graph?
7. Program Elements in TensorFlow
8. Use case implementation using TensorFlow
Simplilearn’s Deep Learning course will transform you into an expert in deep learning techniques using TensorFlow, the open-source software library designed to conduct machine learning & deep neural network research. With our deep learning course, you’ll master deep learning and TensorFlow concepts, learn to implement algorithms, build artificial neural networks and traverse layers of data abstraction to understand the power of data and prepare you for your new role as deep learning scientist.
Why Deep Learning?
It is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
You can gain in-depth knowledge of Deep Learning by taking our Deep Learning certification training course. With Simplilearn’s Deep Learning course, you will prepare for a career as a Deep Learning engineer as you master concepts and techniques including supervised and unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms. Those who complete the course will be able to:
1. Understand the concepts of TensorFlow, its main functions, operations and the execution pipeline
2. Implement deep learning algorithms, understand neural networks and traverse the layers of data abstraction which will empower you to understand data like never before
3. Master and comprehend advanced topics such as convolutional neural networks, recurrent neural networks, training deep networks and high-level interfaces
4. Build deep learning models in TensorFlow and interpret the results
5. Understand the language and fundamental concepts of artificial neural networks
6. Troubleshoot and improve deep learning models
7. Build your own deep learning project
8. Differentiate between machine learning, deep learning and artificial intelligence
Learn more at: https://www.simplilearn.com
Workshop about TensorFlow usage for AI Ukraine 2016. A brief tutorial with a source code example. Describes TensorFlow's main ideas, terms, and parameters. The example involves a linear neuron model trained with the Adam optimization algorithm.
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...Gabriel Moreira
For real-world ML systems, it is crucial to have scalable and flexible platforms to build ML workflows. In this workshop, we will demonstrate how to build an ML DevOps pipeline using Kubeflow and TensorFlow Extended (TFX). Kubeflow is a flexible environment to implement ML workflows on top of Kubernetes - an open-source platform for managing containerized workloads and services, which can be deployed either on-premises or on a Cloud platform. TFX has a special integration with Kubeflow and provides tools for data pre-processing, model training, evaluation, deployment, and monitoring.
In this workshop, we will demonstrate a pipeline for training and deploying an RNN-based Recommender System model using Kubeflow.
https://papislatam2019.sched.com/event/OV1M/training-and-deploying-ml-models-with-kubeflow-and-tensorflow-extended-tfx-sponsored-by-cit
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...Gabriel Moreira
Talk presented at PAPIs LATAM 2019 by Gabriel Moreira, Rodrigo Pereira and Fabio Uechi, from CI&T
Overview of TensorFlow For Natural Language Processingananth
TensorFlow open sourced recently by Google is one of the key frameworks that support development of deep learning architectures. In this slideset, part 1, we get started with a few basic primitives of TensorFlow. We will also discuss when and when not to use TensorFlow.
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
This talk describes the scale-out, consistent metadata architecture of Hopsworks and how we use it to support custom metadata and provenance for ML Pipelines with Hopsworks Feature Store, NDB, and ePipe . The talk is here: https://www.youtube.com/watch?v=oPp8PJ9QBnU&feature=emb_logo
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
Specialized tools for machine learning development and model governance are becoming essential. MLflow is an open source platform for managing the machine learning lifecycle. Just by adding a few lines of code in the function or script that trains their model, data scientists can log parameters, metrics, artifacts (plots, miscellaneous files, etc.) and a deployable packaging of the ML model. Every time that function or script is run, the results will be logged automatically as a byproduct of those lines of code being added, even if the party doing the training run makes no special effort to record the results. MLflow application programming interfaces (APIs) are available for the Python, R and Java programming languages, and MLflow sports a language-agnostic REST API as well. Over a relatively short time period, MLflow has garnered more than 3,300 stars on GitHub, almost 500,000 monthly downloads and 80 contributors from more than 40 companies. Most significantly, more than 200 companies are now using MLflow. We will demo MLflow Tracking, Project, and Model components with Azure Machine Learning (AML) Services and show you how easy it is to get started with MLflow on-prem or in the cloud.
These slides explain how convolutional neural networks can be coded using Google TensorFlow.
Video available at : https://www.youtube.com/watch?v=EoysuTMmmMc
(date is wrong, sorry, it was January 28th, 2023)
So many intelligent robots have come and gone, failing to become a commercial success. We’ve lost Aibo, Romo, Jibo, Baxter, Scout, and even Alexa is reducing staff. I posit that they failed because you can’t talk to them, not really. AI has recently made substantial progress, speech recognition now actually works, and we have neural networks such as ChatGPT that produce astounding natural language. But you can’t just throw a neural network into a robot because there isn’t anybody home in those networks—they are just purposeless mimicry agents with no understanding of what they are saying. This lack of understanding means that they can’t be relied upon because the mistakes they make are ones that no human would make, not even a child. Like a child first learning about the world, a robot needs a mental model of the current real-world situation and must use that model to understand what you say and generate a meaningful response. This model must be composable to represent the infinite possibilities of our world, and to keep up in a conversation, it must be built on the fly using human-like, immediate learning. This talk will cover what is required to build that mental model so that robots can begin to understand the world as well as human children do. Intelligent robots won’t be a commercial success based on their form—they aren’t as cute and cuddly as cats and dogs—but robots can be much more if we can talk to them.
Generating Natural-Language Text with Neural NetworksJonathan Mugan
Automatic text generation enables computers to summarize text, to have conversations in customer-service and other settings, and to customize content based on the characteristics and goals of the human interlocutor. Using neural networks to automatically generate text is appealing because they can be trained through examples with no need to manually specify what should be said when. In this talk, we will provide an overview of the existing algorithms used in neural text generation, such as sequence2sequence models, reinforcement learning, variational methods, and generative adversarial networks. We will also discuss existing work that specifies how the content of generated text can be determined by manipulating a latent code. The talk will conclude with a discussion of current challenges and shortcomings of neural text generation.
There are lots of frameworks for building chatbots, but those abstractions can obscure understanding and hinder application development. In this talk, we will cover building chatbots from the ground up in Python. This can be done with either classic NLP or deep learning. We will cover both approaches, but this talk will focus on how one can build a chatbot using spaCy, pattern matching, and context-free grammars.
From Natural Language Processing to Artificial IntelligenceJonathan Mugan
Overview of natural language processing (NLP) from both symbolic and deep learning perspectives. Covers tf-idf, sentiment analysis, LDA, WordNet, FrameNet, word2vec, and recurrent neural networks (RNNs).
Deep Learning for Natural Language ProcessingJonathan Mugan
Deep Learning represents a significant advance in artificial intelligence because it enables computers to represent concepts using vectors instead of symbols. Representing concepts using vectors is particularly useful in natural language processing, and this talk will elucidate those benefits and provide an understandable introduction to the technologies that make up deep learning. The talk will outline ways to get started in deep learning, and it will conclude with a discussion of the gaps that remain between our current technologies and true computer understanding.
What Deep Learning Means for Artificial IntelligenceJonathan Mugan
Describes deep learning as applied to natural language processing, computer vision, and robot actions. Also discusses what deep learning still can't do.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes real work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
2. Moving From Our Hut to the Production Floor
• Your model is going to live for a long time. Not just for a demo.
• You must know when to update it. The world changes.
• You must ensure production data matches training data. Data reflects its origins.
• You may need to track multiple model versions. E.g., for different states.
• You need to batch the input to serving. One-at-a-time is slow.
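The training/serving skew concern above can be made concrete with a toy sketch. This is not TFDV; it is a minimal stdlib illustration of flagging serving data whose mean has drifted far from the training distribution (the threshold and statistics are illustrative choices):

```python
import math

def feature_stats(values):
    """Compute simple summary statistics for one numeric feature."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return {"mean": mean, "std": math.sqrt(var)}

def drifted(train_values, serving_values, threshold=3.0):
    """Flag drift when the serving mean falls more than `threshold`
    training standard deviations away from the training mean."""
    train = feature_stats(train_values)
    serve = feature_stats(serving_values)
    if train["std"] == 0:
        return serve["mean"] != train["mean"]
    z = abs(serve["mean"] - train["mean"]) / train["std"]
    return z > threshold

train = [10.0, 12.0, 11.0, 13.0, 9.0]
ok_serving = [11.5, 10.5, 12.5]
bad_serving = [110.0, 120.0, 115.0]  # e.g., units changed upstream
print(drifted(train, ok_serving))   # False
print(drifted(train, bad_serving))  # True
```

Real pipelines compare full feature statistics and schemas, but the shape of the check is the same.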
3. Interchangeable Parts and the ML Revolution
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
Image by https://www.flickr.com/photos/36224933@N07 (CC BY-SA 2.0: https://creativecommons.org/licenses/by-sa/2.0/deed.en)
4. Interchangeable Parts and the ML Revolution
• TensorFlow Extended (TFX)
• TFX used internally by Google and recently open sourced
• Represents your pipeline to production as a sequence of components
• Building any one model is more work, but for large endeavors, TFX helps to keep you organized
5. Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
6. TensorFlow Extended
ML Metadata (MLMD)
• Individual components talk to the metadata store (MLMD).
• MLMD doesn’t store data itself. It stores data about data.
https://www.tensorflow.org/tfx/guide
https://www.tensorflow.org/tfx/guide/mlmd
Each component has three pieces:
1. Driver (gets information from MLMD)
2. Executor (does what the component does)
3. Publisher (writes results to MLMD)
Organized by library (components in parentheses)
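The driver/executor/publisher pattern above can be sketched in a few lines of plain Python. The dict-based "store" here is a hypothetical stand-in for MLMD, and `stats_executor` is an invented toy executor; real TFX components are far richer:

```python
# Toy illustration of the TFX component pattern: a driver reads from
# the metadata store, an executor does the work, and a publisher
# writes results back. The dict is a stand-in for MLMD, which records
# data about data, not the data itself.

class Component:
    def __init__(self, name, executor):
        self.name = name
        self.executor = executor

    def run(self, store, inputs):
        # Driver: look up metadata about the inputs.
        input_meta = [store.get(k, {}) for k in inputs]
        # Executor: do what the component does.
        outputs = self.executor(inputs, input_meta)
        # Publisher: record metadata about the outputs.
        for key, meta in outputs.items():
            store[key] = meta
        return outputs

def stats_executor(inputs, input_meta):
    # Pretend to compute statistics over the ingested examples.
    return {"stats": {"produced_by": "StatisticsGen", "num_inputs": len(inputs)}}

store = {"examples": {"produced_by": "ExampleGen"}}
stats_gen = Component("StatisticsGen", stats_executor)
stats_gen.run(store, ["examples"])
print(sorted(store))  # ['examples', 'stats']
```

Because every component publishes to the same store, downstream components (and humans) can trace how each artifact was produced.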
7. Data Ingestion: ExampleGen
• Pulls in your data and puts it into a binary format
• Also splits it into train and test
• Uses Protocol Buffers: tf.Example records stored in a TFRecord file
https://www.tensorflow.org/tutorials/load_data/tfrecord
https://www.tensorflow.org/tfx/guide/examplegen
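One detail worth making concrete is the train/test split. A common trick (and roughly what hash-based splitting does in practice) is to route each example by hashing a stable id, so the split is deterministic across pipeline reruns. A stdlib sketch, not ExampleGen's actual implementation:

```python
import hashlib

def split_example(example_id, test_fraction=0.2):
    """Deterministically route an example to 'train' or 'test' by
    hashing its id, so the split is stable across pipeline runs."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    bucket = digest[0] / 255.0  # map the first hash byte to [0, 1]
    return "test" if bucket < test_fraction else "train"

examples = [f"example-{i}" for i in range(1000)]
splits = {"train": [], "test": []}
for ex in examples:
    splits[split_example(ex)].append(ex)

# Roughly an 80/20 split, and re-running produces the identical assignment.
print(len(splits["train"]) + len(splits["test"]))  # 1000
```

Determinism matters: a random split would shuffle examples between train and test every run, quietly leaking test data into training.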
8. TensorFlow Data Validation: StatisticsGen, SchemaGen, Example Validator
• Looks at your data and generates a schema, which you manually update.
• Makes sure the data you pass in later during serving is still in the same format and hasn’t drifted.
• Also has a great way to visualize data, Facets, which we will see later.
https://www.tensorflow.org/tfx/guide/tfdv
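The infer-then-validate loop described above can be sketched without TFDV. This toy version infers a schema (feature name to type) from training examples and reports anomalies in serving-time examples; real TFDV schemas also cover domains, value distributions, and presence constraints:

```python
def infer_schema(examples):
    """Infer a minimal schema (feature name -> type name) from training data."""
    schema = {}
    for ex in examples:
        for name, value in ex.items():
            schema.setdefault(name, type(value).__name__)
    return schema

def validate(example, schema):
    """Return a list of anomalies for one serving-time example."""
    anomalies = []
    for name, expected in schema.items():
        if name not in example:
            anomalies.append(f"missing feature: {name}")
        elif type(example[name]).__name__ != expected:
            anomalies.append(f"type mismatch for {name}")
    return anomalies

train = [{"age": 34, "city": "Austin"}, {"age": 28, "city": "Dallas"}]
schema = infer_schema(train)
print(schema)                                             # {'age': 'int', 'city': 'str'}
print(validate({"age": "34", "city": "Austin"}, schema))  # ['type mismatch for age']
print(validate({"age": 41}, schema))                      # ['missing feature: city']
```

The "which you manually update" step is the key: an inferred schema is a starting point that a human tightens, not ground truth.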
9. Example Schema
10. TensorFlow Transform: Transform
Converts your data:
• E.g., one-hot encoding, categorical features with a vocabulary
• Part of the TensorFlow graph, for better or worse
• Good for transformations that require looking at all values
Example: tft.scale_to_z_score subtracts the mean and divides by the standard deviation.
Features come in many types, and TensorFlow Transform converts them into a format that can be ingested by a machine learning model. Nice to have this explicit.
https://www.tensorflow.org/tfx/guide/transform
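The z-score example illustrates why "looking at all values" matters: the mean and standard deviation need a full pass over the training data (the analyze phase), and then the same constants must be applied at both training and serving time. A stdlib sketch of that two-phase idea, not tf.Transform itself:

```python
import math

def fit_z_score(values):
    """Analyze phase: a full pass over the training data to compute
    the mean and standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std

def apply_z_score(value, mean, std):
    """Transform phase: applied identically at training and serving time."""
    return (value - mean) / std

ages = [20.0, 30.0, 40.0]
mean, std = fit_z_score(ages)  # mean=30.0, std≈8.165
print(round(apply_z_score(40.0, mean, std), 3))  # 1.225
```

Baking the transform into the graph means serving code cannot accidentally recompute the statistics on serving data, which is a classic source of training/serving skew.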
11. Estimator or Keras Model: Trainer
• Trains the model: the part we are all familiar with
• Except it uses an Estimator
• Can use Keras via tf.keras.estimator.model_to_estimator()
https://www.tensorflow.org/tfx/guide/trainer
12. TensorFlow Model Analysis: Evaluator, Model Validator
Evaluator Component
• Evaluates the model.
• Uses TensorFlow Model Analysis (TFMA), which we will see shortly.
• https://www.tensorflow.org/tfx/guide/evaluator
Model Validator Component
• You set a baseline (such as the current serving model) and a metric (such as AUC).
• Marks in the metadata whether the model passes the baseline.
• https://www.tensorflow.org/tfx/guide/modelval
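The Model Validator's job reduces to a comparison plus a metadata record. A hedged sketch of that logic (the "blessed" terminology follows TFX; the dict shapes here are invented for illustration):

```python
def validate_model(candidate_metrics, baseline_metrics, metric="auc"):
    """Bless the candidate only if it matches or beats the current
    serving model on the chosen metric; return the outcome as a
    metadata record."""
    blessed = candidate_metrics[metric] >= baseline_metrics[metric]
    return {"metric": metric,
            "candidate": candidate_metrics[metric],
            "baseline": baseline_metrics[metric],
            "blessed": blessed}

outcome = validate_model({"auc": 0.91}, {"auc": 0.88})
print(outcome["blessed"])  # True
```

Recording the outcome (rather than just acting on it) is what lets the Pusher, and any human auditor, see exactly why a model did or did not ship.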
13. Validation Outcomes: Pusher
• Pushes the model to Serving if it is validated
• I.e., if your new model is better than the existing model, push it to the model server.
For a deeper understanding, see Ice-T’s 1988 hit song, “I’m Your Pusher.”
https://www.tensorflow.org/tfx/guide/pusher
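Mechanically, a push can be as simple as copying a validated model into a numbered version directory, since TensorFlow Serving conventionally loads the highest-numbered subdirectory of a model's base path. A stdlib sketch of that convention, not the TFX Pusher itself:

```python
import os
import shutil
import tempfile

def push(model_dir, serving_base, blessed):
    """Copy a validated model into the serving directory under the
    next version number. Returns the destination, or None if the
    model was not blessed."""
    if not blessed:
        return None
    versions = [int(d) for d in os.listdir(serving_base) if d.isdigit()]
    next_version = max(versions, default=0) + 1
    dest = os.path.join(serving_base, str(next_version))
    shutil.copytree(model_dir, dest)
    return dest

model_dir = tempfile.mkdtemp()
open(os.path.join(model_dir, "saved_model.pb"), "w").close()
serving_base = tempfile.mkdtemp()
print(push(model_dir, serving_base, blessed=False))               # None
print(os.path.basename(push(model_dir, serving_base, blessed=True)))  # 1
```

Gating the copy on the validation outcome is the whole point: an unblessed model never reaches the directory the server watches.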
14. TensorFlow Serving
• Uses the model to perform inference
• Called via gRPC or RESTful APIs
• Easy to get running with Docker
• You can call a particular version of a model
• Takes care of batching
https://www.tensorflow.org/tfx/guide/serving
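TensorFlow Serving's REST API accepts a POST to `/v1/models/<name>:predict` (optionally with `/versions/<n>` to pin a version, per the bullet above) whose JSON body carries an `instances` list. The sketch below only constructs the URL and payload; actually sending it (e.g., with urllib.request) requires a running server, which we don't assume here:

```python
import json

def predict_request(host, model_name, instances, version=None):
    """Build the URL and JSON body for a TensorFlow Serving REST call."""
    model_path = f"/v1/models/{model_name}"
    if version is not None:
        model_path += f"/versions/{version}"  # pin a specific model version
    url = f"http://{host}:8501{model_path}:predict"  # 8501 is the default REST port
    body = json.dumps({"instances": instances})
    return url, body

url, body = predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]], version=2)
print(url)   # http://localhost:8501/v1/models/my_model/versions/2:predict
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```

Sending several instances in one `instances` list is also the simplest way to exploit the server-side batching mentioned above.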
15. Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
16. Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• ML Metadata
• Apache Airflow
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
17. Metadata Store
ML Metadata (MLMD)
https://www.tensorflow.org/tfx/guide/mlmd
24. Pipeline Management with Apache Airflow
• You can also use Kubeflow https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow_gcp.py
• And Apache Beam https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_beam.py
25. Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Validation (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
26. TensorFlow 2.0
• Don’t have to define the graph separately
• More like PyTorch
• There are two ways you can do computation:
• Eager: like PyTorch, just compute
• tf.function: You decorate a function and call it
28. Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Validation (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
29.
Data
• tf.train.Example is a protobuf message holding named tf.train.Feature values, where each value has a type (tf.train.BytesList, tf.train.FloatList, tf.train.Int64List)
• TFRecord is a format for storing sequences of binary records; each record is a serialized tf.train.Example
• tf.data.Dataset can take in TFRecords and create an iterator for batching
• tf.io.parse_example (tf.parse_example in TF 1.x) unpacks a tf.train.Example into standard tensors
https://www.tensorflow.org/tutorials/load_data/tf_records
Features
• tf.feature_column is where you specify how a feature feeds the model, such as one-hot, vocabulary lookup, or embedding
https://www.tensorflow.org/guide/feature_columns
tf.train.Example describes data for storage; tf.feature_column describes data as input to a model.
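A minimal round trip through these pieces, using the TF 2.x names (the file name and feature names are made up for illustration):

```python
import tensorflow as tf

# A tf.train.Example maps feature names to typed value lists.
example = tf.train.Example(features=tf.train.Features(feature={
    "age":  tf.train.Feature(int64_list=tf.train.Int64List(value=[42])),
    "name": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"Anita"])),
}))

# A TFRecord file stores serialized Examples as binary records.
path = "demo.tfrecord"
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# tf.data.Dataset reads the records back; parsing unpacks them into tensors.
spec = {
    "age":  tf.io.FixedLenFeature([], tf.int64),
    "name": tf.io.FixedLenFeature([], tf.string),
}
dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, spec)).batch(1)
for batch in dataset:
    print(batch["age"].numpy(), batch["name"].numpy())  # [42] [b'Anita']
```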
31.
To build a model, you need:
• format of model input
• tf.feature_column
• model architecture and hyperparameters
• tf.estimator
• (or Keras, with tf.keras.estimator.model_to_estimator)
• function to deliver training data
• tf.estimator.TrainSpec from tf.data
• function to deliver eval data
• tf.estimator.EvalSpec from tf.data
• function to deliver serving data
• tf.estimator.FinalExporter
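Wired together, the checklist looks roughly like this sketch; the feature name "x" and the toy data are invented, and note that the Estimator API is deprecated in recent TensorFlow releases:

```python
import tensorflow as tf

# Format of the model input: one numeric feature (the name "x" is made up).
feature_columns = [tf.feature_column.numeric_column("x")]

# Function to deliver data, built on tf.data (toy set: label is 1 when x > 0).
def input_fn():
    features = {"x": [[-1.0], [1.0], [-2.0], [2.0]]}
    labels = [0, 1, 0, 1]
    return tf.data.Dataset.from_tensor_slices((features, labels)).repeat().batch(4)

# Model architecture and hyperparameters via a canned estimator.
estimator = tf.estimator.LinearClassifier(feature_columns=feature_columns)

train_spec = tf.estimator.TrainSpec(input_fn=input_fn, max_steps=20)
eval_spec = tf.estimator.EvalSpec(input_fn=input_fn, steps=1)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
print(estimator.evaluate(input_fn, steps=1)["accuracy"])
```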
32.
TensorFlow Estimator
• Estimator is a wrapper for regular TensorFlow that automatically scales to multiple machines and automatically writes results to TensorBoard
Shout out to model explainability with boosted-trees estimators:
https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding
https://www.tensorflow.org/tutorials/estimator/boosted_trees
34.
TensorBoard
• Plot of prescribers: red indicates more overdose, green fewer
• 3D plots are not that useful, but they look cool
• You can even use TensorBoard from PyTorch
https://pytorch.org/docs/stable/tensorboard.html
36.
TensorFlow Data Validation (TFDV)
• We need to understand our data as well as possible.
• TFDV provides tools that make that easier.
• Helps identify bugs in the data by visualizing statistics and distributions that don’t look right.
• https://www.tensorflow.org/tfx/data_validation/get_started
40.
In general, we can make sure the distributions are what we would expect.
42.
TensorFlow Model Analysis (TFMA)
We can see how well our model does by each slice.
We see that this model does much better for females than males.
https://www.tensorflow.org/tfx/guide/tfma
44.
What-If Tool
• The What-If Tool applies a model from TensorFlow Serving to any data you give it
• https://pair-code.github.io/what-if-tool/index.html
• Change a record and see what the model does
• Find the most similar record with a different classification
• Can be used for fairness: adjust the model so it is equally likely to predict “yes” for each group
• https://www.coursera.org/lecture/machine-learning-business-professionals/activity-applying-fairness-concerns-with-the-what-if-tool-review-0mYda
48.
Alternatives (kind of)
• MLflow https://mlflow.org/docs/latest/index.html
• Netflix Metaflow https://github.com/Netflix/metaflow
• Sacred https://github.com/IDSIA/sacred
• Dataiku DSS https://www.dataiku.com/product/
• Polyaxon https://polyaxon.com/
• Facebook Ax https://www.ax.dev/
They all do something a little different,
with pieces straddling different sides of the data science/production divide
49.
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Streamlit Dashboard
• Python Typing, Dataclasses, and Enum
• Conclusion
50.
Streamlit Dashboard
• Writes to the browser
• Works well for artifacts in the ML pipeline
https://github.com/streamlit/streamlit/
https://streamlit.io/docs/getting_started.html
52.
Typing, Dataclasses, and Enum
You can build interchangeable parts right in Python.
Not new of course, but they make Python a little less wild west.
Output:
[Turtle(size=6, name='Anita'), Turtle(size=2, name='Anita')]
54.
TFX Disadvantages
• Steep learning curve
• Changes constantly (but not while you are watching it)
• Somewhat inflexible: you can create your own components, but that also has a steep learning curve
• No hyperparameter search (yet, https://github.com/tensorflow/tfx/issues/182)
55.
TFX Advantages
• Set up to scale
• Documents your process through artifacts
• Warm-starting: as new data comes in, you don’t have to
start training over. Keeps models fresh
• Tools to see data and debug problems
• Don’t have to rerun what has already been run
56.
Where to Start
• Jupyter notebook tutorial
https://www.tensorflow.org/tfx/tutorials/tfx/components
• Airflow tutorial
https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
57.
Happy Hour!
6500 River Place Blvd.
Bldg. 3, Suite 120
Austin, TX. 78730
Jonathan Mugan, Ph.D.
Email: jmugan@deumbra.com
58.
Appendix
• Original TFX paper
https://ai.google/research/pubs/pub46484
• Documentation
• https://www.tensorflow.org/tfx
• https://www.tensorflow.org/tfx/tutorials
• https://www.tensorflow.org/tfx/guide
• https://www.tensorflow.org/tfx/api_docs