A new paper published by OpenAI discusses generalization in deep learning and offers observations on how model and data complexity influence each other.
How artificial intelligence is revolutionizing learning and development pract... by Charles Cotter, PhD
How artificial intelligence is revolutionizing and disrupting learning and development practices throughout the ADDIE value chain - analysis, design, development, delivery and evaluation
Computer Graphics and Multimedia lab report by Bijoy679
For the National University (NU) of Bangladesh BSc (Hons) in CSE, 6th semester: solutions for this subject, organized by board question.
Machine Learning: Applications, Process and Techniques by Rui Pedro Paiva
Machine learning can be applied across many domains such as business, entertainment, medicine, and software engineering. The document outlines the machine learning process which includes data collection, feature extraction, model learning, and evaluation. It also provides examples of machine learning applications in various domains, such as using decision trees to make credit decisions in business, classifying emotions in music for playlist generation in entertainment, and detecting heart murmurs from audio data in medicine.
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi... by Jinwon Lee
The document summarizes a study on training Vision Transformers (ViTs) by exploring different combinations of data augmentation, regularization techniques, model sizes, and training dataset sizes. Some key findings include: 1) Models trained with extensive data augmentation on ImageNet-1k performed comparably to those trained on the larger ImageNet-21k dataset without augmentation. 2) Transfer learning from pre-trained models was more efficient and achieved better results than training models from scratch, even with extensive compute. 3) Models pre-trained on more data showed better transfer ability, indicating more data yields more generic representations.
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E... by Chester Chen
Topic:
NVIDIA FLARE: Federated Learning Application Runtime Environment for Developing Robust AI Models
Summary:
Federated learning (FL) enables building robust and generalizable AI models by leveraging diverse datasets from multiple collaborators without moving data. We created NVIDIA FLARE as an open-source SDK to make it easier for data scientists to use FL in their research. The SDK allows existing machine learning and deep learning workflows to be adapted for distributed learning across enterprises, and enables platform developers to build secure, privacy-preserving offerings for multiparty collaboration using homomorphic encryption or differential privacy. The SDK is a lightweight, flexible, and scalable Python package that allows researchers to bring their data science workflows implemented in any training library (PyTorch, TensorFlow, or even NumPy) and apply them in real-world FL settings. This talk will introduce the key design principles of NVIDIA FLARE and illustrate use cases (e.g., COVID analysis) with customizable FL workflows that implement different privacy-preserving algorithms.
Speaker: Dr. Holger Roth (NVIDIA)
Holger Roth is a Sr. Applied Research Scientist at NVIDIA focusing on deep learning for medical imaging. He has been working closely with clinicians and academics over the past several years to develop deep learning-based medical image computing and computer-aided detection models for radiological applications. He is an Associate Editor for IEEE Transactions on Medical Imaging and holds a Ph.D. from University College London, UK. In 2018, he was awarded the MICCAI Young Scientist Publication Impact Award.
Provides a brief overview of what machine learning is, how it works (theory), how to prepare data for a machine learning problem, an example case study, and additional resources.
This document provides an introduction to XGBoost, including:
1. XGBoost is an important machine learning library that is commonly used by winners of Kaggle competitions.
2. A quick example is shown using XGBoost to predict diabetes based on patient data, achieving good results with only 20 lines of simple code.
3. XGBoost works by creating an ensemble of decision trees through boosting, and focuses on explaining concepts at a high level rather than detailed algorithms.
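The "20 lines of simple code" are not reproduced in this summary, but a pipeline of that shape can be sketched. This is a stand-in, not the deck's code: it uses a synthetic dataset in place of the patient data, and scikit-learn's GradientBoostingClassifier in place of xgboost.XGBClassifier, whose scikit-learn-compatible fit/predict API would be an essentially drop-in swap.

```python
# Sketch of a short boosted-trees classification pipeline (stand-in for
# the XGBoost example summarized above; synthetic data, not the deck's).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic "patient data": 8 numeric features, binary outcome.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# An ensemble of shallow decision trees built by boosting.
model = GradientBoostingClassifier(n_estimators=100, max_depth=3,
                                   learning_rate=0.1, random_state=0)
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
```

With xgboost installed, replacing the estimator with `xgboost.XGBClassifier(...)` leaves the rest of the script unchanged.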
Prompt engineering is a fundamental concept within the field of artificial intelligence, with particular relevance to natural language processing. It involves the strategic embedding of task descriptions within the input data of an AI system, often in the form of a question or query, as opposed to explicitly providing the task description separately. This approach optimizes the efficiency and effectiveness of AI models by encapsulating the desired outcome within the input context, thereby enabling more streamlined and context-aware responses.
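The embedding idea above can be made concrete with a few lines of plain Python: the task description travels inside the model input itself rather than as a separate field. The template below is an illustrative assumption, not a prescribed format.

```python
# Prompt engineering in miniature: the task description is embedded
# directly in the input string handed to a language model.
def build_prompt(task_description: str, payload: str) -> str:
    """Embed the task in the input itself rather than passing it separately."""
    return f"{task_description}\n\nText: {payload}\nAnswer:"

prompt = build_prompt(
    "Classify the sentiment of the text as positive or negative.",
    "The battery life on this laptop is fantastic.",
)
```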
Presenting the landscape of AI/ML in 2023: a quick summary of the last 10 years of progress, the current situation, and a look at what is happening behind the scenes.
Transfer Learning
What Is Transfer Learning?
How Does Transfer Learning Work?
Why Is Transfer Learning Used?
When Should Transfer Learning Be Used?
Approaches to Transfer Learning
Supervised Machine Learning Techniques common algorithms and its application by Tara ram Goyal
The document provides an introduction to supervised machine learning, including definitions, techniques, and applications. It discusses how supervised machine learning involves training algorithms using labeled input data to make predictions on unlabeled data. Some common supervised learning algorithms mentioned are naive Bayes, decision trees, linear regression, support vector machines, and neural networks. Applications discussed include self-driving cars, online recommendations, fraud detection, and spam filtering. The key difference between supervised and unsupervised learning is that supervised learning uses labeled training data while unsupervised learning does not have pre-existing labels.
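The train-on-labeled, predict-on-unlabeled loop described above can be sketched from scratch with a deliberately tiny classifier; a 1-nearest-neighbour rule stands in here for the heavier algorithms the document lists (the spam/ham data is made up for illustration).

```python
# Supervised learning in miniature: labeled examples "train" a model,
# which then predicts labels for unseen inputs. The model here is a
# from-scratch 1-nearest-neighbour rule.
def predict_1nn(train_X, train_y, x):
    """Label x with the label of its closest training example."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

# Labeled training data: two clusters with labels "ham" / "spam".
train_X = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
train_y = ["ham", "ham", "spam", "spam"]

# Predict the label of a new, unlabeled point near the spam cluster.
pred = predict_1nn(train_X, train_y, (4.8, 5.0))
```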
This document discusses XGBoost, an optimized distributed gradient boosting library. It begins by explaining what problems XGBoost can solve like binary classification, regression, and ranking. It then discusses the key concepts in XGBoost including boosted trees, GBDT, tree ensembles, and additive training. XGBoost builds an ensemble of trees using gradient boosting and additive training to minimize loss. It provides efficient algorithms for split finding to construct trees level-by-level to maximize the loss drop at each step.
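The additive-training idea above can be shown from scratch: each round fits a one-split "stump" to the residuals of the current ensemble, shrinks it, and adds it in, greedily reducing squared loss. This is a minimal sketch of gradient boosting for regression, not XGBoost's actual split-finding algorithm.

```python
import numpy as np

# Additive training with decision stumps: each new weak learner is fit
# to the residuals of the ensemble so far.
def fit_stump(x, residual):
    """Find the split on x that best reduces squared error of the residual."""
    best = None
    for s in x:
        left, right = residual[x <= s], residual[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        err = (((left - left.mean()) ** 2).sum()
               + ((right - right.mean()) ** 2).sum())
        if best is None or err < best[0]:
            best = (err, s, left.mean(), right.mean())
    return best[1:]  # (split point, left value, right value)

def stump_predict(stump, x):
    s, lv, rv = stump
    return np.where(x <= s, lv, rv)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 80)
y = np.sin(x) + rng.normal(0, 0.1, 80)

pred = np.zeros_like(y)
lr = 0.5                                   # shrinkage (learning rate)
for _ in range(50):                        # boosting rounds
    stump = fit_stump(x, y - pred)         # fit the current residuals
    pred += lr * stump_predict(stump, x)   # shrink and add to the ensemble

mse = ((y - pred) ** 2).mean()
```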
NVIDIA BioBERT, an optimized version of BioBERT, was created specifically for biomedical and clinical domains, providing this community easy access to state-of-the-art NLP models.
This document provides an overview of machine learning. It defines machine learning as a form of artificial intelligence that allows systems to automatically learn and improve from experience without being explicitly programmed. The document then discusses why machine learning is important, how it works by exploring data and identifying patterns with minimal human intervention, and provides examples of machine learning applications like autonomous vehicles. It also summarizes the main types of machine learning: supervised learning, unsupervised learning, reinforcement learning, and deep learning. Finally, it distinguishes machine learning from deep learning and defines data science.
Machine learning works by processing data to discover patterns that can be used to analyze new data. Popular programming languages for machine learning include Python, R, and SQL. There are several types of machine learning including supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning. Common machine learning tasks involve classification, regression, clustering, dimensionality reduction, and model selection. Machine learning is widely used for applications such as spam filtering, recommendations, speech recognition, and machine translation.
여욱형 / Kakao Corp., Video Technology Team
---
The goal of video encoding is to minimize data size while maintaining good visual quality. Many combinations of encoding options are possible, and the quality and size of the output vary with those options. Until now, fixed encoding options have been applied per resolution, regardless of the type and characteristics of the video. Encoding all content with the same options produces more data than necessary, which can generate needless traffic. We propose an approach that builds a model pre-trained via deep learning, analyzes various quality-related features extracted from sampled source video, and finds the optimal encoding options in real time.
The document discusses computer vision with deep learning. It provides an overview of convolutional neural networks and their use in computer vision applications like image classification and object detection. Specifically, it discusses how CNNs use convolutional layers to learn visual features from images and provide examples of CNNs being used for pipeline defect classification and filler cap quality control.
IRJET- Breast Cancer Prediction using Supervised Machine Learning Algorithms by IRJET Journal
This document describes a study that used machine learning algorithms to predict breast cancer. The researchers trained decision tree, logistic regression, and random forest models on a breast cancer dataset. They preprocessed the data, which included converting categorical variables to numeric. The random forest model achieved the best prediction accuracy of 98.6%. The study aims to help detect breast cancer at earlier stages to improve treatment outcomes.
The document discusses generative models and their applications in artificial intelligence. Generative adversarial networks (GANs) use two neural networks, a generator and discriminator, that compete against each other. The generator learns to generate new data that looks real by fooling the discriminator, while the discriminator learns to better identify real from fake data. GANs have been used for tasks like image generation and neural style transfer. They show potential to generate art, music and other creative forms through machine learning.
Deep Learning And Business Models (VNITC 2015-09-13) by Ha Phuong
Deep Learning and Business Models
Tran Quoc Hoan discusses deep learning and its applications, as well as potential business models. Deep learning has led to significant improvements in areas like image and speech recognition compared to traditional machine learning. Some business models highlighted include developing deep learning frameworks, building hardware optimized for deep learning, using deep learning for IoT applications, and providing deep learning APIs and services. Deep learning shows promise across many sectors but also faces challenges in fully realizing its potential.
How Azure helps to build better business processes and customer experiences w... by Maxim Salnikov
The document discusses how Azure helps build better business processes and customer experiences with AI. It provides an overview of Azure OpenAI and its capabilities for various industries like finance, marketing, and HR. The document also includes examples of how companies like CarMax and Strabag SE have used Azure OpenAI to improve efficiency, reduce costs, and provide better customer service.
This document provides an example of hybrid inheritance in C# by defining four classes - A, B, C, and D - where B inherits from A, C inherits from A, and D inherits from B. It demonstrates hybrid inheritance by creating instances of classes C and D, and calling methods on each to output their property values.
This document discusses techniques for improving deep learning models and reducing overfitting, including regularization, batch normalization, and transfer learning. It provides explanations and examples of common regularization techniques like weight decay, dropout, and early stopping. It also explains batch normalization and how it helps speed up training and reduce internal covariate shift. Finally, it introduces transfer learning as a way to utilize pre-trained models on new tasks by freezing earlier layers and fine-tuning later layers.
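Two of the regularizers listed above can be sketched as single NumPy steps; the numbers are illustrative, and real frameworks apply these inside full training loops rather than as standalone operations.

```python
import numpy as np

# 1) Weight decay (L2 regularization): every SGD step shrinks the
#    weights slightly on top of the gradient update:
#        w <- w - lr * (grad + lam * w)
w = np.array([1.0, -2.0, 0.5])
grad = np.array([0.1, 0.1, 0.1])
lr, lam = 0.1, 0.01
w_new = w - lr * (grad + lam * w)

# 2) Inverted dropout: randomly zero activations during training and
#    scale the survivors by 1/(1-p), so the expected activation is
#    unchanged and no rescaling is needed at inference time.
rng = np.random.default_rng(0)
p = 0.5                                   # drop probability
a = np.ones(100_000)                      # a layer of activations
mask = (rng.random(a.shape) >= p) / (1 - p)
a_dropped = a * mask                      # mean stays close to 1.0
```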
The Deep Bootstrap: Good Online Learners are Good Offline Generalizers (paper review)
Accepted at ICLR 2021
A joint Harvard/Google study
https://arxiv.org/abs/2010.08127
The paper introduces the Deep Bootstrap framework for understanding generalization in deep learning models. It shows that good online learners, which achieve low test error when trained on fresh mini-batches, also generalize well offline. Specifically:
1. The framework decomposes test error into the error of an online learner trained on fresh data, plus a bootstrap error term measuring the difference between online and offline test errors.
2. For models that are good online learners, the bootstrap error is uniformly small up to a stopping time T0, meaning online and offline test errors are nearly equivalent.
3. This implies that algorithms producing good online learners will also yield models with good offline generalization, even when training on fixed data.
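In notation (reconstructed here from the summary rather than copied from the paper), with $f_t$ the model after $t$ steps of training on the fixed dataset and $\tilde f_t$ the online learner trained on fresh mini-batches, the decomposition in point 1 reads:

```latex
\underbrace{\mathrm{TestError}(f_t)}_{\text{fixed data (offline)}}
  \;=\;
\underbrace{\mathrm{TestError}(\tilde f_t)}_{\text{fresh data (online)}}
  \;+\;
\underbrace{\Bigl[\,\mathrm{TestError}(f_t)-\mathrm{TestError}(\tilde f_t)\,\Bigr]}_{\text{bootstrap error}}
```

Point 2 then says the bracketed bootstrap error stays uniformly small for $t \le T_0$, so the two test-error curves nearly coincide up to that stopping time.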
The document discusses different ensemble techniques for combining multiple machine learning models to improve predictive performance. It explains that no single model is perfect due to limitations in algorithms, data, and other factors. Ensemble methods aim to address this by combining the predictions of multiple models to obtain a stronger ensemble model. Specific techniques covered include bagging, random forests, boosting, and different ways of combining model predictions. Examples are provided to illustrate how these techniques work.
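The core claim above, that combining imperfect models beats any single one, can be checked with a tiny simulation: three independent 70%-accurate classifiers combined by majority vote. The independence assumption is idealized; real ensembles gain less when members make correlated errors.

```python
import random

# Majority voting over three independent, equally accurate classifiers.
random.seed(0)
n, p_correct = 20_000, 0.7

def one_vote():
    """Simulate one classifier's verdict: 1 = correct, 0 = wrong."""
    return 1 if random.random() < p_correct else 0

single_hits, ensemble_hits = 0, 0
for _ in range(n):
    votes = [one_vote(), one_vote(), one_vote()]
    single_hits += votes[0]                        # one model alone
    ensemble_hits += 1 if sum(votes) >= 2 else 0   # majority vote

single_acc = single_hits / n      # about 0.70
ensemble_acc = ensemble_hits / n  # about p^3 + 3 p^2 (1-p) = 0.784
```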
Overfitting and underfitting are modeling errors related to how well a model fits training data. Overfitting occurs when a model is too complex and fits the training data too closely, resulting in poor performance on new data. Underfitting occurs when a model is too simple and does not fit the training data well. The bias-variance tradeoff aims to balance these issues by finding a model complexity that minimizes total error.
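The complexity trade-off described above shows up clearly in a small polynomial-fitting experiment (the data here is synthetic, chosen only to illustrate the effect): a very high-degree fit drives training error toward zero while the error on held-out data stays stuck near or above the noise floor.

```python
import numpy as np

# Overfitting demo: fit polynomials of low and high degree to 20 noisy
# points drawn from y = x^2 + noise, then compare train vs. test error.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(-1, 1, 20))
y_train = x_train ** 2 + rng.normal(0, 0.1, 20)
x_test = np.sort(rng.uniform(-1, 1, 200))
y_test = x_test ** 2 + rng.normal(0, 0.1, 200)

def fit_eval(degree):
    """Return (train MSE, test MSE) for a degree-`degree` polynomial fit."""
    coefs = np.polyfit(x_train, y_train, degree)
    tr = ((np.polyval(coefs, x_train) - y_train) ** 2).mean()
    te = ((np.polyval(coefs, x_test) - y_test) ** 2).mean()
    return tr, te

tr_simple, te_simple = fit_eval(2)     # complexity matched to the data
tr_complex, te_complex = fit_eval(15)  # overfits the 20 training points
```

The degree-15 fit always achieves lower training error (its basis contains the degree-2 one), but its test error cannot drop below the noise variance and typically exceeds it by a wide margin.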
This document discusses methods for building machine learning models that can handle concept drift and evolving data distributions when classifying tweets in real-time. It proposes using both a global deep learning model and a local online learning model that incorporates feedback. The local model, which uses an algorithm like Crammer's PA-II, adapts quickly to feedback but is prone to bias towards one class. The document suggests combining the models through online stacking into an ensemble called "glocal" and detecting concept drift periodically to replace outdated models. Handling concept drift and evolving data is important for domains with changing user preferences, markets, or adversarial settings.
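The PA-II update referenced above (Crammer et al.'s Passive-Aggressive II) can be sketched in a few lines: on each streamed example the weights move just enough to shrink the hinge loss, with the step size capped by the aggressiveness parameter C. The toy stream below is made up for illustration.

```python
import numpy as np

# One PA-II online update for binary labels y in {-1, +1}:
#   loss = max(0, 1 - y * w.x)
#   tau  = loss / (||x||^2 + 1/(2C))
#   w   <- w + tau * y * x
def pa2_update(w, x, y, C=1.0):
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    tau = loss / (np.dot(x, x) + 1.0 / (2.0 * C))
    return w + tau * y * x

# A tiny linearly separable stream.
stream = [(np.array([1.0, 0.0]), 1), (np.array([0.0, 1.0]), -1),
          (np.array([0.9, 0.1]), 1), (np.array([0.1, 1.1]), -1)]

w = np.zeros(2)
for _ in range(5):             # a few passes over the stream
    for x, y in stream:
        w = pa2_update(w, x, y)

correct = sum(1 for x, y in stream if np.sign(np.dot(w, x)) == y)
```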
MACHINE LEARNING YEAR DL SECOND PART.pptx by NAGARAJANS68
The document discusses various concepts related to machine learning models including prediction errors, overfitting, underfitting, bias, variance, hyperparameter tuning, and regularization techniques. It provides explanations of key terms and challenges in machine learning like the curse of dimensionality. Cross-validation methods like k-fold are presented as ways to evaluate model performance on unseen data. Optimization algorithms such as gradient descent and stochastic gradient descent are covered. Regularization techniques like Lasso, Ridge, and Elastic Net are introduced.
Tricking a DNN with adversarial examples by Ojasava Paras
Adversarial examples are inputs designed to cause machine learning models to make mistakes. The perturbations are difficult for humans to notice, yet they reliably fool models. Neural networks and other models are vulnerable due to overfitting or excessive linearity. Attacks work by computing small input perturbations via backpropagation that flip the model's predictions. Defending against them is challenging because the adversarial crafting process is difficult to model theoretically and models must generalize to all inputs.
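The perturbation step above can be sketched without any backprop library by using a fixed linear (logistic) model, where the gradient of the loss with respect to the input has a closed form. This is the fast-gradient-sign idea in miniature; the weights and input below are made-up numbers.

```python
import numpy as np

# FGSM-style perturbation against a fixed logistic model:
#   x_adv = x + eps * sign(grad_x loss)
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(x, y, w):
    p = sigmoid(np.dot(w, x))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

w = np.array([1.0, -2.0, 0.5])   # fixed "trained" weights (made up)
x = np.array([0.5, -0.5, 1.0])   # clean input
y = 1                            # true label

# Gradient of the loss w.r.t. the INPUT (not the weights): (p - y) * w.
grad_x = (sigmoid(np.dot(w, x)) - y) * w

eps = 0.25
x_adv = x + eps * np.sign(grad_x)  # small signed step raises the loss
```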
The document provides guidelines for training deep neural networks (DNNs). It discusses obtaining large, clean training datasets and using data augmentation. It recommends tanh or ReLU activation functions to avoid problems with sigmoid functions. The number of hidden units and layers should be optimized, and weights initialized randomly. Learning rates can use adaptive methods like Adam. Hyperparameter tuning is best done with random search instead of grid search. Mini-batch training provides faster learning than stochastic methods. Dropout helps prevent overfitting.
Multi task learning stepping away from narrow expert models 7.11.18 by Cloudera, Inc.
Join this webinar as Friederike Schüür covers:
A conceptual introduction to multi-task learning (MTL), how and why it works
A technical deep dive, from MTL random forests to MTL neural networks
Applications of MTL, from structured data to text and images
The benefits of MTL to organizations, from financial services to healthcare and agriculture
VSSML17 L2. Ensembles and Logistic Regressions by BigML, Inc
Valencian Summer School in Machine Learning 2017 - Day 1
Lecture 2: Ensembles and Logistic Regressions. By Poul Petersen (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
The document discusses hyperparameters and hyperparameter tuning in deep learning models. It defines hyperparameters as parameters that govern how the model parameters (weights and biases) are determined during training, in contrast to model parameters which are learned from the training data. Important hyperparameters include the learning rate, number of layers and units, and activation functions. The goal of training is for the model to perform optimally on unseen test data. Model selection, such as through cross-validation, is used to select the optimal hyperparameters. Training, validation, and test sets are also discussed, with the validation set used for model selection and the test set providing an unbiased evaluation of the fully trained model.
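The model-selection recipe above can be sketched from scratch: k-fold cross-validation on the training data scores each candidate hyperparameter (here, polynomial degree, as an assumed stand-in for a real model's hyperparameters), and a separate test set would be held back until the very end.

```python
import numpy as np

# Hyperparameter selection via k-fold cross-validation.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1, 1, 60))
y = np.sin(2 * x) + rng.normal(0, 0.1, 60)

def kfold_cv_error(degree, k=5):
    """Mean validation MSE of a degree-`degree` polynomial over k folds."""
    idx = rng.permutation(len(x))            # shuffle, then split
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], degree)
        errs.append(((np.polyval(coefs, x[val]) - y[val]) ** 2).mean())
    return float(np.mean(errs))

candidates = [1, 3, 5, 9, 15]                # hyperparameter grid
cv_errors = {d: kfold_cv_error(d) for d in candidates}
best_degree = min(cv_errors, key=cv_errors.get)
```

Degree 1 underfits the sine curve and the highest degrees overfit each fold's training split, so cross-validation steers the choice toward an intermediate complexity.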
What makes a model simple? Do we know what is likely before we see data? Can we use this to make better models. Existing and new approaches for bringing in more knowledge to solve machine learning problems.
Transfer learning aims to improve learning outcomes for a target task by leveraging knowledge from a related source task. It does this by influencing the target task's assumptions based on what was learned from the source task. This can allow for faster and better generalized learning in the target task. However, there is a risk of negative transfer where performance decreases. To avoid this, methods examine task similarity and reject harmful source knowledge, or generate multiple mappings between source and target to identify the best match. The goal of transfer learning is to start higher, learn faster, and achieve better overall performance compared to learning the target task without transfer.
How to fine-tune and develop your own large language model.pptxKnoldus Inc.
In this session, we will what are large language models, how we can fin-tune a pre-trained LLM with our data, including data preparation, model training, model evaluation.
Super tickets in pre trained language modelsHyunKyu Jeon
This document discusses finding "super tickets" in pre-trained language models through pruning attention heads and feedforward layers. It shows that lightly pruning BERT models can improve generalization without degrading accuracy (phase transition phenomenon). The authors propose a new pruning approach for multi-task fine-tuning of language models called "ticket sharing" where pruned weights are shared across tasks. Experiments on GLUE benchmarks show their proposed super ticket and ticket sharing methods consistently outperform unpruned baselines, with more significant gains on smaller tasks. Analysis indicates pruning reduces model variance and some tasks share more task-specific knowledge than others.
Semi supervised learning machine learning made simpleDevansh16
Video: https://youtu.be/65RV3O4UR3w
Semi-Supervised Learning is a technique that combines the benefits of supervised learning (performance, intuitiveness) with the ability to use cheap unlabeled data (unsupervised learning). With all the cheap data available, Semi Supervised Learning will get bigger in the coming months. This episode of Machine Learning Made Simple will go into SSL, how it works, transduction vs induction, the assumptions SSL algorithms make, and how SSL compares to human learning.
About Machine Learning Made Simple:
Machine Learning Made Simple is a playlist that aims to break down complex Machine Learning and AI topics into digestible videos. With this playlist, you can dive head first into the world of ML implementation and/or research. Feel free to drop any feedback you might have down below.
Taking ML to production requires careful planning and oversight. Key steps include building an override system early for model improvements, establishing blind sets and benchmarks to evaluate new models, and continuously delivering model updates. It is also important to invest in crowdsourcing to acquire ground truth, calibrate models, standardize the model improvement process, and periodically check if the ground truth is still valid as reality changes. Traceability and the ability to reproduce results are critical as the prediction process grows more complex.
Learning visual representation without human labelKai-Wen Zhao
Self supervised learning (SSL) is one of the most fast-growing research topic in recent years. SSL provides algorithm that directly learn visual representation from data itself rather than human manual labels. From theoretical point of view, SSL explores information theory & the nature of large scale dataset.
Learning to discover monte carlo algorithm on spin ice manifoldKai-Wen Zhao
The global update Monte Carlo sampler can be discovered naturally by trained machine using policy gradient method on topologically constrained environment.
Toward Disentanglement through Understand ELBOKai-Wen Zhao
Disentangled representation is the holy grail for representation learning which factorizes human-understandable factors in unsupervised way what help us move forward to interpretable machine learning.
Deep Reinforcement Learning: Q-LearningKai-Wen Zhao
This slide reviews deep reinforcement learning, specially Q-Learning and its variants. We introduce Bellman operator and approximate it with deep neural network. Last but not least, we review the classical paper: DeepMind Atari Game beats human performance. Also, some tips of stabilizing DQN are included.
High Dimensional Data Visualization using t-SNEKai-Wen Zhao
Review of the t-SNE algorithm which helps visualizing the high dimensional data on manifold by projecting them onto 2D or 3D space with metric preserving.
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
State of Artificial intelligence Report 2023kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
End-to-end pipeline agility - Berlin Buzzwords 2024Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
The Ipsos - AI - Monitor 2024 Report.pdfSocial Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
3. Modern Learning Theory
● Bigger models tend to overfit
○ Bias-Variance trade-off
○ Weight Regularization
○ Augmentation
○ Dropout
○ BatchNorm
○ Early stopping
○ Data-dependent regularization (mixup, etc.)
○ ...
4. Modern Learning Theory
● Bigger models tend to overfit
● Bigger models are always better
Reconciling modern machine learning practice and the bias-variance trade-off
5. Modern Learning Theory
● Bigger models tend to overfit
● Bigger models are always better
● Bigger models are not good in some regimes
https://mltheory.org/deep.pdf
6. Modern Learning Theory
● Bigger models tend to overfit
● Bigger models are always better
● Bigger models are not good in some regimes
● Even more data can hurt!
https://mltheory.org/deep.pdf
7. TL;DR
- Model-wise double descent
  - There is a regime where bigger models are worse
- Sample-wise non-monotonicity
  - There is a regime where more samples hurt
- Epoch-wise double descent
  - There is a regime where training longer reverses overfitting
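Model-wise double descent is easy to reproduce even outside deep networks. The sketch below (toy data; every size and parameter choice here is a hypothetical illustration, not from the paper) fits minimum-norm least squares on random ReLU features. Test error typically peaks when the number of features is near the number of training samples (the interpolation threshold) and falls again for much wider models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: noisy linear target in 10 dimensions.
d, n_train, n_test = 10, 40, 500
w_true = rng.standard_normal(d)
X_tr = rng.standard_normal((n_train, d))
y_tr = X_tr @ w_true + 0.5 * rng.standard_normal(n_train)
X_te = rng.standard_normal((n_test, d))
y_te = X_te @ w_true

def random_feature_test_error(n_features: int) -> float:
    """Fit min-norm least squares on random ReLU features; return test MSE."""
    W = rng.standard_normal((d, n_features))
    phi_tr = np.maximum(X_tr @ W, 0.0)
    phi_te = np.maximum(X_te @ W, 0.0)
    # Pseudo-inverse gives the minimum-norm solution, which interpolates
    # the training data once n_features >= n_train.
    coef = np.linalg.pinv(phi_tr) @ y_tr
    return float(np.mean((phi_te @ coef - y_te) ** 2))

# Sweep model size across the interpolation threshold (n_train = 40).
errors = {p: random_feature_test_error(p) for p in (5, 20, 40, 80, 400)}
```

Plotting `errors` against the feature count usually shows the characteristic peak near `n_features ≈ n_train`, though a single random seed can be noisy.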
8. Generalization in Deep Learning Era
- Networks can fit "anything", even random noise
- Larger capacity than people imagined before
UNDERSTANDING DEEP LEARNING REQUIRES RETHINKING GENERALIZATION
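The "fit anything" observation can be reproduced at toy scale. A minimal sketch using scikit-learn (the sizes and hyperparameters are arbitrary choices for illustration): an over-parameterized MLP driven to near-perfect training accuracy on purely random labels, where there is nothing real to learn.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Random inputs with *random* binary labels: zero learnable signal.
X = rng.standard_normal((200, 20))
y = rng.integers(0, 2, size=200)

# An over-parameterized MLP (far more weights than samples) can still
# drive training error toward zero by memorizing the noise.
clf = MLPClassifier(hidden_layer_sizes=(256, 256), alpha=0.0,
                    max_iter=2000, tol=0.0, random_state=0)
clf.fit(X, y)
train_acc = clf.score(X, y)  # typically close to 1.0 despite meaningless labels
```

Held-out accuracy on fresh random data would of course stay near chance; the point is only that capacity suffices to memorize.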
9. Generalization in Deep Learning Era
- Over-parameterized networks still generalize well, attributed to implicit regularization
IN SEARCH OF THE REAL INDUCTIVE BIAS : ON THE ROLE OF IMPLICIT REGULARIZATION IN DEEP LEARNING
10. Generalization in Deep Learning Era
- Deep networks regularize themselves (have a better loss landscape)
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
11. Generalization in Deep Learning Era
SENSITIVITY AND GENERALIZATION IN NEURAL NETWORKS: AN EMPIRICAL STUDY
14. Model-wise double descent
- Model-wise double descent occurs across different architectures, datasets, optimizers, and training procedures
- Also appears in adversarial training
17. Epoch-wise double descent
Sufficiently large models can undergo a "double descent" behavior where test error first decreases, then increases near the interpolation threshold, and then decreases again.
Increasing the training time increases the Effective Model Complexity (EMC), and thus a sufficiently large model transitions from under- to over-parameterized over the course of training.
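Informally, the EMC of a training procedure is the largest number of samples on which that procedure reaches roughly zero training error. A hedged sketch of estimating it on a toy task (the task, the sample grid, the width, and the threshold are all made up for illustration):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def train_error(n_samples: int, width: int) -> float:
    """Training error of a fixed procedure on a toy noisy task."""
    X = rng.standard_normal((n_samples, 10))
    # Label = sign of first coordinate, corrupted with noise.
    y = (X[:, 0] + 0.3 * rng.standard_normal(n_samples) > 0).astype(int)
    clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=500,
                        random_state=0)
    clf.fit(X, y)
    return 1.0 - clf.score(X, y)

def effective_model_complexity(width: int, eps: float = 0.05,
                               grid=(25, 50, 100, 200)) -> int:
    """Largest n in the grid whose training error stays below eps."""
    emc = 0
    for n in grid:
        if train_error(n, width) <= eps:
            emc = n
    return emc
```

The point of the sketch is the definition, not the numbers: training longer (larger `max_iter`) or widening the network raises the largest `n` the procedure can fit, i.e. its EMC.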
18. Epoch-wise double descent
Conventional training is split into two phases:
1. In the first phase, the network learns a function with a small generalization gap
2. In the second phase, the network starts to overfit the data, leading to an increase in test error
This is not the complete picture:
- In some regimes, the test error decreases again and may reach a lower value at the end of training than at the first minimum
Reminiscent of:
- Information bottleneck
- Lottery ticket hypothesis
24. Conclusion
Take-home message:
Models behave unexpectedly in the transition regime
- Training longer reverses overfitting
  - Doubling the training schedule is an established trick in some tasks (e.g. object detection)
- Bigger models are worse
  - Whether the procedure can fit the training set is the indicator, formalized as Effective Model Complexity (EMC)
- More data hurts
  - sticky :(
- Generalization is still the Holy Grail of deep learning
  - It remains an open question (both experimentally and theoretically)
  - Connecting data complexity with model complexity is still difficult
  - NAS, in some sense, systematically attacks this problem
Know your data & model:
- Noise level (problem difficulty)
- Model capacity (fitting power)
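"Know your noise level" can be made slightly more operational. One hedged heuristic (a sketch, not a method from these slides): the cross-validated error of a strong, mostly tuning-free baseline approximates the irreducible error floor of the problem plus some model slack. Here the floor is known by construction because label noise is injected via `flip_y`; on real data, the baseline's CV error is the estimate.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy dataset with labels randomized for ~20% of samples (flip_y),
# so a rough noise floor is known in advance.
X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.2,
                           random_state=0)

# A strong, mostly tuning-free baseline; its cross-validated error
# approximates the irreducible noise level plus some slack.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv_error = 1.0 - cross_val_score(clf, X, y, cv=5).mean()
```

If a much larger model cannot beat this floor, the problem is noise-limited rather than capacity-limited, which is exactly the data/model distinction the slide is pointing at.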