We're in the age of toolbox Machine Learning. What should you know about using emerging technologies like pre-trained models, large language models, fine-tuning, and MLOps solutions to quickly and effectively build AI products?
Using Chef InSpec for Infrastructure Security, by Mandi Walls
This document provides an overview of Chef InSpec and how it can be used for infrastructure security assurance. Chef InSpec allows users to create tests for security and compliance related to infrastructure and then run those tests on systems locally or remotely. The document demonstrates how to use Chef InSpec to check for compliance with a security baseline, remediate any issues found using Chef infrastructure automation, and then re-check compliance.
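Real InSpec controls are written in a Ruby DSL, so the following is only a language-neutral sketch (in Python) of the check, remediate, re-check workflow the document walks through; the check names, settings, and remediation function are all invented for illustration.

```python
# Hypothetical sketch of the check -> remediate -> re-check workflow.
# Real InSpec controls use a Ruby DSL; every name here is invented.
state = {"PermitRootLogin": "yes", "Protocol": "2"}

checks = {
    "ssh-01 root login disabled": lambda: state["PermitRootLogin"] == "no",
    "ssh-02 protocol v2 only":    lambda: state["Protocol"] == "2",
}

def remediate(check_name):
    # Stand-in for Chef infrastructure automation fixing the failing setting.
    if "root login" in check_name:
        state["PermitRootLogin"] = "no"

def failing(checks):
    return [name for name, check in checks.items() if not check()]

before = failing(checks)   # initial compliance scan
for name in before:
    remediate(name)        # remediate each finding
after = failing(checks)    # re-check compliance
```

The point of the loop is that the same tests run before and after remediation, so compliance is demonstrated rather than assumed.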
Keep Calm And Serilog Elasticsearch Kibana on .NET Core - 132. Spotkanie WG.N... by Maciej Szymczyk
This document discusses logging best practices for complex applications using microservices and distributed systems. It recommends using a structured logging framework like Serilog that logs to Elasticsearch and Kibana for analysis. Demo examples are provided of setting up Serilog to log standard fields like timestamps, messages, user details and exceptions to Elasticsearch. Middleware is also discussed for automatically adding fields like trace IDs.
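The deck itself covers Serilog on .NET; as a rough Python analogue (the field names and the `trace_id` attribute are assumptions, not taken from the deck), structured logging of standard fields plus a middleware-attached trace ID can look like this:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log event, with an optional trace_id field."""
    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)

# Simulate middleware attaching a trace ID to a request's log record.
record = logging.LogRecord("app", logging.INFO, "app.py", 1,
                           "user %s logged in", ("alice",), None)
record.trace_id = "req-42"
line = JsonFormatter().format(record)
```

Because every event is a self-describing JSON object, a store like Elasticsearch can index each field for search and Kibana dashboards.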
This document provides an overview of GraphQL basics and how to use it with LeanIX. It describes GraphQL as a query language for APIs that allows clients to request specific data fields and relationships from an application's data model in a single request. The document demonstrates how to access the integrated GraphQL IDE from a LeanIX workspace and compile queries step-by-step using autocomplete and documentation. It shows how GraphQL enables more efficient and flexible data retrieval compared to REST APIs.
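The "specific fields in a single request" point is easiest to see in the request payload itself: a GraphQL call is one POST whose body carries a query naming exactly the fields wanted. The query below is a hypothetical example; the type and field names are assumptions, not taken from the LeanIX documentation.

```python
import json

# Hypothetical GraphQL query; type and field names are invented.
query = """
{
  allFactSheets(factSheetType: Application) {
    edges { node { name description } }
  }
}
"""

# The whole request is one JSON body with the query string inside it;
# the server returns only the fields the client asked for.
payload = json.dumps({"query": query})
```

Compare with REST, where fetching the same data often means several endpoint calls and over-fetching unused fields.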
This document discusses single page applications (SPAs) and provides an overview of what SPAs are and their advantages compared to traditional websites. It defines SPAs as web applications that fit on a single web page and provide a more desktop-like user experience through features like fluid page transitions without reloads. The document outlines how SPAs move more of the application logic to the client, fetch data on demand, and support features like back/forward buttons and offline use. Examples of SPAs include Gmail and merchant locators.
#UXPA2022 Tales from the Squad: Challenging concepts of how UX research works... by UXPA International
Conducting UX research in an agile environment can be challenging. You’re trying to break work down into specific chunks of time, when the nature of research itself can require multiple iterations. U.S. Bank is trying something different. We have removed our researchers from the sprint process for specific products and are now experimenting with a shared backlog of work among related products. The goal is to work on the highest priority work within a product portfolio while still maintaining expertise in the overall topic. Want to see if it’s working? Come and hear from Liz Martin, the manager who is trying to implement this approach and Rocio Werner who’s living it on a day-to-day basis!
Taking Splunk to the Next Level - Architecture Breakout Session, by Splunk
This document discusses strategies for scaling a Splunk deployment. It begins by describing how customers typically start with a single use case but then need to scale to handle more data and use cases. It then covers strategies for scaling the forwarding, indexing, search, and management components of Splunk. Key topics include load balancing forwarders, using indexer clustering for high availability, scaling search heads by clustering, and using the deployment server and distributed management console for centralized management. The document emphasizes planning storage capacity and I/O when scaling indexers and considering Splunk's application support when scaling search heads.
Responsive Web Design statistics show monthly increases in usage from June to August 2012. Global internet and mobile app usage is also projected to more than double by 2015. Responsive Web Design (RWD) provides an optimal viewing experience across different devices by adjusting layout and resizing content. Common RWD approaches include separate mobile websites, mobile-optimized websites, and responsive layouts with fluid widths. Layout types can be adaptive with fixed widths, responsive with fluid widths, or mixed. Breakpoints help define resolution changes. Wireframes and mobile-first design are recommended. Typography, percentages, testing across devices and resolutions, and examples of responsive sites are also discussed.
Composition is the most important mathematical idea of the 20th century. How does composition factor into Machine Learning? It's essential for building great products, systems, and teams. In this talk, I give practical suggestions for how to use this idea effectively.
This document discusses best practices for setting up development and test sets for machine learning models. It recommends that the dev and test sets:
1) Should reflect the actual data distribution you want your model to perform well on, rather than just being a random split of your training data.
2) Should come from the same data distribution. Having mismatched dev and test sets makes progress harder to measure.
3) The dev set should be large enough, typically thousands to tens of thousands of examples, to detect small performance differences as models are improved. The test set size depends on desired confidence in overall performance.
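The three recommendations can be sketched as a split that draws dev and test from the same shuffled pool, so the two sets share a distribution by construction (the sizes below are illustrative, not prescriptions from the document):

```python
import random

def make_dev_test_sets(examples, dev_size, test_size, seed=0):
    """Draw dev and test from the SAME shuffled pool so they share a distribution."""
    rng = random.Random(seed)
    pool = list(examples)
    rng.shuffle(pool)
    dev = pool[:dev_size]
    test = pool[dev_size:dev_size + test_size]
    return dev, test

# Illustrative sizes: a few thousand dev examples, enough to detect
# small performance differences between model iterations.
examples = [{"id": i} for i in range(10_000)]
dev, test = make_dev_test_sets(examples, dev_size=2_000, test_size=2_000)
```

Note the pool itself should be sampled from the distribution you want to perform well on; if production data differs from training data, build the pool from production-like examples rather than slicing the training set.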
Machine Learning has become a must to improve insight, quality and time to market. But it's also been called the 'high interest credit card of technical debt' with challenges in managing both how it's applied and how its results are consumed.
Workshop - The Little Pattern That Could.pdf by Tobias Goeschel
The document discusses refactoring a monolithic application to follow Domain-Driven Design (DDD) and microservice principles. It provides exercises and hints to guide refactoring the codebase to use Hexagonal Architecture with separated domains, commands and queries using CQRS, and persistence-oriented repositories. Later exercises discuss improving test speed by isolating dependencies and refactoring for a serverless architecture by splitting the application into individual use cases and replacing the in-memory repository.
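One way to picture the command/query split with a persistence-oriented repository is the minimal sketch below. This is an invented example of the pattern, not the workshop's actual codebase; the in-memory repository is exactly the kind of dependency the later exercises replace.

```python
class InMemoryOrderRepository:
    """Persistence-oriented repository; swappable for a database-backed one."""
    def __init__(self):
        self._orders = {}

    def save(self, order_id, order):
        self._orders[order_id] = order

    def get(self, order_id):
        return self._orders.get(order_id)

class PlaceOrder:
    """Command side (CQRS): mutates state, returns nothing."""
    def __init__(self, repo):
        self._repo = repo

    def handle(self, order_id, items):
        self._repo.save(order_id, {"items": items, "status": "placed"})

class GetOrder:
    """Query side (CQRS): reads state, never mutates it."""
    def __init__(self, repo):
        self._repo = repo

    def handle(self, order_id):
        return self._repo.get(order_id)

repo = InMemoryOrderRepository()
PlaceOrder(repo).handle("o-1", ["book"])
order = GetOrder(repo).handle("o-1")
```

Because the domain handlers depend only on the repository interface, each use case can later be deployed as its own serverless function with a different persistence adapter.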
This document summarizes the agenda and key topics from a 4-day course on data science for finance. Day 4 focuses on deploying machine learning models in production and providing a recap of the overall course. The presentation discusses challenges in moving models from prototypes to production, and introduces QuSandbox as a platform for adopting data science and AI in enterprises. QuSandbox provides tools for model management, experimentation and deployment through a user portal and APIs.
A workshop to demonstrate how we can apply agile and continuous delivery principles to continuously deliver value in machine learning and data science projects.
Code: https://github.com/davified/ci-workshop-app
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat... by Alok Singh
Alok Singh is a Principal Engineer at IBM CODAIT who has built multiple analytical frameworks and machine learning algorithms. The presentation provides an overview of building predictive models for imbalanced datasets using scikit-learn and XGBoost. It discusses challenges with imbalanced data, evaluation metrics like confusion matrix and ROC curves, and techniques for imbalanced learning including weighted classes, oversampling minorities and undersampling majorities, and SMOTE. The presentation concludes with a hands-on tutorial demonstrating these techniques on an imbalanced bank marketing dataset.
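For the binary case, the weighted-classes idea commonly reduces to setting XGBoost's `scale_pos_weight` parameter to the negative-to-positive ratio. The sketch below shows that calculation only; it is a generic illustration, not the presenter's code.

```python
from collections import Counter

def scale_pos_weight(labels):
    """Negative-to-positive ratio: the usual starting value for XGBoost's
    scale_pos_weight on an imbalanced binary problem."""
    counts = Counter(labels)
    return counts[0] / counts[1]

# 90% negatives, 10% positives -> weight positive examples 9x
labels = [0] * 900 + [1] * 100
weight = scale_pos_weight(labels)
```

The same ratio logic underlies per-class weights in scikit-learn (`class_weight`); resampling approaches like SMOTE instead change the data rather than the loss.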
The document discusses how test-driven development (TDD) can lead to better functional design. It explains that TDD focuses on defining requirements before implementation, limits test and code scope to reduce complexity, and makes dependencies explicit upfront. This helps produce code that is loosely coupled, simple to understand and maintain, and easy to test. The document recommends starting with examples or tests, keeping components small with single responsibilities, and designing dependent parts first.
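The test-first loop the document describes can be illustrated in miniature; the function and its spec below are invented for the example, but show how the test defines the requirement before any implementation exists.

```python
# 1. The test is written first and pins down the requirement:
#    "12.50 EUR" should parse into an amount and a currency code.
def test_parse_amount():
    assert parse_amount("12.50 EUR") == (12.5, "EUR")

# 2. Only then is the simplest implementation written to make it pass.
def parse_amount(text):
    value, currency = text.split()
    return float(value), currency

test_parse_amount()
```

Because the test fixes the interface up front, `parse_amount` stays small, has a single responsibility, and carries no speculative parameters.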
This document discusses challenges in developing computer vision software. It explores how the wrong programming model can fail, such as assuming images fit in memory or that pixels can be represented with 8-bit values. Numerical issues are also discussed, like how floating point arithmetic lacks precision. Examples show how simple operations like image differencing, convolution, and calculating standard deviation can have hidden problems. Overall, the document advocates being suspicious of software results and addresses common issues that can cause vision algorithms to go wrong.
Accelerating Data Science through Feature Platform, Transformers, and GenAIFeatureByte
In this presentation, data science expert and Kaggle grandmaster Xavier Conort defines feature engineering and identifies the pain points that can be addressed by feature platforms. He dives into the features he tends to extract from transactional data and addresses how the magic of transformers and generative AI help produce those features and make them more transparent.
Xavier Conort is currently Co-Founder and CPO of FeatureByte, an AI-based self-service feature platform that helps data scientists turn raw data into fully governed AI pipelines in minutes, at 1/5th the cost.
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018, by Sri Ambati
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is building fair, accountable, and transparent machine learning systems is possible. The bad news is it’s harder than many blogs and software package docs would have you believe. The truth is nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining viable techniques for debugging, explaining, and testing machine learning models.
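One widely used approximate-explanation technique in this space (an example of the genre, not necessarily one the talk covers) is permutation importance: shuffle one feature and measure how much model accuracy drops. A minimal sketch:

```python
import random

def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, col, trials=5, seed=0):
    """Average drop in accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(y, [predict(row) for row in X])
    drops = []
    for _ in range(trials):
        shuffled = [row[:] for row in X]
        values = [row[col] for row in shuffled]
        rng.shuffle(values)
        for row, v in zip(shuffled, values):
            row[col] = v
        drops.append(base - accuracy(y, [predict(row) for row in shuffled]))
    return sum(drops) / trials

# Toy model: the prediction depends only on feature 0; feature 1 is noise.
X = [[i % 2, i % 3] for i in range(60)]
y = [row[0] for row in X]
predict = lambda row: row[0]
imp0 = permutation_importance(predict, X, y, col=0)
imp1 = permutation_importance(predict, X, y, col=1)
```

Note how approximate the result is: it measures only the model's sensitivity to shuffling, which is exactly the kind of caveat the talk warns about.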
Mateusz is a software developer who loves all things distributed and machine learning, and hates buzzwords. His favourite hobby is data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at L'ECE Paris in France and worked on distributed flight booking systems. After graduation he moved to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still based.
Webinar by Igor Kolosov, Automation/Performance Architect, Consultant at GlobalLogic, Kharkiv
Fast and effective analysis of architecture diagrams:
Black-box is not a panacea
Pitfalls of chaotic approach
From chaos to process
How to speed up architecture diagram analysis?
Collecting valuable inputs
Workshop with examples
Reviewing progress in the machine learning certification journey
Special Addition - Short tech talk on How to Network by Qingyue (Annie) Wang
Content review on AI and ML on Google Cloud by Margaret Maynard-Reid
A focused content review on ML problem framing, model evaluation, and fairness by Sowndarya Venkateswaran.
A discussion on sample questions to aid certification exam preparation.
An interactive Q&A session to clarify doubts and questions.
Previewing next steps and topics, including course completions and material reviews.
Covering topics like:
CI/CD, DevOps, Jenkins, TFS, TeamCity, Compile, Test, Package, Deploy
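The compile, test, package, deploy sequence that tools like Jenkins, TFS, and TeamCity orchestrate can be sketched as a fail-fast stage runner; the stage names come from the list above, but the runner itself is an invented illustration.

```python
STAGES = ["compile", "test", "package", "deploy"]

def run_pipeline(runners):
    """Run stages in order; stop at, and report, the first failure."""
    for stage in STAGES:
        if not runners[stage]():
            return stage
    return None

# Toy runners: everything passes except the test stage,
# so packaging and deployment never run.
runners = {stage: (lambda: True) for stage in STAGES}
runners["test"] = lambda: False
failed_at = run_pipeline(runners)
```

Stopping at the first failing stage is the core CI discipline: a broken build never reaches packaging or deployment.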
See Disclaimer in the last slide and/or in file comments, if available.
The document provides guidelines for an annotated bibliography assignment aimed at increasing nursing students' knowledge of leadership in nursing practice. Students will select five nurse leaders to research and write one-page summaries for each leader. Each summary must include the leader's roles and responsibilities, accomplishments, barriers to achieving goals, and knowledge gained from reading about the leader. The assignment will help prepare students for a poster presentation on nursing leadership.
This document summarizes 10 ways to improve code based on a presentation by Neal Ford. The techniques discussed include composing methods to perform single tasks, test-driven development to design through tests, using static analysis tools to find bugs, avoiding singletons, applying the YAGNI principle to only build what is needed, questioning conventions, embracing polyglot programming, learning Java nuances, enforcing the single level of abstraction principle, and considering "anti-objects" that go against object-oriented design. Questions from the audience are then addressed.
Notes on Deploying Machine-learning Models at Scale, by Deep Kayal
While modeling techniques in machine learning have matured drastically, the deployment of models at scale has been overlooked. These are some learnings that I've had over the years, that I presented at Cognizant in Amsterdam.
Tuning the Untunable - Insights on Deep Learning Optimization, by SigOpt
This document discusses techniques for optimizing deep learning models, including hyperparameter optimization. It describes SigOpt's approach which uses software to automate repeatable tasks like training orchestration and model tuning. Experts can then focus on data science tasks. SigOpt utilizes techniques like Bayesian optimization, multitask optimization, and infrastructure orchestration to improve model performance while reducing costs and tuning time.
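As a much simpler stand-in for the Bayesian optimization SigOpt applies, a random-search tuner shows the shape of the tuning loop; the search space and objective below are invented for the example.

```python
import random

def random_search(objective, space, n_trials=25, seed=0):
    """Sample configurations uniformly and keep the best (lowest) objective.
    Bayesian optimization replaces the uniform sampling with a model-guided
    choice of the next configuration to try."""
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# Invented objective: pretend validation loss is minimized at lr = 0.1.
space = {"lr": (0.0001, 1.0)}
cfg, loss = random_search(lambda c: (c["lr"] - 0.1) ** 2, space)
```

The economic argument in the deck is that model-guided sampling reaches a good configuration in far fewer (expensive) training runs than uniform sampling does.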
The presentation discussed managing experiments and feature flags across Optimizely and a software application. It began with an experimentation maturity curve showing increasing levels of experimentation from executional to a culture of experimentation. Examples were given of how Optimizely was used at different levels from managing datafiles to consolidating projects and increasing automated testing. Takeaways included passing datafiles between front-end and back-end for performance, caching datafiles in memcache, and improving quality through easy user testing and automated tests.
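The datafile-caching takeaway can be sketched as a small TTL cache in front of the fetch; here an in-process dict stands in for memcache, and all names are invented for illustration.

```python
import time

class DatafileCache:
    """Serve a cached datafile until the TTL expires, then re-fetch."""
    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value

calls = []
def fetch():
    calls.append(1)
    return {"version": len(calls)}

fake_now = [0.0]
cache = DatafileCache(fetch, ttl_seconds=300, clock=lambda: fake_now[0])
first = cache.get()    # cache empty: fetches
second = cache.get()   # within TTL: served from cache
fake_now[0] = 301.0
third = cache.get()    # TTL expired: fetches again
```

Injecting the clock keeps the expiry logic testable without real waiting, which is the same "easy automated testing" concern the presentation raises.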
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker, by Provectus
Looking to implement MLOps using AWS services and Kubeflow? Come and learn about machine learning from the experts of Provectus and Amazon Web Services (AWS)!
Businesses recognize that machine learning projects are important, but they go beyond just building and deploying models, which is where most organizations stop. Successful ML projects entail a complete lifecycle involving ML, DevOps, and data engineering, and are built on top of ML infrastructure.
AWS and Amazon SageMaker provide a foundation for building infrastructure for machine learning while Kubeflow is a great open source project, which is not given enough credit in the AWS community. In this webinar, we show how to design and build an end-to-end ML infrastructure on AWS.
Agenda
- Introductions
- Case Study: GoCheck Kids
- Overview of AWS Infrastructure for Machine Learning
- Provectus ML Infrastructure on AWS
- Experimentation
- MLOps
- Feature Store
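Of the agenda items above, the feature store is the most concrete to sketch: at its core it is a store of per-entity feature values that serves the same feature vector for training and for inference. The entity and feature names below are invented; a production store like the one discussed adds versioning, point-in-time retrieval, and governance.

```python
class FeatureStore:
    """Minimal feature store: features keyed by (entity_id, feature_name)."""
    def __init__(self):
        self._features = {}

    def put(self, entity_id, name, value):
        self._features[(entity_id, name)] = value

    def get_vector(self, entity_id, names):
        """Assemble the same feature vector for training and for serving."""
        return [self._features.get((entity_id, n)) for n in names]

store = FeatureStore()
store.put("user-1", "avg_order_value", 42.0)
store.put("user-1", "orders_last_30d", 3)
vector = store.get_vector("user-1", ["avg_order_value", "orders_last_30d"])
```

Serving training and inference from one store is what makes the pipeline reproducible: both sides see identical feature definitions and values.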
Intended Audience
Technology executives & decision makers, manager-level tech roles, data engineers & data scientists, ML practitioners & ML engineers, and developers
Presenters
- Stepan Pushkarev, Chief Technology Officer, Provectus
- Qingwei Li, ML Specialist Solutions Architect, AWS
Feel free to share this presentation with your colleagues and don't hesitate to reach out to us at info@provectus.com if you have any questions!
REQUEST WEBINAR: https://provectus.com/webinar-mlops-and-reproducible-ml-on-aws-with-kubeflow-and-sagemaker-aug-2020/
HCL Notes and Domino License Cost Reduction in the World of DLAU, by panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar, with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! by SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Composition is the most important mathematical idea of the 20th century. How does composition factor into Machine Learning? It's essential for building great products, systems, and teams. In this talk, I give practical suggestions for how to use this idea effectively.
This document discusses best practices for setting up development and test sets for machine learning models. It recommends that the dev and test sets:
1) Should reflect the actual data distribution you want your model to perform well on, rather than just being a random split of your training data.
2) Should come from the same data distribution. Having mismatched dev and test sets makes progress harder to measure.
3) The dev set should be large enough, typically thousands to tens of thousands of examples, to detect small performance differences as models are improved. The test set size depends on desired confidence in overall performance.
Machine Learning has become a must to improve insight, quality and time to market. But it's also been called the 'high interest credit card of technical debt' with challenges in managing both how it's applied and how its results are consumed.
Workshop - The Little Pattern That Could.pdfTobiasGoeschel
The document discusses refactoring a monolithic application to follow Domain-Driven Design (DDD) and microservice principles. It provides exercises and hints to guide refactoring the codebase to use Hexagonal Architecture with separated domains, commands and queries using CQRS, and persistence-oriented repositories. Later exercises discuss improving test speed by isolating dependencies and refactoring for a serverless architecture by splitting the application into individual use cases and replacing the in-memory repository.
This document summarizes the agenda and key topics from a 4-day course on data science for finance. Day 4 focuses on deploying machine learning models in production and providing a recap of the overall course. The presentation discusses challenges in moving models from prototypes to production, and introduces QuSandbox as a platform for adopting data science and AI in enterprises. QuSandbox provides tools for model management, experimentation and deployment through a user portal and APIs.
A workshop to demonstrate how we can apply agile and continuous delivery principles to continuously deliver value in machine learning and data science projects.
Code: https://github.com/davified/ci-workshop-app
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Alok Singh
Alok Singh is a Principal Engineer at IBM CODAIT who has built multiple analytical frameworks and machine learning algorithms. The presentation provides an overview of building predictive models for imbalanced datasets using scikit-learn and XGBoost. It discusses challenges with imbalanced data, evaluation metrics like confusion matrix and ROC curves, and techniques for imbalanced learning including weighted classes, oversampling minorities and undersampling majorities, and SMOTE. The presentation concludes with a hands-on tutorial demonstrating these techniques on an imbalanced bank marketing dataset.
The document discusses how test-driven development (TDD) can lead to better functional design. It explains that TDD focuses on defining requirements before implementation, limits test and code scope to reduce complexity, and makes dependencies explicit upfront. This helps produce code that is loosely coupled, simple to understand and maintain, and easy to test. The document recommends starting with examples or tests, keeping components small with single responsibilities, and designing dependent parts first.
This document discusses challenges in developing computer vision software. It explores how the wrong programming model can fail, such as assuming images fit in memory or that pixels can be represented with 8-bit values. Numerical issues are also discussed, like how floating point arithmetic lacks precision. Examples show how simple operations like image differencing, convolution, and calculating standard deviation can have hidden problems. Overall, the document advocates being suspicious of software results and addresses common issues that can cause vision algorithms to go wrong.
Accelerating Data Science through Feature Platform, Transformers, and GenAIFeatureByte
In this presentation, data science expert and Kaggle grandmaster Xavier Conort defines feature engineering and identifies the pain points that can be addressed by feature platforms. He dives into the features he tends to extract from transactional data and addresses how the magic of transformers and generative AI help produce those features and make them more transparent.
Xavier Conort is currently Co-Founder and CPO of FeatureByte, an AI-based self-service feature platform that helps data scientists turn raw data into fully governed AI pipelines in minutes, at 1/5th the cost.
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Sri Ambati
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is building fair, accountable, and transparent machine learning systems is possible. The bad news is it’s harder than many blogs and software package docs would have you believe. The truth is nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining the following viable techniques for debugging, explaining, and testing machine learning models
Mateusz is a software developer who loves all things distributed, machine learning and hates buzzwords. His favourite hobby data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at L’ECE Paris in France and worked on distributed flight booking systems. After graduation he move to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still currently based.
Webinar by Igor Kolosov, Automation/Performance Architect, Consultant at GlobalLogic, Kharkiv
Fast and effective analysis of architecture diagrams:
Black-box is not a panacea
Pitfalls of chaotic approach
From chaos to process
How to speed up architecture diagram analysis?
Collecting valuable inputs
Workshop with examples
Reviewing progress in the machine learning certification journey
𝗦𝗽𝗲𝗰𝗶𝗮𝗹 𝗔𝗱𝗱𝗶𝘁𝗶𝗼𝗻 - Short tech talk on How to Network by Qingyue(Annie) Wang
C𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 AI and ML on Google Cloud by Margaret Maynard-Reid
𝗔 𝗳𝗼𝗰𝘂𝘀𝗲𝗱 𝗰𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 𝗠𝗟 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝗳𝗿𝗮𝗺𝗶𝗻𝗴, 𝗺𝗼𝗱𝗲𝗹 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝗳𝗮𝗶𝗿𝗻𝗲𝘀𝘀 by Sowndarya Venkateswaran.
A discussion on sample questions to aid certification exam preparation.
An interactive Q&A session to clarify doubts and questions.
Previewing next steps and topics, including course completions and material reviews.
Covering topics like:
CI CD DevOps Jenkins TFS TeamCity Compile Test Package Delpoy
See Disclaimer in the last slide and/or in file comments, if available.
The document provides guidelines for an annotated bibliography assignment aimed at increasing nursing students' knowledge of leadership in nursing practice. Students will select five nurse leaders to research and write one-page summaries for each leader. Each summary must include the leader's roles and responsibilities, accomplishments, barriers to achieving goals, and knowledge gained from reading about the leader. The assignment will help prepare students for a poster presentation on nursing leadership.
This document summarizes 10 ways to improve code based on a presentation by Neal Ford. The techniques discussed include composing methods to perform single tasks, test-driven development to design through tests, using static analysis tools to find bugs, avoiding singletons, applying the YAGNI principle to only build what is needed, questioning conventions, embracing polyglot programming, learning Java nuances, enforcing the single level of abstraction principle, and considering "anti-objects" that go against object-oriented design. Questions from the audience are then addressed.
Notes on Deploying Machine-learning Models at ScaleDeep Kayal
While modeling techniques in machine learning have matured drastically, the deployment of models at scale has been overlooked. These are some learnings that I've had over the years, that I presented at Cognizant in Amsterdam.
Tuning the Untunable - Insights on Deep Learning OptimizationSigOpt
This document discusses techniques for optimizing deep learning models, including hyperparameter optimization. It describes SigOpt's approach which uses software to automate repeatable tasks like training orchestration and model tuning. Experts can then focus on data science tasks. SigOpt utilizes techniques like Bayesian optimization, multitask optimization, and infrastructure orchestration to improve model performance while reducing costs and tuning time.
The presentation discussed managing experiments and feature flags across Optimizely and a software application. It began with an experimentation maturity curve showing increasing levels of experimentation from executional to a culture of experimentation. Examples were given of how Optimizely was used at different levels from managing datafiles to consolidating projects and increasing automated testing. Takeaways included passing datafiles between front-end and back-end for performance, caching datafiles in memcache, and improving quality through easy user testing and automated tests.
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerProvectus
Looking to implement MLOps using AWS services and Kubeflow? Come and learn about machine learning from the experts of Provectus and Amazon Web Services (AWS)!
Businesses recognize that machine learning projects are important, but these projects go beyond just building and deploying models, which is where most organizations stop. Successful ML projects entail a complete lifecycle involving ML, DevOps, and data engineering, and are built on top of ML infrastructure.
AWS and Amazon SageMaker provide a foundation for building infrastructure for machine learning while Kubeflow is a great open source project, which is not given enough credit in the AWS community. In this webinar, we show how to design and build an end-to-end ML infrastructure on AWS.
Agenda
- Introductions
- Case Study: GoCheck Kids
- Overview of AWS Infrastructure for Machine Learning
- Provectus ML Infrastructure on AWS
- Experimentation
- MLOps
- Feature Store
Intended Audience
Technology executives & decision makers, manager-level tech roles, data engineers & data scientists, ML practitioners & ML engineers, and developers
Presenters
- Stepan Pushkarev, Chief Technology Officer, Provectus
- Qingwei Li, ML Specialist Solutions Architect, AWS
Feel free to share this presentation with your colleagues and don't hesitate to reach out to us at info@provectus.com if you have any questions!
REQUEST WEBINAR: https://provectus.com/webinar-mlops-and-reproducible-ml-on-aws-with-kubeflow-and-sagemaker-aug-2020/
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is a widely used ETL tool for processing, indexing, and ingesting data into the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The climate impact and sustainability of software testing are discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
ODSC West 2022 – Kitbashing in ML
1. Kit-bashing in ML:
The age of toolbox Machine Learning
ODSC West - Nov 3, 2022
Dr. Bryan Bischof
– Head of Data Science @ Weights and Biases –
In collaboration with Dr. Eric Bunch & Ashraf Shaik
Email: bryan.bischof@gmail.com
3. Definition
Kit-bashing, or model-bashing, is taking parts of kits to create a new kit. It can be to increase the complexity of the model (greebling), or to build the model into a new form expressively and quickly.

Two key aspects of kit-bashing:
- Rapidity
- Nuance
c.f. Kitbashing Experience, 2021, Kitbashing in the digital age, 2020
4. Mere aggregation
These sculptural bricolages often encode meaning and context in the relationship between components, demonstrating a striking example of Aristotle's "something besides its parts."
c.f. Nathalie Miebach
5. In Machine Learning
Let's return to our friend, compositionality: the meaning of a whole is determined by the meanings of its constituent parts and the rules for how those parts are combined.

Is kit-bashing just composition? No.
c.f. Composition in ML, ODSC West, 2021, Fong, Spivak, 2018
7. What can we do with this analogy?
"The art of the possible"
🤝
"How far can we go on our current gas tank"

We are starting to move towards a paradigm of Machine Learning products where understanding the existing resources and clever ways to combine them outstrips the ability to build from scratch.
8. Universal Foundations
The Center for Research on Foundation Models (CRFM) group argues in this book for the power of Foundation Models via homogenization, in other words, the trend of disparate domains toward the same information learning structures.

This is a risk because it restricts the breadth of approaches and the diversity of progress. This is an opportunity because of the speed of iteration and the combinatorial explosion of ways to combine things.
9. So how does one kit-bash?
1. Track and monitor your experiments
2. Build reproducible pipelines
3. Establish hard contracts
4. Automate integrations
5. Experiment like hell
6. Make the tent bigger
10. 1. Track and monitor your experiments

You can and should track more of your experiments. Are you tracking your feature selection? How about your regularization efforts? Are you tracking your ensembling? Did you try fine-tuning? How did you compare to the base model?...
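The tracking discipline above can be sketched with a minimal, stdlib-only stand-in for an experiment tracker like W&B; the `RunTracker` class and its JSONL run-log format are hypothetical names chosen for illustration, not any real tracker's API:

```python
import json
import time
import uuid
from pathlib import Path

class RunTracker:
    """Minimal stand-in for an experiment tracker: one JSONL file per run."""

    def __init__(self, config, log_dir="runs"):
        self.run_id = uuid.uuid4().hex[:8]
        self.path = Path(log_dir) / f"{self.run_id}.jsonl"
        self.path.parent.mkdir(exist_ok=True)
        # Log the config up front: features, regularization, base model, etc.
        self._write({"event": "config", **config})

    def _write(self, record):
        record["ts"] = time.time()
        with self.path.open("a") as f:
            f.write(json.dumps(record) + "\n")

    def log(self, metrics, step=None):
        self._write({"event": "metrics", "step": step, **metrics})

# Track not just the final model, but every choice along the way:
run = RunTracker({"features": ["age", "tenure"], "l2": 0.01, "base_model": "distilbert"})
run.log({"val_auc": 0.81}, step=1)
run.log({"val_auc": 0.84}, step=2)
```

Because feature selection and regularization settings live in the config record, a fine-tuned variant can be compared against its base model by diffing two run files.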
11. 2. Build reproducible pipelines

If you can't tie models and results back to training and evaluation data, what are you doing? When you build a pipeline, each run should produce a version of the assets along the way; in many Data Science domains this isn't optional.
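One simple way to version the assets of each run is to content-hash every artifact and record the hashes in a per-run manifest. This is a sketch under assumed names (`fingerprint`, `record_lineage`, `lineage.json`), not a specific pipeline tool's API:

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path):
    """Content hash of a pipeline asset, so results can be tied back to exact inputs."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()[:12]

def record_lineage(run_id, assets, out="lineage.json"):
    """Write a run manifest mapping every asset (data, model, eval set) to its hash.

    `assets` is a dict of asset name -> file path. Two runs with the same
    manifest consumed byte-identical inputs.
    """
    manifest = {
        "run_id": run_id,
        "assets": {name: fingerprint(p) for name, p in assets.items()},
    }
    Path(out).write_text(json.dumps(manifest, indent=2))
    return manifest
```

A model whose manifest is stored next to its weights can always be traced back to the training and evaluation data that produced it.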
12. 3. Establish hard contracts

Hard contracts and strong composition: you don't have to buy the farm and go fully functional, but you should expect that the components in your system have input and output types. You should be able to pass parameter updates between them, and ideally train them jointly.
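A minimal sketch of such typed contracts, using Python's `typing.Protocol`; the component names (`Encoder`, `Ranker`) and types are hypothetical, chosen to illustrate the idea:

```python
from dataclasses import dataclass
from typing import Protocol, Sequence

@dataclass
class Embedding:
    vector: Sequence[float]

@dataclass
class Scores:
    item_ids: Sequence[str]
    values: Sequence[float]

class Encoder(Protocol):
    def __call__(self, text: str) -> Embedding: ...

class Ranker(Protocol):
    def __call__(self, query: Embedding) -> Scores: ...

def pipeline(encode: Encoder, rank: Ranker, text: str) -> Scores:
    # The contract guarantees the output of one stage is a valid input
    # to the next, so components can be swapped without rewiring.
    return rank(encode(text))
```

Any encoder that produces an `Embedding` composes with any ranker that consumes one; that is what makes swapping kit parts cheap.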
13. 4. Automate integrations

Trying a new model architecture should be plug-and-play. Iteration speed is going to be extremely limited if you're not automating integration; validation on blessed hold-outs and worst-case tests should ✨ just happen ✨.
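One way to make that automatic is a validation gate every candidate must pass before promotion. The function name and thresholds below are illustrative assumptions; `predict` can be any callable, so new architectures plug in with no harness changes:

```python
def validation_gate(predict, holdout, worst_cases,
                    min_holdout_acc=0.80, min_worst_acc=0.95):
    """Run a candidate model through the same blessed hold-out and worst-case tests.

    `holdout` and `worst_cases` are lists of (input, expected_label) pairs.
    Returns the metrics plus a promote/reject decision.
    """
    def accuracy(examples):
        return sum(predict(x) == y for x, y in examples) / len(examples)

    holdout_acc = accuracy(holdout)
    worst_acc = accuracy(worst_cases)
    promoted = holdout_acc >= min_holdout_acc and worst_acc >= min_worst_acc
    return {"holdout_acc": holdout_acc, "worst_case_acc": worst_acc,
            "promoted": promoted}
```

Wired into CI (e.g. triggered on every model artifact upload), this is how hold-out validation "just happens" instead of depending on someone remembering to run it.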
14. 5. Experiment like hell

Seriously, just try shit. That's the beauty of the previous steps! New model types, new data sources, maybe a side task? Label some additional data out of distribution? Wanna include a new feature pipeline? These are the salad days.
15. 6. Make the tent bigger

Internal task challenges: the more people who can try their hand at solving these problems, the higher the likelihood someone will. Make it easy to get started, even for people with less ML experience. Some people will come with ideas on other aspects of the problem; welcome them too!
20. FC RecSys Deep Dive: Trigger-based validation

Based on triggers that a new data artifact is up:
● Use hand-labeled tags for ground truth.
● Fit a multiclass, multilabel KNN classifier to choose the initial document embedding for the RecSys.
○ This mimics the behavior of the vector similarity-search architecture chosen for the RecSys in prod.
● Rank by multi-objective loss on these classifiers; automatically promote.
● Generate UMAP projections of vectors.
● Send the promoted latent space over for user-feedback training.
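The multilabel KNN step can be sketched as follows. This is a deliberately tiny brute-force version (in production one would use a vector database or scikit-learn's `KNeighborsClassifier` with a multilabel indicator target); the function name and the majority-vote rule are illustrative assumptions:

```python
import math
from collections import Counter

def knn_multilabel_predict(train_vecs, train_tags, query, k=3):
    """Brute-force multilabel KNN: a tag is predicted when a strict majority
    of the k nearest hand-labeled neighbors carry it. This mimics the
    vector similarity search used for the RecSys in prod.

    train_vecs: list of embedding vectors (tuples/lists of floats)
    train_tags: list of tag sets, aligned with train_vecs
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(range(len(train_vecs)),
                     key=lambda i: dist(train_vecs[i], query))[:k]
    votes = Counter(tag for i in nearest for tag in train_tags[i])
    return {tag for tag, n in votes.items() if n > k / 2}
```

Running this classifier against each candidate embedding space, then ranking the spaces by a multi-objective loss, gives the automatic promotion signal the slide describes.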
23. Key components of deployment

1. GitHub Actions trigger a deploy to the designated infrastructure.
2. A secure perimeter to restrict API access from outside and mitigate the risk of data exfiltration.
3. Prediction service on Cloud Run.
4. Streamlit UI app for debugging; internal service integration for the frontend.
24. Production Inference (GCP + W&B)

● FastAPI - HTTP inference endpoint route.
● Build the serving image as a Docker container, pushed to GCR (Google Container Registry).
● Cloud Run - attach compute by selecting an instance type and a traffic-based autoscaling configuration.
● Optionally, an internal service that talks to the inference endpoint, e.g. a Streamlit app on Cloud Run or your main app service.
● Secure with VPC Service Controls:
○ Allows configuring secure perimeter rules that restrict API access from outside and mitigate the risk of data exfiltration.
○ Set up a Serverless VPC Access connector.
○ Set up an ingress policy that allows internal and Cloud Load Balancing requests into the service.
○ Set up the egress policy as Allow all, so all traffic goes through the VPC firewall.
● Set up API Gateway.
● Streamlit service requests to the inference service have to be authorized via an ID token in the request.
● GitHub Actions - CI/CD for both model training on Dagster and the FastAPI service on Cloud Run.
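The ID-token step between the internal Streamlit service and the inference endpoint has the shape sketched below. This is only an illustration of the request format: a real Cloud Run deployment must verify the token's signature against Google's public keys (e.g. with the google-auth library); here we only extract the unverified claims, and the function name is a hypothetical one:

```python
import base64
import json

def extract_bearer_claims(headers):
    """Illustrative shape of an ID-token check on an inference endpoint.

    Expects an `Authorization: Bearer <jwt>` header and returns the decoded
    claims. NOTE: this does NOT verify the signature; production code must
    validate the token cryptographically before trusting any claim.
    """
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise PermissionError("missing ID token")
    token = auth.split(" ", 1)[1]
    payload_b64 = token.split(".")[1]          # JWT payload segment
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

With this gate in front of the prediction route, only callers that can mint a valid ID token for the service (which, inside the perimeter, means the Streamlit app or the main app service) can reach the model.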
25. Steal this look

This isn't my first time kit-bashing… If you want to map arbitrary UGC to personalized recommendations, grab some kits and some super glue!
26. What's available?

We already had a Match-score model that we used for core recommendations. And we had a computer vision model for featurization. So we used the CV model to do item-similarity, and fine-tuned the Match-score model to the new recs.
29. Things can go wrong

1. Make sure that your experimentation practice is strong: global holdouts, published validation artifacts, random re-testing, and no peeking can decrease the risk of bridges to nowhere (long-term type 1 error).
2. There should be a team who ultimately signs off on all experimental design.
3. Weak composition should be avoided or strengthened to strong composition; make the contracts hard, and always try joint learning.
4. Experiment documentation for ML experiments and development is crucial to avoid duplication or mis-reporting of results.
5. Make comparison to (internal) SOTA easy, otherwise people won't bother.
6. Partnership should be upheld as a core value, and preferred to 🍠 YAM (yet another model) developed independently.
31. Ya, ya, I work at a tools company

We're building the tools to make all of this easy:
- The tools for reproducible pipelines and experiment tracking
- Sharing your results with teammates
- Logging tables, artifacts, and models to the registry
- Automated job triggering with retraining and parameter exploration

But most excitingly, an ecosystem of arbitrary Machine Learning functions coming next year. Hope you've still got room in your garage for a new toolbox.
32. Thanks!

I'm Bryan Bischof, find me on Twitter @bebischof. Look out for my forthcoming book from O'Reilly next year about recommendation systems.
33. Thanks!

Check out W&B's composable tools at wandb.ai. Totally free for individuals & academics. Come chat with us at our booth, or email contact@wandb.ai.