This talk describes the iterative process of optimizing personalized content recommendation models, from user-preference optimization to a Spark-based, large-scale, modular machine learning framework.
Managing an Experimentation Platform by LinkedIn Product Leader (Product School)
Main Takeaways:
- Establishing a culture of experimentation at scale
- Developing the product vision and strategy
- Backlog prioritization based on Impact Score formula
Has your app taken off? Are you thinking about scaling? MongoDB makes it easy to horizontally scale out with built-in automatic sharding, but did you know that sharding isn't the only way to achieve scale with MongoDB?
In this webinar, we'll review three different ways to achieve scale with MongoDB. We'll cover how you can optimize your application design and configure your storage to achieve scale, as well as the basics of horizontal scaling. You'll walk away with a thorough understanding of options to scale your MongoDB application.
Topics covered include:
- Scaling Vertically
- Hardware Considerations
- Index Optimization
- Schema Design
- Sharding
Business Applications of Predictive Modeling at Scale - KDD 2016 Tutorial (Qiang Zhu)
Predictive modeling is the art of building statistical models that forecast probabilities and trends of future events. It has broad applications in industry across different domains. Popular examples include user intention prediction, lead scoring, and churn analysis. In this tutorial, we will focus on the best practices of predictive modeling in the big data era and its applications in industry, with motivating examples across a range of business tasks and relevance products. We will start with an overview of how predictive modeling helps power and drive various key business use cases. We will introduce the essential concepts and state of the art in building end-to-end predictive modeling solutions, and discuss the challenges, key technologies, and lessons learned from our practice, including case studies of LinkedIn feed relevance and a platform for email response prediction. Moreover, we will discuss some practical solutions for building a predictive modeling platform to scale the modeling efforts of data scientists and analysts, along with an overview of popular tools and platforms used across the industry.
Business Applications of Predictive Modeling at Scale (Songtao Guo)
Tutorial delivered at KDD 2016, San Francisco
Abstract
Predictive modeling is the art of building statistical models that forecast probabilities and trends of future events. It has broad applications in industry across different domains. Popular examples include user intention prediction, lead scoring, and churn analysis. In this tutorial, we will focus on the best practices of predictive modeling in the big data era and its applications in industry, especially sales and marketing. We will start with an overview of how predictive modeling helps power and drive various key business use cases. We will introduce the essential concepts and state of the art in building end-to-end predictive modeling solutions, and discuss the challenges, key technologies, and lessons learned from our practice, followed by a case study. Moreover, we will discuss some practical solutions for building a predictive modeling platform to scale the modeling efforts of data scientists and analysts, along with an overview of popular tools and platforms used across the industry.
Target Audience and Prerequisites
This tutorial is suitable for researchers, students, and practitioners of predictive modeling who are interested in the industry applications. Advanced techniques in data mining and statistical modeling are not required but some background in statistics and big data is expected.
Building Ranking Infrastructure: Data-Driven, Lean, Flexible - Sergii Khomenk... (Sergii Khomenko)
Nowadays there are plenty of solutions for building a search subsystem. The question is how to keep such a system flexible, able to react quickly to data-driven decisions, and constantly improving in quality. This talk presents lessons learned from our experience of building a lean ranking infrastructure that can be used with a data-driven approach to product development. The slides walk through the process of scaling out the search system from a couple of countries to 13 countries around the world while keeping the flexibility to test hypotheses on different levels and to perform A/B testing along different dimensions.
Develop an App with the Odoo Framework or How to Implement a Plant Nursery in a Few Minutes.
Yannick Tivisse, Software Engineer, RD4HR Team Leader, Odoo
NLP Text Recommendation System Journey to Automated Training (Databricks)
This talk will cover how we built and productionized automated machine learning pipelines at Salesforce, going from heuristics to automated retraining using technologies including but not limited to Scala, Python, Apache Spark, Docker, and SageMaker for training and serving. We will walk through the generally applicable data prep, feature engineering, training, evaluation/comparisons, and continuous model training, including data feedback loops in containerized environments with SageMaker. We will talk about our deployment and validation approach. Finally, we'll draw lessons from iteratively building an enterprise ML product. Attendees will learn about the mental models for building end-to-end production ML pipelines and GA-ready products.
Lean Analytics - How to Measure Your Product (Liron Hayun)
This presentation was given to startup founders and software people to help them understand how to better measure the success (or failure) of their product by using objective data.
4-year chronicles of ALLSTOCKER (a trading platform for used construction equipment and machinery). We describe how the system has evolved incrementally using Pharo Smalltalk.
Angular.JS is a modern JavaScript MVC framework that was built from the ground up by a team of Googlers and is sponsored by Google itself. Angular.JS gives web developers a clear separation between logic and view, and greatly improves code reuse through features such as Directives, Services, and Components. The Angular.JS templating engine also helps minimize the HTML code. During the presentation, you'll learn some intermediate-to-advanced usages of Angular.JS, how to use it, and tips and tricks that will make your app amazing.
Product recommendation system for an e-commerce site, by Koby KARP, Data Scientist (Equancy) & Hervé MIGNOT, Partner at Equancy
Recommendations remain a key tool for personalizing e-commerce sites, and the subject is far from exhausted. Taking the particularities of a market into account may require adapting the processing and the algorithms used. After a review of recommendation techniques, we will present the specific approach we adopted. The system was developed on Spark for data preparation and for computing the recommendation models. A simple API and its service were developed to deliver recommendations to client applications.
Prashant technical practices-tdd for xebia event (Xebia India)
Theme: Agile Technical Practices
Epic: TDD implementation
Stories:
Context of TDD
What is TDD
Response of Developers to TDD implementation
Practices complementing TDD
Success with TDD
Michael will present an overview of Elastic's machine learning capabilities.
As we know, data science work can be messy, fractured, and challenging as data volumes increase. This session will explore how the Elastic stack can offer a single destination for data ingestion and exploration, time series modeling, and communication of results through data visualizations by focusing on a few sample data sources.
We will also explore new functionality offered by Elastic machine learning, in particular an integration with our APM solution.
Trained as a mathematician, Michael Hirsch started his career with no development experience. His first task: "model the world in a relational database." Over the last 7 years Michael has established himself as a data scientist, with a focus on building end-to-end systems. In his career, he has built machine-learning-powered platforms for clients including Nike, Samsung, and Marvel, and approaches his work with the idea that machine learning is only as useful as the interfaces that users interact with.
Currently, Michael is a Product Engineer for Machine Learning at Elastic. He focuses on tailoring Elastic's ML offering to customer use cases, as well as integrating machine learning capabilities across the entire Elastic Stack.
Tech Mentro offers 6-month live, project-based Industrial Training in Java, Android, Microsoft .Net & PHP technologies for MCA/BCA/BE/B.Tech/MSc (CS and IT) students and professionals. This 6-month Industrial Training is part of the curriculum of most technical universities, intended to enhance industry-specific skills in the latest technologies and to teach corporate structure.
Elasticsearch Performance Testing and Scaling @ Signal (Joachim Draeger)
In this talk I describe the specific challenges that we faced at Signal to make our use case scale. I then go into detail on how we benchmarked single queries and different shard configurations. You can try the experiments yourself using The Signal Media One-Million News Articles Dataset, a Docker Compose stack and some scripts provided here: https://github.com/joachimdraeger/elasticsearch-performance-experiments.
I also got the great advice to have a look at https://github.com/elastic/rally which can also give you summaries for test runs.
Recommender Systems @ Scale, Big Data Europe Conference 2019 (Sonya Liberman)
Serving tens of billions of personalized recommendations a day under a latency of 30 milliseconds is a challenge. In this talk I'll share our algorithmic architecture, including its Spark-based offline layer and its Elasticsearch-based serving layer, which enable running complex models under difficult scale constraints and shorten the cycle between research and production.
Search-Based Serving Architecture of Embeddings-Based Recommendations (RecSys... (Sonya Liberman)
Productizing User/Content Embeddings on top of Elasticsearch
Published on PMLR 2019 http://proceedings.mlr.press/v109/liberman19a.html
Workshop on Online Recommender Systems and User Modeling
RecSys 2019
Similar to Iterative Methodology for Personalization Models Optimization
Serving tens of billions of personalized recommendations a day under a latency of 30 milliseconds is a challenge. In this talk I'll share our algorithmic architecture, including its Spark-based offline layer and its Elasticsearch-based serving layer, which enable running complex models under difficult scale constraints and shorten the cycle between research and production.
Sonya Liberman leads the Personalization team @ Outbrain's Recommendations group, developing large-scale machine learning algorithms for Outbrain's content recommendations platform serving tens of billions of real-time recommendations a day. She specializes in Information Retrieval, Machine Learning, and Computational Linguistics. Before joining Outbrain, she led Research and Algorithms @ ConvertMedia (acquired by Taboola). She holds an MSc in Computer Science and a BSc in Computer Science and Computational Biology.
This invited talk was given at PyData Meetup, April 2019
https://www.meetup.com/PyData-Tel-Aviv/
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ... (Sonya Liberman)
Outbrain is the world's largest discovery platform, bringing personalized and relevant content to audiences while helping publishers understand their audiences through data.
Its recommender system serves billions of content recommendations daily, based on millions of hourly user interactions.
Our predictive models span a variety of supervised learning techniques, ranging from content-based recommenders, through behavioral models, all the way to collaborative techniques such as factorization machines. Agility and stability are crucial aspects of the system.
This talk will cover our journey towards solutions that compromise on neither scale nor model complexity, and towards designing a dynamic framework that shortens the cycle between research and production.
We will cover the different stages of the framework, including important takeaway lessons for data scientists as well as software engineers.
Sonya Liberman leads a team of Machine Learning Engineers and Data Scientists building large-scale recommender systems for personalized content discovery @ Outbrain, serving tens of billions of real-time recommendations a day.
She especially enjoys bringing theory to production and seeing how it affects the engagement of (many) users.
This invited talk was given at ILTechTalk Week, 2018 by Shaked Bar, a Tech Lead and Algorithms Engineer on the team.
From Spark to Elasticsearch and Back - Learning Large Scale Models for Conten... (Sonya Liberman)
Sonya Liberman leads the Personalization team @ Outbrain's Recommendations group, developing large-scale machine learning algorithms for Outbrain's content recommendations platform serving tens of billions of real-time recommendations a day. She specializes in Machine Learning, Information Retrieval and Computational Linguistics. Before joining Outbrain, she led the Algorithms team @ ConvertMedia (acquired by Taboola). She holds an MSc in Computer Science and a BSc in Computer Science and Computational Biology.
This invited talk was given at the Inspiring Big Data Science meetup, January 2018.
Abstract: Sonya will share how Outbrain, a world-leading content recommendations service, uses machine learning to deliver 200 billion personalized content recommendations each month to hundreds of millions of unique users. She will cover the layers of their algorithmic architecture, including its Spark-based offline layer and its Elasticsearch-based serving layer, which enable running complex models under difficult scale constraints and shorten the cycle between research and production.
Looking at Content Recommendations through a Search Lens - Extended Version (Sonya Liberman)
Sonya Liberman leads the Personalization team @ Outbrain's Recommendations group, developing large-scale machine learning algorithms for Outbrain's content recommendations platform serving tens of billions of real-time recommendations a day. She specializes in Information Retrieval, Machine Learning, and Computational Linguistics. Before joining Outbrain, she led Research and Algorithms @ ConvertMedia (acquired by Taboola). She holds an MSc in Computer Science and a BSc in Computer Science and Computational Biology.
This invited talk was given at the Recommender Systems Workshop 2017, University of Haifa.
Sonya Liberman leads the Personalization team @ Outbrain's Recommendations group, developing large-scale machine learning algorithms for Outbrain's content recommendations platform serving tens of billions of real-time recommendations a day. She specializes in Information Retrieval, Machine Learning, and Computational Linguistics. Before joining Outbrain, she led Research and Algorithms @ ConvertMedia (acquired by Taboola). She holds an MSc in Computer Science and a BSc in Computer Science and Computational Biology.
This talk was given at the WIKIAI09 Workshop of the International Joint Conference on Artificial Intelligence.
Full article can be found here:
http://www.cs.technion.ac.il/~sonyal/CHESA_WikiAI09.pdf
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 (Neo4j)
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
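The idea of dropping bytes that do not affect program behaviour can be pictured with a small sketch. This is not DIAR's algorithm, which the abstract does not spell out; it is a generic greedy trimmer under stated assumptions. The `coverage` callback and the `toy_coverage` target are hypothetical stand-ins for running an instrumented program and collecting the set of branches it reached.

```python
from typing import Callable, FrozenSet


def trim_seed(seed: bytes, coverage: Callable[[bytes], FrozenSet[str]]) -> bytes:
    """Greedily drop bytes whose removal leaves the coverage set unchanged."""
    baseline = coverage(seed)
    trimmed = bytearray(seed)
    i = 0
    while i < len(trimmed):
        candidate = trimmed[:i] + trimmed[i + 1:]
        if coverage(bytes(candidate)) == baseline:
            # Byte i was uninteresting: keep the shorter seed and retry index i.
            trimmed = candidate
        else:
            # Byte i matters for coverage: keep it and move on.
            i += 1
    return bytes(trimmed)


def toy_coverage(data: bytes) -> FrozenSet[str]:
    """Hypothetical target: one 'branch' per marker byte present in the input."""
    branches = set()
    if b"<" in data:
        branches.add("open_tag")
    if b">" in data:
        branches.add("close_tag")
    return frozenset(branches)


if __name__ == "__main__":
    print(trim_seed(b"aa<bbb>cc", toy_coverage))
```

A real fuzzing setup would call the instrumented target once per candidate, so a practical trimmer would likely remove bytes in chunks rather than one at a time to limit the number of executions.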
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features available on those devices, but many of these features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
GridMate - End to end testing is a critical piece to ensure quality and avoid... (ThomasParaiso2)
End-to-end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
20 Comprehensive Checklist of Designing and Developing a Website (Pixlogix Infotech)
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs - Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Building RAG with self-deployed Milvus vector database and Snowpark Container... - Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Communications Mining Series - Zero to Hero - Session 1 - DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How it can help today’s businesses, and its benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
42. Why User Profile Is Important
● Personalization
● Lookalike modeling
● Interest targeting
● and more…
43. User Profile Data Model
Offline Profile
DocId  Timestamp  Feature     Confidence
12     0          Sport Cars  1
42     -1         Sport Cars  1
43     -21        World Cup   1
55     -21        World Cup   1
44. User Profile Data Model
Offline Profile
DocId  TS   Feature     Conf
12     0    Sport Cars  1
42     -1   Sport Cars  1
43     -21  World Cup   1
55     -21  World Cup   1
Serving Profile
Category     Conf
Sports Cars  2
Soccer       2
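The step from the offline profile to the serving profile on this slide can be sketched as a simple aggregation. This is a minimal sketch, assuming the serving confidence of a category is the sum of the confidences of its rows; the feature-to-category mapping and the function name `build_serving_profile` are hypothetical and not taken from the deck.

```python
from collections import defaultdict

# Offline profile rows from the slide: (doc_id, timestamp, feature, confidence).
OFFLINE_PROFILE = [
    (12, 0, "Sport Cars", 1.0),
    (42, -1, "Sport Cars", 1.0),
    (43, -21, "World Cup", 1.0),
    (55, -21, "World Cup", 1.0),
]

# Hypothetical mapping from a document feature to a serving category.
FEATURE_TO_CATEGORY = {"Sport Cars": "Sports Cars", "World Cup": "Soccer"}


def build_serving_profile(rows, feature_to_category):
    """Sum confidences per category (an assumed aggregation rule)."""
    serving = defaultdict(float)
    for _doc_id, _timestamp, feature, confidence in rows:
        serving[feature_to_category[feature]] += confidence
    return dict(serving)


serving_profile = build_serving_profile(OFFLINE_PROFILE, FEATURE_TO_CATEGORY)
print(serving_profile)
```

With unweighted rows, both categories end up with the same confidence, which is why the deck goes on to ask how recency should be taken into account.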
45. User Profile - Boost Recent X2
Offline Profile
DocId  TS   Feature     Conf
12     0    Sport Cars  1
42     -1   Sport Cars  1
43     -21  World Cup   1
55     -21  World Cup   1
Serving Profile
Category     Conf
Sports Cars  2
Soccer       1
46. Motivation - User Profile Tweaks
● Is the "boost recent" hypothesis true?
● What is the decay scheme?
● Linear in time?
● Exponential in time?
● Potentially many trial & error cycles
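The two decay options raised above can be compared on the example profile. The following is a sketch under stated assumptions: the `horizon` and `half_life` values, the time unit, and the function names are all hypothetical choices for illustration, not parameters from the deck.

```python
from collections import defaultdict

# (timestamp relative to now, category, confidence), as in the example profile.
ROWS = [
    (0, "Sports Cars", 1.0),
    (-1, "Sports Cars", 1.0),
    (-21, "Soccer", 1.0),
    (-21, "Soccer", 1.0),
]


def linear_decay(age, horizon=30.0):
    """Weight falls linearly from 1 at age 0 to 0 at `horizon` (assumed value)."""
    return max(0.0, 1.0 - age / horizon)


def exponential_decay(age, half_life=7.0):
    """Weight halves every `half_life` time units (assumed value)."""
    return 0.5 ** (age / half_life)


def decayed_profile(rows, decay):
    """Sum confidences per category, each weighted by the age of its row."""
    profile = defaultdict(float)
    for timestamp, category, confidence in rows:
        profile[category] += confidence * decay(-timestamp)
    return dict(profile)


linear = decayed_profile(ROWS, linear_decay)
exponential = decayed_profile(ROWS, exponential_decay)
print(linear)
print(exponential)
```

Both schemes rank the recent category first here, but they differ in how quickly old interests fade, which is the kind of question that needs repeated trial and error to settle.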
47. Profile Lab - Basic Flow
● Static dataset of offline profiles
● Sequence of docids for each user
● Static feature mapping for all docids
● Only a lean algo block needs to be implemented
● Transform offline profile to online
● Apply algo piece to generate online profile
● Generate KPIs
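The flow above might look roughly like the harness below. It is only a sketch: the user data, the `count_algo` block, and the KPI (share of users whose top profile feature matches a held-out last document) are hypothetical stand-ins, since the deck does not specify them.

```python
from collections import Counter

# Static dataset: sequence of doc ids per user, oldest first (hypothetical data).
USER_DOCS = {
    "u1": [12, 42, 43, 12],
    "u2": [43, 55, 55, 43],
    "u3": [12, 43, 55, 42],
}

# Static feature mapping for all doc ids (hypothetical values).
DOC_FEATURES = {12: "Sport Cars", 42: "Sport Cars", 43: "World Cup", 55: "World Cup"}


def count_algo(features):
    """Lean algo block: score each feature by how often it was seen."""
    return dict(Counter(features))


def run_lab(user_docs, doc_features, algo):
    """Build an online profile from all but the last doc of each user and
    return the share of users whose top feature matches the held-out doc."""
    hits = 0
    for docs in user_docs.values():
        history, held_out = docs[:-1], docs[-1]
        profile = algo([doc_features[doc_id] for doc_id in history])
        top_feature = max(profile, key=profile.get)
        hits += top_feature == doc_features[held_out]
    return hits / len(user_docs)


kpi = run_lab(USER_DOCS, DOC_FEATURES, count_algo)
print(f"hit rate: {kpi:.2f}")
```

Because the dataset and the feature mapping are static, swapping `count_algo` for another algo block changes only one argument, which keeps each trial cheap.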
58. Motivation: Named Entities & Wikitags
● Very high cardinality ~300-400K
● Precise user taste
● Big potential in personalization
● Big money for user segmentation
● Hard to leverage as is
60. Embeddings: Dense Representation
Concept                    Gender  Royal  Age    Tech  Rich
Prince Harry               1       0.9    -0.05  0.3   0.9
Queen Elizabeth            -1      0.99   0.9    -0.8  0.9
Apple inc.                 0.1     -0.03  0.5    0.9   0.9
Machine Learning Students  -0.7    0.02   -0.5   0.8   -0.6
Facebook                   0       0.01   0.2    0.9   0.87
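One way to read the table is that concepts with similar coordinates should be close to each other. The sketch below uses the slide's illustrative values and cosine similarity; the choice of cosine similarity and the helper names are assumptions, not something the deck states.

```python
import math

# Rows of the slide's table; coordinates are (Gender, Royal, Age, Tech, Rich).
EMBEDDINGS = {
    "Prince Harry": [1, 0.9, -0.05, 0.3, 0.9],
    "Queen Elizabeth": [-1, 0.99, 0.9, -0.8, 0.9],
    "Apple inc.": [0.1, -0.03, 0.5, 0.9, 0.9],
    "Machine Learning Students": [-0.7, 0.02, -0.5, 0.8, -0.6],
    "Facebook": [0, 0.01, 0.2, 0.9, 0.87],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(name, embeddings):
    """Return the other concept with the highest cosine similarity to `name`."""
    return max(
        (other for other in embeddings if other != name),
        key=lambda other: cosine(embeddings[name], embeddings[other]),
    )


print(most_similar("Apple inc.", EMBEDDINGS))
```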
61. Embeddings: Dense Representation
● Given a high confidence concept in a doc
● Context is other concepts
● Lots of training data in our DocStore
● Many existing libraries: word2vec, GloVe, StarSpace, etc.
● Good embedding model
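The training setup described above (a high-confidence concept as the target, the other concepts in the same doc as its context) can be sketched as pair generation. The documents, the `min_conf` threshold, and the function name are hypothetical; (target, context) pairs of this kind are the input that skip-gram style trainers work from.

```python
# Hypothetical documents: concept -> extraction confidence.
DOCS = [
    {"Prince Harry": 0.95, "Queen Elizabeth": 0.9, "Facebook": 0.3},
    {"Apple inc.": 0.92, "Machine Learning": 0.85},
]


def training_pairs(docs, min_conf=0.8):
    """Emit (target, context) pairs: each high-confidence concept is a target,
    and every other concept in the same doc is one of its contexts."""
    pairs = []
    for doc in docs:
        for target, confidence in doc.items():
            if confidence < min_conf:
                continue  # low-confidence concepts are used only as context
            pairs.extend((target, context) for context in doc if context != target)
    return pairs


pairs = training_pairs(DOCS)
for target, context in pairs:
    print(target, "->", context)
```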
63. Embeddings Based Models
● Major change in prod model architecture
● High dev costs
● Potential issue with Elastic
Static embedding cluster is a good fallback
64. Clustering Phase
● Cluster embedding vectors
● Cluster id = doc feature
● Concept vector => cluster id
● Easy integration with current architecture
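The mapping from concept vector to cluster id can be sketched as follows. This is a toy, pure-Python k-means used as a stand-in for a library implementation; the 2-dimensional vectors, the seeding rule, and the `cluster_<id>` feature naming are assumptions made for illustration.

```python
import math

# Hypothetical 2-dimensional concept vectors (real embeddings would be larger).
CONCEPT_VECTORS = {
    "Prince Harry": [0.9, 0.1],
    "Apple inc.": [0.1, 0.9],
    "Queen Elizabeth": [1.0, 0.0],
    "Facebook": [0.0, 1.0],
}


def kmeans(vectors, k, iterations=20):
    """Tiny Lloyd's k-means; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


names = list(CONCEPT_VECTORS)
labels = kmeans([CONCEPT_VECTORS[name] for name in names], k=2)
concept_to_cluster = dict(zip(names, labels))


def doc_cluster_features(concepts):
    """Replace a doc's concepts with cluster-id features."""
    return sorted({f"cluster_{concept_to_cluster[c]}" for c in concepts})


print(doc_cluster_features(["Prince Harry", "Queen Elizabeth"]))
```

Since a cluster id is just another categorical doc feature, it can flow through the existing profile pipeline without changes to the serving model, which is the integration benefit the slide points to.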
65. Clustering - Many Hyperparameters
● Train embedding model: |D| docs, |C| coordinates
● Quick sanity check of the embedding model
● Select the most frequent N concepts
● Apply scikit-learn clustering analysis method A
● Benchmark clusters using common metrics
● Does the qualitative cluster analysis look good?
● No: try different |D|, |C|, N, A
● Yes: implement a test and run it in the lab
66. Clustering - Many Hyperparameters
● Starspace embeddings / Word2Vec
● Embedding Dim - 50, 100, 300
● Number of Clusters - 1000, 2000, 5000, 10000
● Clustering Algorithms - k-means, DBSCAN
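To get a sense of how many runs this search space implies, the options above can be enumerated. This is a sketch with one assumption made explicit: the number of clusters is applied to k-means only, because DBSCAN infers the number of clusters from density rather than taking it as a parameter. The dictionary keys are hypothetical names.

```python
from itertools import product

EMBEDDING_MODELS = ["StarSpace", "Word2Vec"]
EMBEDDING_DIMS = [50, 100, 300]
CLUSTER_COUNTS = [1000, 2000, 5000, 10000]
ALGORITHMS = ["k-means", "DBSCAN"]


def experiment_grid():
    """List every configuration implied by the options on the slide."""
    configs = []
    for model, dim, algo in product(EMBEDDING_MODELS, EMBEDDING_DIMS, ALGORITHMS):
        if algo == "k-means":
            # k-means needs the number of clusters up front.
            for n_clusters in CLUSTER_COUNTS:
                configs.append(
                    {"model": model, "dim": dim, "algo": algo, "n_clusters": n_clusters}
                )
        else:
            # DBSCAN has no cluster-count parameter, so it gets one entry.
            configs.append({"model": model, "dim": dim, "algo": algo, "n_clusters": None})
    return configs


grid = experiment_grid()
print(len(grid), "configurations")
```

Even this small grid yields dozens of runs before any DBSCAN-specific parameters are varied, which is the motivation for a fast offline lab.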