During this session we will cover best practices for implementing a product catalog with MongoDB. We will cover how to model an item properly when it can have thousands of variations and thousands of properties of interest. You'll learn how to index properly and support faceted search with millisecond response latency, and how to implement per-store, per-SKU pricing while still keeping a sane number of documents. We will also cover operational considerations, such as how to bring the data closer to users to cut down network latency.
4. The many catalogs problem
1. One department in charge of the master product catalog works hard at fitting data into SQL tables
2. The resulting data sits in a SQL server with a couple of replicas. It is forbidden to hit it more than 100 times / sec
3. Other departments need to access the data far more often for their own services
4. Other departments need more information that is not available, since it did not fit in that long-devised, rigid SQL schema
5. ETLs and message buses are put in place for other teams to try to figure it out themselves…
6. Data becomes inconsistent, fragmented, not up to date…
The problem is visible both internally and to customers!
5. Search – Using Solr
How many catalogs and catalog caches do you have?
6. The many catalogs problem
[Diagram: a Product Department master catalog, ETLs and a message bus, and separate catalogs for the Online Store, Marketing, and Departments 1, 3, 4 and 5. Dozens of catalogs!]
7. Goal: Single View of Product
• Single view of a product, one central catalog service
• Flexible schema containing all useful data
• Read volume high and sustained, 100k reads / s
• Can seamlessly take write spikes during catalog updates
• Advanced indexing and querying
• Geographical distribution for HA and low latency
8. Agenda
1. MongoDB Overview
2. Catalog Service Architecture
3. Data Store Models
4. Product Search
10. MongoDB is a great fit
• Holds complex JSON structures
• Dynamic schema for agility
• Complex querying and in-place updating
• Secondary, compound and geo indexing
• Full consistency, durability, atomic operations
• HA and geo-distribution via replication
• Near-linear scaling via sharding
• Overall, MongoDB is a unique fit!
12. Build your data to fit your application
Relational:
CustomerID   First Name   Last Name   City
0            John         Doe         New York
1            Mark         Smith       San Francisco
2            Jay          Black       Newark
3            Meagan       White       London
4            Edward       Danields    Boston

Order Number   Store ID   Product        Customer ID
10             100        Tablet         0
11             101        Smartphone     0
12             101        Dishwasher     0
13             200        Sofa           1
14             200        Coffee table   1
15             201        Suit           2

MongoDB:
{ customer_id: 1,
  name: "Mark Smith",
  city: "San Francisco",
  orders: [
    { order_number: 13,
      store_id: 10,
      date: "2014-01-03",
      products: [
        { SKU: 24578234, Qty: 3, Unit_price: 350 },
        { SKU: 98762345, Qty: 1, Unit_price: 110 }
      ]
    },
    { <...> }
  ]
}
15. Architecture Overview
[Architecture diagram. Labels shown:
Services: Information Management, Merchandising, Content, Inventory, Customer, Channel, Sales & Fulfillment, Insight, Social.
Customer channels: Amazon, Ebay, …; Stores (POS, Kiosk, …); Mobile (Smartphone, Tablet); Website; Contact Center; Social (Facebook, Twitter, …).
Infrastructure: Web Servers, Application Servers, API, Data and Service Integration.
External and back-end systems: Suppliers, Supply Chain Management System, Data Warehouse, Analytics (3rd party, in network).]
18. Merchandising - Architecture
[Diagram: a MongoDB Data Store (Items, Pricing, Promotions, Variants, Ratings & Reviews, …) and a Search Engine sit behind the Product Service API, which serves the Online Store, Marketing, Inventory, SCMS, Public API, …]
20. Models - Product Page
[Product page screenshot annotated with: product images, general information, list of variants, external information, localized description]
21. Models - Overview
• Item: the overall product info (e.g. Levi's 501)
• Variant: a specific variant of an item (e.g. in black, size 6), which typically has a specific SKU / UPC
• Price: price information may vary based on the store, the variant, etc.
• Hierarchy: the item taxonomy
• Facet: facets to search products by
• Vendors: a given sku may be available through several vendors if the site is a marketplace
> Don't try to fit everything into the same document!
22. Models – Overview
[Image: one item, hundreds of sizes, dozens of colors]
23. Models - Overview
• A single item may have thousands of variants
• Each variant can have hundreds of attributes
• Altogether a single item can represent many MBs worth of JSON text
• Don't try to fit everything into the same document!
• Use a schema that is natural and fits the API
24. Models - Item Model
{ "_id": "054VA72303012P",  // the item id
  "desc": [  // item descriptions
    { "lang": "en", "val": "Give your dressy look a lift with ..." }, ...
  ],
  "name": "Women's Kate Ivory Peep-Toe Stiletto Heel",
  "category": "/84700/80009/1282094266/1200003270",  // hierarchy
  "brand": { "id": "2483510", "img": "http://...", "name": "Metaphor" },
  "assets": {  // references to all assets
    "imgs": [
      { "img": { "width": 1900, "height": 1900, "src": "http://..." } }, ...
    ]
  },
  "shipping": { /* shipping specs */ },
  "specs": { /* item specs */ },
  "attrs": [  // list of item attributes (facets)
    { "name": "Heel Height", "value": "High (2-1/2 to 4 in.)" },
    { "name": "Toe", "value": "Open toe" }, ...
  ],
  "variants": {  // quick info on the variants
    "cnt": 9,
    "attrs": [
      { "dispType": "DROPDOWN", "name": "Color" },
      { "dispType": "DROPDOWN", "name": "Shoe Size" }, ...
    ]
  },
  "lastUpdated": 1400877254787  // keep track of updates
}
25. Models - Item Model
• Get item by id
db.definition.findOne( { _id: "301671" } )
• Get items from list of ids
db.definition.find( { _id: { $in: [ "301671", "301672" ] } } )
• Get items by department
db.definition.find( { category: { $regex: "^/84700/" } } )
• Get items by category prefix
db.definition.find( { category: { $regex: "^/84700/80009/" } } )
• Secondary indices: name, category, lastUpdated
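The department and category lookups above both rely on a regular expression anchored at the start of the category path. As a minimal sketch in plain JavaScript, such a filter could be built from a path prefix as follows. The helper names categoryPrefixFilter and applyFilter and the sample items are hypothetical, not from the talk, and the in-memory matching only imitates what the database query would select.

```javascript
// Hypothetical helper: build a MongoDB-style filter matching every item whose
// category path starts with the given prefix. Regex metacharacters in the
// prefix are escaped so that it is matched literally.
function categoryPrefixFilter(prefix) {
  const escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return { category: { $regex: "^" + escaped } };
}

// Made-up sample items; only the first category path comes from the item model.
const items = [
  { _id: "a", category: "/84700/80009/1282094266/1200003270" },
  { _id: "b", category: "/84700/90001/1" },
  { _id: "c", category: "/12345/80009/7" },
];

// In-memory stand-in for find(): return the ids of the documents that match.
function applyFilter(docs, filter) {
  const re = new RegExp(filter.category.$regex);
  return docs.filter((d) => re.test(d.category)).map((d) => d._id);
}

const byDepartment = applyFilter(items, categoryPrefixFilter("/84700/"));
const byCategory = applyFilter(items, categoryPrefixFilter("/84700/80009/"));
console.log(byDepartment, byCategory);
```

Keeping the expression anchored with ^ and case-sensitive matters here: that is the form of regex query for which MongoDB can walk a range of the category index rather than test every key.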
26. Models – Variant Model
{ "_id": "05458452563",  // the sku
  "name": "Width:Medium,Color:Ivory,Shoe Size:6.5",
  "itemId": "054VA72303012P",  // reference to the item id
  "altIds": { "upc": "632576103580" },
  "assets": {  // list of assets specific to the variant
    "imgs": [
      { "width": 1900, "height": 1900, "src": "http://..." },
      { "width": 1900, "height": 1900, "src": "http://..." }, ...
    ]
  },
  "attrs": [  // list of attributes specific to the variant
    { "name": "Width", "value": "Medium" },
    { "name": "Color", "family": "White", "value": "Ivory" },
    { "name": "Size", "value": "6.5" }, ...
  ],
  "lastUpdated": 1400877254787  // keep track of updates
}
27. Models – Variant Model
• Get variant from SKU
db.variant.find( { _id: "05458452563" } )
• Get all variants for a product, sorted by SKU
db.variant.find( { itemId: "054VA72303012P" } ).sort( { _id: 1 } )
• Indices: itemId, lastUpdated
28. Models - Hierarchy
{
"_id": "1200003270", // the node id
"name": "Women's Heels & Pumps",
"count": 22305, // how many items in this category
"parents": [ // list of parents
"1282094266"
],
"facets": [ // facets that exists for this category
"Heel Height",
"Toe",
"Upper Material",
"Width",
"Shoe Size",
"Color"
]
}
29. Models – Hierarchy
• Get hierarchy node by id
db.hierarchy.find( { _id: "1200003270" } )
• Get hierarchy node from parent id
db.hierarchy.find( { parents: "1282094266" } )
• Get departments (no parent)
db.hierarchy.find( { parents: null } )
• Secondary indices: parents
30. Models – per Store Pricing
Per-store pricing could result in billions of documents… unless it is built in a modular way:
_id: concatenation of item and store.
Item: can be an item id or variant id (sku).
Store: can be a store group (online) or store id.
{ "_id": "skuSPM8824542513_1234/store123",
"price": 69.99,
"sale": {
"salePrice": 42.72,
"saleEndDate": "2050-12-31 23:59:59"
},
"lastUpdated": 1374647707394 }
31. Models – per store Pricing
• Get all prices for a given item
db.prices.find( { _id: /^item301671/ } )
• Get all prices for a given sku (price could be at item level)
db.prices.find( { _id: { $in: [ /^sku730223104376/, /^item301671/ ] } } )
• Get minimum and maximum prices for a sku
db.prices.aggregate( [ { $match: { _id: /^sku730223104376/ } },
                       { $group: { _id: 1, min: { $min: "$price" }, max: { $max: "$price" } } } ] )
• Get price for a sku and store id (returns up to 4 prices)
db.prices.find( { _id: { $in: [ "sku730223104376/store1234",
                                "sku730223104376/sgroup0",
                                "item301671/store1234",
                                "item301671/sgroup0" ] } },
                { price: 1 } )
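The last lookup can return up to four price documents, so the caller still has to decide which one applies. The sketch below shows one way this might be done in plain JavaScript. The "sku" / "item" prefixes and the "/store" and "/sgroup" suffixes follow the _id examples on the slide; the function names, the sample prices and the "most specific id wins" precedence are assumptions, not something the talk states.

```javascript
// Build the candidate price ids for a sku sold in a given store, ordered from
// most specific (sku + store) to least specific (item + store group).
// Assumption: this ordering is the precedence rule.
function candidatePriceIds(sku, itemId, storeId, storeGroup) {
  return [
    `sku${sku}/store${storeId}`,
    `sku${sku}/sgroup${storeGroup}`,
    `item${itemId}/store${storeId}`,
    `item${itemId}/sgroup${storeGroup}`,
  ];
}

// Given the documents returned by the $in query, return the first one found
// when walking the candidate ids in order, or null if none matched.
function pickPrice(candidateIds, priceDocs) {
  const byId = new Map(priceDocs.map((d) => [d._id, d]));
  for (const id of candidateIds) {
    if (byId.has(id)) return byId.get(id);
  }
  return null;
}

const ids = candidatePriceIds("730223104376", "301671", "1234", "0");

// Made-up query result: a store-group price for the sku and an item-level price.
const found = [
  { _id: "item301671/sgroup0", price: 69.99 },
  { _id: "sku730223104376/sgroup0", price: 64.99 },
];

const best = pickPrice(ids, found);
console.log(ids, best);
```

Because a price is stored only at the level where it actually differs, most skus need no document of their own, which is what keeps the collection far below one document per sku per store.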
33. Search – Browse and Search products
[Screenshot of a browse / search page annotated with: browse by category, special lists, filter by attributes, lists hundreds of item summaries]
By far the toughest page to get right and fast…
34. Search – Browse and Search products
The previous page presents many challenges:
• Response within milliseconds for hundreds of items
• Faceted search on many attributes: category, brand, …
• Efficient sorting on several attributes: price, popularity
• Pagination feature which requires deterministic ordering
> Search engines are built for this purpose!
35. Search – Traditional Architecture
[Diagram: the Product Data Store is indexed into Product Search, and its data is pre-joined into objects held in a Cache. The Application #1 obtains search result IDs from Product Search, then #2 obtains objects by ID from the cache or DB.]
36. Search – Traditional Architecture
Issues with the traditional architecture:
• 3 different systems to maintain: RDBMS, search engine, caching layer
• RDBMS schema is complex and static
• Applications need to talk many languages
37. Search – Architecture with MongoDB
[Diagram: MongoDB is the Product Data Store and holds ready-to-use product documents, which are indexed into the Search Engine (Product Search). The Product API #1 obtains search result IDs from the search engine, then #2 obtains objects from MongoDB by list of IDs. The application issues a single query.]
38. Search - Mongo-Connector
[Diagram: Mongo Connector sits between MongoDB and the Search Engine. #1 Initial dump of the collections. #2 Updates streaming via the oplog. The connector handles translation and filtering before indexing.]
39. Search - Mongo-Connector
• Open-source project at https://github.com/10gen-labs/mongo-connector
• Python app that reads from MongoDB's oplog and publishes to the target of choice
• Supports initial sync by dumping the data
• Default connectors for Solr, Elastic Search, other MongoDB clusters
• Easily extensible to update other systems like SQL
41. Search – More Searching
[Screenshot annotated with: images of the matching variants are displayed; price and rating; facets for variants]
42. Search – More Searching
… more challenges:
• Attributes at the variant level: color, size, etc.
• Attributes from other docs: pricing, ratings, etc.
• Display the matching variant's image and details
• Thousands of matching variants for an item, but still need to display a single item
• Challenge to properly index the data
> Need for a single summary document per item
43. Search - Architecture
[Diagram: MongoDB Data Store holding Items, Summaries, Pricing, Ratings & Reviews, Variants, Promotions]
44. Search – Summary Model
{ "_id": "3ZZVA46759401P",  // the item id
  "name": "Women's Chic - Black Velvet Suede",
  "dep": "84700",  // useful as standalone for indexing
  "cat": "/84700/80009/1282094266/1200003270",
  "desc": { "lang": "en", "val": "This pointy toe slingback ..." },
  "img": { "width": 450, "height": 330, "src": "http://..." },
  "attrs": [  // global attributes, easily indexable by the search engine
    "heel height=mid (1-3/4 to 2-1/4 in.)",
    "brand=metaphor",
    "shoe size=6",
    "shoe size=6.5", ...
  ],
  "sattrs": [  // global attributes, not to be indexed
    "upper material=synthetic",
    "toe=open toe", ...
  ],
  "vars": [
    { "id": "05497884001",
      "img": [ /* images */ ],
      "attrs": [ /* list of variant attributes to index */ ],
      "sattrs": [ /* list of variant attributes not to index */ ] }, …
  ]
}
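In the summary, attributes appear as lowercase "name=value" strings split between attrs (indexed) and sattrs (not indexed), whereas the item model stores them as { name, value } pairs. A rough sketch of that flattening step is shown below in plain JavaScript. The function name splitAttrs and the idea of passing an explicit list of attribute names to index are assumptions made for illustration; the talk does not say how the split is decided.

```javascript
// Flatten { name, value } attribute pairs into lowercase "name=value" strings.
// Names listed in indexedNames go to "attrs" (to be indexed), all others go to
// "sattrs" (stored but not indexed). The selection rule is hypothetical.
function splitAttrs(attrs, indexedNames) {
  const wanted = new Set(indexedNames.map((n) => n.toLowerCase()));
  const out = { attrs: [], sattrs: [] };
  for (const a of attrs) {
    const flat = `${a.name}=${a.value}`.toLowerCase();
    if (wanted.has(a.name.toLowerCase())) {
      out.attrs.push(flat);
    } else {
      out.sattrs.push(flat);
    }
  }
  return out;
}

// Attribute pairs taken from the item model example.
const itemAttrs = [
  { name: "Heel Height", value: "High (2-1/2 to 4 in.)" },
  { name: "Toe", value: "Open toe" },
];

const summaryAttrs = splitAttrs(itemAttrs, ["Heel Height"]);
console.log(summaryAttrs);
```

Storing each facet as a single string means one multikey index (or one untokenized search-engine field) covers every attribute, instead of one index per attribute name.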
47. Search - Using Solr
Defining the schema in schema.xml
<fields>
<!-- some of the core fields -->
<field name="_id" type="string" indexed="true" stored="true" />
<field name="name" type="text_general" indexed="true" stored="true" />
<field name="cat" type="string" indexed="true" stored="true" />
<field name="price" type="float" indexed="true" stored="true"/>
<!-- the full text to index -->
<field name="desc.0.val" type="text_general" indexed="true" stored="true"/>
<!-- dynamic attributes for faceting -->
<dynamicField name="attrs.*" type="string" indexed="true" stored="true"/>
<!-- some Solr specific fields -->
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
<dynamicField name="*" type="ignored" multiValued="true"/>
</fields>
48. Search - Using Solr
Starting up the connector:
> mongo-connector
  -m ec2-54-80-63-229.compute-1.amazonaws.com:27017 // the mongo
  -t http://localhost:8983/solr // the solr
  -d mongo_connector/doc_managers/solr_doc_manager.py
  -n "catalog.summary" // target summary collection
  --auto-commit-interval=60 // commit every 1 min
  …
> Keep it running, it will just stream the Oplog
49. Search – Using Solr
Document in Solr looks like:
{ "desc.0.val": "Our classic \"Flying Duck\" styled as a ...",
  "name": "Drake Waterfowl Duck Label SS T-Shirt Army Green",
  "attrs.1": "brand=Drake Waterfowl",
  "attrs.0": "style=t-shirts",
  "cat": "/84700/1200000239/1282094207/1200000817",
  "_id": "SPM10823491916",
  "_version_": 1479173524477182000,
  "timestamp": "2014-09-13T23:09:59.782Z"
}
Lists are flattened, which makes them difficult to use
> Must use named fields to implement facets
50. Search – Using Elastic Search
Let's use Elastic Search…
52. Search - Using Elastic Search
Elastic Search understands the whole document right off the bat.
Just need to tell ES not to tokenize the facets ("string" below is the name of the default mapping type):
$ curl -XPOST localhost:9200/largecat3.summary -d '{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "string" : {
      "properties" : {
        "attrs" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'
> Everything else is indexed auto-magically!
53. Search - Using Elastic Search
Starting up the connector:
> mongo-connector
  -m ec2-54-80-63-229.compute-1.amazonaws.com:27017 // the mongo
  -t http://localhost:9200 // the ES
  -d mongo_connector/doc_managers/elastic_doc_manager.py
  -n "catalog.summary" // target summary collection
  --auto-commit-interval=60 // commit every 1 min
  …
> Keep it running, it will just stream the Oplog
55. Search – Using MongoDB Indexing
How about MongoDB's indexes and full-text search?
56. Search – Using MongoDB indexing
The summary contains:
• Department, e.g. "Shoes"
• Fields to index
  – Category path, e.g. "Shoes/Women/Pumps"
  – Price
  – List of item attributes, e.g. Brand = Guess
  – List of variant attributes, e.g. Color = red
• Fields not to index
  – List of item secondary attributes, e.g. Style = Designer
  – List of variant secondary attributes, e.g. heel height = 4.0
57. Search - Using MongoDB indexing
• Get summary from item id
db.variation.find( { _id: "p301671" } )
• Get summary's specific variation from SKU
db.variation.find( { "vars.sku": "730223104376" }, { "vars.$": 1 } )
• Get summary by department, sorted by rating
db.variation.find( { department: "Shoes" } ).sort( { rating: 1 } )
• Get summary with mix of parameters
db.variation.find( { "department": "Shoes",
                     "vars.attrs": { "color": "Gray" },
                     "category": /^\/Shoes\/Women\//,
                     "price": { "$gte": 65.99, "$lte": 180.99 } } )
58. Search – Using MongoDB indexing
• The following indices are used:
  – department + attr + category + _id
  – department + vars.attrs + category + _id
  – department + category + _id
  – department + price + _id
  – department + rating + _id
• _id is used for pagination
• Can take advantage of index intersection
• With several attributes specified (e.g. color=red and size=6), which one is looked up?
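Ending every index with _id gives each listing a deterministic order, which is what pagination needs. One common way to use it, sketched here in plain JavaScript, is to fetch the next page by filtering on the last document already shown rather than by skipping. The function name nextPageFilter is hypothetical and the slides do not say that this exact technique was used; the department and rating fields are the ones from the queries above.

```javascript
// Build a MongoDB-style filter for the page that follows `last`, assuming the
// listing is sorted by { [sortField]: 1, _id: 1 }. Documents that tie on
// sortField are ordered by _id, so no item is repeated or skipped.
function nextPageFilter(baseFilter, sortField, last) {
  return {
    $and: [
      baseFilter,
      {
        $or: [
          // strictly after the last value of the sort field
          { [sortField]: { $gt: last[sortField] } },
          // same sort value, but a larger _id
          { [sortField]: last[sortField], _id: { $gt: last._id } },
        ],
      },
    ],
  };
}

// Last summary shown on the current page (made-up values).
const lastShown = { _id: "p301671", rating: 4 };

const pageFilter = nextPageFilter({ department: "Shoes" }, "rating", lastShown);
console.log(JSON.stringify(pageFilter, null, 2));
```

The resulting object would be passed to a find() sorted by rating and then _id with a limit equal to the page size, which lines up with the department + rating + _id index listed above.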
59. Search – Using MongoDB indexing
Facet samples:
{ "_id" : "Accessory Type=Hosiery" , "count" : 14}
{ "_id" : "Ladder Material=Steel" , "count" : 2}
{ "_id" : "Gold Karat=14k" , "count" : 10138}
{ "_id" : "Stone Color=Clear" , "count" : 1648}
{ "_id" : "Metal=White gold" , "count" : 10852}
Single operation to insert / update (upsert = true, multi = false):
db.facet.update( { _id: "Accessory Type=Hosiery" },
                 { $inc: { count: 1 } }, true, false )
The facet with the lowest count is the most restrictive…
It should come first in the $all query!
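The ordering rule can be sketched in a few lines of plain JavaScript: look up the count of each requested facet value and sort ascending before building the $all clause. The function name buildAllFilter is hypothetical, the counts are the samples shown above, and the target field name attrs is an assumption borrowed from the summary model.

```javascript
// Order the requested facet values so that the one with the lowest count
// (the most restrictive) comes first, then build an $all filter from them.
// Values with no known count are placed last.
function buildAllFilter(requested, facetCounts) {
  const countOf = (key) =>
    facetCounts.has(key) ? facetCounts.get(key) : Number.MAX_SAFE_INTEGER;
  const ordered = [...requested].sort((a, b) => countOf(a) - countOf(b));
  return { attrs: { $all: ordered } };
}

// Counts as they would be read from the facet collection (sample values).
const facetCounts = new Map([
  ["Gold Karat=14k", 10138],
  ["Stone Color=Clear", 1648],
  ["Metal=White gold", 10852],
]);

const allFilter = buildAllFilter(
  ["Metal=White gold", "Gold Karat=14k", "Stone Color=Clear"],
  facetCounts
);
console.log(JSON.stringify(allFilter));
```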
60. Search – Comparing Solutions
• Search engine advantages:
  – Index size (~10x smaller than MongoDB's)
  – Indexing speed
  – Read speed, integrated cache
  – Support for all languages
  – Built-in faceted search, which includes facet counts
• MongoDB indexing advantages:
  – Built into the data store, no additional server / software needed
  – Single query to get the results
  – Can filter down the variant entry and save computing
> Winner here is Elastic Search