Learn how Gilt uses Solr to incorporate real-time inventory changes and targeting.
Note that this is a slight reworking, for Lucene Revolution 2013, of a presentation originally given at ApacheCon 2012.
12. The Gilt noon ‘attack of self denial’
Email & Mobile notification of today’s sales goes to members at 11:45 EST...
... Sales go live at Noon EST.
Stampede.
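15. Gilt Data Model 101
[Diagram: Sale, Product, Look and SKU entities joined by relationships (multiplicity markers as extracted: ** * * 1 1). Example: Product "Silk Charmeuse Wrap Dress", Look "... in white", SKU "... in size 8".]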
A Haiku:
Simple relations,
Solr makes hard; weeping, I
denormalize now.
Ade, Sinsheim, 2012
16. QED Architecture Pattern: Query, Enrich, Deliver
[Diagram components: Browser; web-search (<<play>> controllers); SearchAPI; Solr (plugins & components); Inventory Service; Product Service; Targeting Service. Labelled flows: search response: (lookId, skuId*)* as JSON; HTML/JSON; enrich; client-side rendering.]
Motivating principle: “Do One Thing Well”
Note: search API for isolation, plugins to perform personalization & inventory sorting, hydration of results.
18. Search space: what to index?
Brand, Product Name
Price
Color
Inventory Status
Description
Taxonomy
SKU Attributes / Material / Origin
19. Data, data, data gonna hate ya, hate ya, hate ya
Not all product data is as clean as you’d like.
Description contains distracting tokens (‘also available in blue, black (patent), green and leather’)
Colors are often poorly named: ‘blk’, ‘black’, ‘priest’, ‘black / white’, ‘nite’, ‘night’, ‘100’, ‘multi’, ‘patent’.
Brand names can mislead: ‘Thomas Pink Shirts’, ‘White + Warren Black Dress’
Trust nothing. Try everything. Be ready for surprises...
20. Example: synonym leakage
Naive and excited, we configured the Princeton Synonym database with Solr :)
Search for “black dress” yields “Jet Set Tote” :(
Learning: don’t blindly rely on synonyms.
huh? Doh! (black → ‘jet black’ → jet) (dress → set)
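The leakage chain on this slide can be reproduced with a toy query expander. This is only an illustrative sketch: the `SYNONYMS` table holds just the two mappings named on the slide, and `expand_terms` and `matches` are hypothetical helpers, not Solr's synonym filter.

```python
# Hypothetical synonym table holding only the two mappings from the slide.
SYNONYMS = {"black": {"jet black"}, "dress": {"set"}}

def expand_terms(query, synonyms):
    """Expand each query word with its synonyms, tokenising multi-word ones."""
    expanded = set()
    for word in query.lower().split():
        expanded.add(word)
        for phrase in synonyms.get(word, ()):
            # A multi-word synonym such as 'jet black' contributes 'jet' too.
            expanded.update(phrase.split())
    return expanded

def matches(title, terms):
    """Count how many expanded terms occur in a product title."""
    return len(terms & set(title.lower().split()))

terms = expand_terms("black dress", SYNONYMS)
hits = matches("Jet Set Tote", terms)  # matches on 'jet' and 'set'
```

A tote bag matches two expanded terms even though it shares no word with the original query, which is the surprise the slide describes.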
21. Index by looks or by SKU?
Turmoil: we list looks, but filter by SKU attributes (e.g. size).
Bad idea: index looks, with size as a multi-value field:
product_look_sizes = "S", "M", "L", "XL"
... filtering with product_look_sizes:M might return a product that has inventory for S, L, XL but no inventory for "M". Really Bad Experience.
Better idea: denormalize. Index by SKU with a single-valued size field:
sku_size = "M"
... now, when someone filters by "M", they only get SKUs in medium. Bazinga!
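The difference between the two designs can be seen with a small in-memory stand-in. This is a sketch under stated assumptions: the documents, the `in_stock` flag and both filter functions are hypothetical and only mimic the matching behaviour described above; they are not Solr calls.

```python
# Design 1 (hypothetical data): one document per look, sizes multi-valued.
look_docs = [
    {"look_id": "dress-white", "product_look_sizes": ["S", "M", "L", "XL"]},
]

# Design 2 (hypothetical data): one document per SKU, size single-valued,
# so size and stock level live on the same document.
sku_docs = [
    {"sku_id": 1, "look_id": "dress-white", "sku_size": "S", "in_stock": True},
    {"sku_id": 2, "look_id": "dress-white", "sku_size": "M", "in_stock": False},
    {"sku_id": 3, "look_id": "dress-white", "sku_size": "L", "in_stock": True},
]

def filter_looks_by_size(docs, size):
    """Multi-valued match: any listed size qualifies the whole look."""
    return [d["look_id"] for d in docs if size in d["product_look_sizes"]]

def filter_skus_by_size(docs, size):
    """Single-valued match: size and stock are judged on the same SKU."""
    return [d["sku_id"] for d in docs if d["sku_size"] == size and d["in_stock"]]
```

Filtering the look documents by "M" still returns the dress, while the SKU documents correctly return nothing because the medium is sold out.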
22. Index by looks or by SKU? (cont’)
Problem: But we list looks, not SKUs!! The last thing a customer wants is to see 8 pictures of the same dress in 8 different sizes
Solution: Group by 'product look ID': works a treat!
<str name="group">true</str>
<str name="group.field">sku_look_id_s</str>
<str name="group.limit">-1</str>
<str name="group.ngroups">true</str>
Initial worries about performance are (so far) unfounded. “Premature optimization is the root of all evil.”
We -may- change to block-join queries when we move to Solr 4.
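The slide sets the grouping options as configuration defaults. As a rough sketch, the same standard Solr result-grouping parameters could also be sent per request; the function name and the example query text below are assumptions for illustration.

```python
from urllib.parse import urlencode, parse_qs

def grouped_query_params(user_query):
    """Build request parameters that group SKU documents by look."""
    return {
        "q": user_query,
        "group": "true",                 # turn on result grouping
        "group.field": "sku_look_id_s",  # one group per product look
        "group.limit": "-1",             # return every SKU in each group
        "group.ngroups": "true",         # report the number of looks matched
    }

# Encoded form, ready to append to a select URL (the URL itself is not shown).
query_string = urlencode(grouped_query_params("black dress"))
```

Each group then corresponds to one look with all of its matching SKUs, which lines up with the (lookId, skuId*)* response shape used in the architecture slides.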
23. Solr Internals & Extensions I
[Diagram: Solr pipeline; labels: request, INDEX (index contains SKUs), group; output is search response: (lookId, skuId*)* as JSON.]
25. Real-time inventory ordering
Crucial: sold-out looks should move to bottom of list; i.e. sort by ‘inventory status’.
Given: an asynchronously updated caching 'inventory status' API.
InventoryStatus status = getInventory(sku)
... we created a custom inventory() function in Solr that assigns a numeric value (0/1) to each SKU in a result [O(n)]
Queries can now be sorted using:
order by inventory(sku), score
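The ordering rule can be sketched outside Solr as a two-key sort. This is a minimal illustration, assuming that in-stock SKUs score 1 and sort first and that ties break on descending relevance; the `inventory_status` dictionary stands in for the cached inventory API and is hypothetical.

```python
def inventory(sku_id, inventory_status):
    """Map a SKU to 1 if in stock, else 0 (unknown SKUs count as sold out)."""
    return 1 if inventory_status.get(sku_id, False) else 0

def sort_by_inventory_then_score(results, inventory_status):
    """Order in-stock results first, then by descending relevance score."""
    return sorted(
        results,
        key=lambda r: (-inventory(r["sku_id"], inventory_status), -r["score"]),
    )

# Hypothetical results and cached stock levels.
results = [
    {"sku_id": "a", "score": 9.0},
    {"sku_id": "b", "score": 5.0},
    {"sku_id": "c", "score": 7.0},
]
status = {"a": False, "b": True, "c": True}
ordered = [r["sku_id"] for r in sort_by_inventory_then_score(results, status)]
```

The sold-out SKU "a" drops to the bottom despite having the highest score, and each result needs one status lookup, in keeping with the O(n) note on the slide.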
26. Solr Internals & Extensions II
[Diagram: Solr pipeline; labels: request, INDEX, group, sort, inventory function, Inventory Service; output is search response: (lookId, skuId*)* as JSON.]
28. Targeting
Sale-targeting is an important tool for Gilt merch:
Groups (Noir members, employees, ...), geo-IP, weather, inclusion/exclusion lists
Search listings and auto-suggest must respect sale targeting.
Solution:
Store, in the index, the saleId of the sale in which the SKU is available
Pass the user’s GUID to Solr
Filter results using a custom filter
if (!allowed(saleId, userGuid)) {
  // remove result from listing
}
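The filtering step might look roughly like the following. This is a sketch only: the `sale_audiences` lookup stands in for the Targeting Service and is hypothetical, as is the rule that a sale with no audience entry is open to everyone.

```python
def allowed(sale_id, user_guid, sale_audiences):
    """Return True if the sale is untargeted or the user is in its audience."""
    audience = sale_audiences.get(sale_id)  # None means untargeted (assumption)
    return audience is None or user_guid in audience

def filter_targeted(results, user_guid, sale_audiences):
    """Drop results whose sale the user is not allowed to see."""
    return [r for r in results if allowed(r["sale_id"], user_guid, sale_audiences)]

# Hypothetical data: sale 1 is targeted at one member, sale 2 is open.
sale_audiences = {1: {"guid-noir"}}
results = [{"sku_id": "a", "sale_id": 1}, {"sku_id": "b", "sale_id": 2}]
visible_to_noir = filter_targeted(results, "guid-noir", sale_audiences)
visible_to_other = filter_targeted(results, "guid-other", sale_audiences)
```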
29. Solr Internals & Extensions III
[Diagram: Solr pipeline; labels: request, INDEX, group, sort, inventory function, Inventory Service, targeting filter, Targeting Service; output is search response: (lookId, skuId*)* as JSON.]
31. Our index is time-sensitive
Sales go live at Noon EST:
Do not want products to be searchable if the sale is not yet active.
Do want products to be searchable at exactly Noon EST
We index Solr every n minutes (n == 15)
Need to encode a sense of time in the index.
32. Time-sensitive search
1st idea: index a SKU's start/end availability and then filter on that. Use Solr’s ‘dynamic fields’:
for each store (men / women / home / kids / ...) {
  // get the nearest upcoming availability window & index it.
  sku_availability_start_<store> = startTime
  sku_availability_end_<store> = endTime
  sku_availability_sale_<store> = saleId
}
Then, naively filter using (see next slide for improvement):
fq=+sku_availability_start_<store>:[* TO NOW] +sku_availability_end_<store>:[NOW+1HOUR TO *]
33. All praise and thanks to the chump.
Filtering by time for every request is expensive :(
Posed question to The Chump at Lucene Revolution 2012.
Solution: rounding time bounds to nearest hour / minute means we get a cached query filter.
[* TO NOW] → [* TO NOW/HOUR]
Super Chump. Chump is Champ. Chump-tastic. Chump-nominal. Awe-chump.
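Why rounding helps can be shown by rendering the filter with an explicit timestamp, which is roughly what NOW resolves to on each request. A sketch, assuming the filter text acts as the cache key; the helper names and the ISO timestamp format are illustrative and not taken from the slides.

```python
from datetime import datetime, timezone

def round_down_to_hour(moment):
    """Truncate a timestamp to the start of its hour, like NOW/HOUR."""
    return moment.replace(minute=0, second=0, microsecond=0)

def availability_filter(moment, rounded=True):
    """Render a range filter with an explicit timestamp as the upper bound."""
    bound = round_down_to_hour(moment) if rounded else moment
    stamp = bound.strftime("%Y-%m-%dT%H:%M:%SZ")
    return "sku_availability_start:[* TO %s]" % stamp

# Two requests arriving within the same hour.
first = datetime(2013, 5, 1, 16, 5, 12, tzinfo=timezone.utc)
second = datetime(2013, 5, 1, 16, 48, 3, tzinfo=timezone.utc)
```

With rounding, both requests produce the same filter text, so a cached entry can be reused for the rest of the hour. Without rounding, every request yields a different filter and nothing is shared.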
34. Time-sensitive search (cont’)
e.g. Consider a SKU available in women & home:
sku_availability_start_home = 32141232 // Time
sku_availability_end_home = 32161232 // Time
sku_availability_sale_home = 12121221 // SaleID
sku_availability_start_women = 32141232 // Time
sku_availability_end_women = 32161232 // Time
sku_availability_sale_women = 223423424 // SaleID
Now, can search in Home store using:
fq=+sku_availability_start_home:[* TO NOW/HOUR] +sku_availability_end_home:[NOW/HOUR+1HOUR TO *]
35. Problems in the space-time continuum (cont’)
[Timeline diagram: Sale A (targeted to ‘Noir’), Sale B (untargeted), Sale C (weekend specials); annotations: ‘Best Availability Window’, Danger Zone]
Problem: SKUs can be active in more than one sale, but our schema doesn’t allow for this.
Heuristic: “Pick earliest active window, or soonest starting window”.
Problems: might choose a targeted sale over an open sale and deny customers the chance to buy.
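The heuristic and its failure mode can be sketched in a few lines. The window dictionaries, the integer timestamps and the `targeted` flag are hypothetical; only the selection rule comes from the slide.

```python
def best_availability_window(windows, now):
    """Pick the earliest active window, else the soonest upcoming one."""
    active = [w for w in windows if w["start"] <= now < w["end"]]
    if active:
        return min(active, key=lambda w: w["start"])
    upcoming = [w for w in windows if w["start"] > now]
    return min(upcoming, key=lambda w: w["start"]) if upcoming else None

# Hypothetical overlap: targeted Sale A starts before untargeted Sale B.
windows = [
    {"sale": "A", "start": 10, "end": 50, "targeted": True},
    {"sale": "B", "start": 20, "end": 60, "targeted": False},
]
chosen = best_availability_window(windows, now=30)
```

While both sales are live the heuristic picks Sale A, so a member outside the targeted group would not see the SKU even though the open Sale B also carries it.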
36. Problems in the space-time continuum (cont’)
Further: some sales are restricted to the sale’s channel: Mobile, iPad, ...
Our 'best availability window' code was already creaking; introducing more dynamic fields (channel x store) stank.
sku_availability_start_web_home = 32141232 // Time
sku_availability_end_web_home = 32161232 // Time
sku_availability_sale_web_home = 12121221 // SaleID
sku_availability_start_ipad_home = 32141232 // Time
sku_availability_end_ipad_home = 32161232 // Time
sku_availability_sale_ipad_home = 12121221 // SaleID
sku_availability_start_mobile_home = 32141232 // Time
: : : :
37. Eureka!
Stop indexing SKUs; instead: index 'the availability of a SKU at a moment in time'
sku_name = ...
sku_color = ...
sku_availability_start =
sku_availability_end =
sku_store = home
sku_channels = mobile, ipad
sku_sale_id =
The filter becomes much simpler:
fq = +sku_availability_start:[* TO NOW/HOUR] +sku_availability_end:[NOW/HOUR+1HOUR TO *]
Code fell away. Just had to post-filter out multiple SKU availabilities in results [O(n)]
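The post-filter can be pictured as a single pass that keeps one availability document per SKU. A minimal sketch, assuming results arrive already ranked and carry a `sku_id` key (a hypothetical field name); keeping the first occurrence is also an assumption.

```python
def dedupe_by_sku(results):
    """Keep the first (best-ranked) availability document for each SKU."""
    seen = set()
    unique = []
    for doc in results:
        if doc["sku_id"] not in seen:
            seen.add(doc["sku_id"])
            unique.append(doc)
    return unique

# Hypothetical ranked results: SKU 7 is available through two sales.
ranked = [
    {"sku_id": 7, "sku_sale_id": 100},
    {"sku_id": 9, "sku_sale_id": 100},
    {"sku_id": 7, "sku_sale_id": 200},
]
deduped = dedupe_by_sku(ranked)
```

Each document is visited once and set membership is constant time on average, which fits the O(n) cost noted on the slide.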
38. Solr Internals & Extensions IV
[Diagram: Solr pipeline; labels: request, availability-component, INDEX, group, sort, inventory function, Inventory Service, targeting filter, Targeting Service, de-dupe filter; output is search response: (lookId, skuId*)* as JSON.]
40. Mixed-logic multi-select faceting
Facet filters typically use 'AND' logic
size = 36 AND brand = 'Timber Island' AND color = 'RED'
Prefer mixed logic, where 'OR' is used for filters on the same facet.
(size = 36 OR size = 38) AND brand = 'Timber Island' AND (color = 'RED' OR color = 'BLUE')
Use ‘multi-select’ faceting technique to ensure that facet values don’t disappear.
Tag the facet filter-query:
fq:{!tag=brand_fq}brand_facet:"Timber Island"
Ignore the effects of the facet filter-query when faceting:
<str name="facet.field">{!ex=brand_fq}brand_facet</str>
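One way to get OR within a facet and AND across facets is to emit one tagged filter query per facet, since separate fq parameters are intersected. A sketch, assuming every facet follows the `<name>_facet` / `<name>_fq` naming seen for brand on the slide; `facet_filter_queries` is a hypothetical helper and does no escaping of quotes inside values.

```python
def facet_filter_queries(selections):
    """Build one tagged fq per facet; values within a facet are OR-ed."""
    fqs = []
    for facet, values in selections.items():
        if not values:
            continue  # nothing selected for this facet
        quoted = " OR ".join('"%s"' % v for v in values)
        fqs.append("{!tag=%s_fq}%s_facet:(%s)" % (facet, facet, quoted))
    return fqs

# Hypothetical user selections: two sizes, one brand, no colors.
fqs = facet_filter_queries(
    {"size": ["36", "38"], "brand": ["Timber Island"], "color": []}
)
```

The tags let each facet.field exclude its own filter with {!ex=...}, so the other sizes stay visible after a size has been picked.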
41. ‘Disappearing facet problem’
To get facets relevant to the query, we set facet.mincount to 1.
<str name="facet.mincount">1</str>
However: if subsequent facet filters reduce the facet count to zero, then the facets 'disappear' :(
Consider a search for 'shoes', returning the following facets
brand: Ade -> 10, Eric -> 5
color: black -> 8, red -> 7
Then filter on red (and assume that only Ade shoes are red). We want to have:
brands: Ade -> 7, Eric -> 0
colors: black -> 0, red -> 7
However, if facet.mincount == 1, we get:
brands: Ade -> 7
colors: red -> 7
42. ‘Disappearing facet problem’ (cont’)
We want to say 'Eric and black are relevant to the query, but have zero results due to filtering'.
Solution: for each facet, overlay the original counts with the values of the new count if present, or zero otherwise.
<str name="facet.field">{!ex=brand_fq,size_fq,taxonomy_fq,color_fq key=all_brand_facets}brand_facet</str>
<str name="facet.field">{!ex=brand_fq,size_fq,taxonomy_fq,color_fq key=all_color_facets}color_facet</str>
This means we get:
all_brand_facets: Ade -> 10, Eric -> 5
brand_facet: Ade -> 7
all_color_facets: black -> 8, red -> 7
color_facet: red -> 7
Merge to get:
brand_facet: Ade -> 7, Eric -> 0
color_facet: red -> 7, black -> 0
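The merge step amounts to a dictionary overlay. A minimal sketch using the counts from the slide; `overlay_counts` is a hypothetical name for logic that would live in the search API.

```python
def overlay_counts(all_counts, filtered_counts):
    """Keep every originally relevant value; use the filtered count or zero."""
    return {value: filtered_counts.get(value, 0) for value in all_counts}

# Counts taken from the slide's 'shoes' example after filtering on red.
brand_facet = overlay_counts({"Ade": 10, "Eric": 5}, {"Ade": 7})
color_facet = overlay_counts({"black": 8, "red": 7}, {"red": 7})
```

Values absent from the filtered counts come back with a count of zero, so Eric and black stay visible in the facet list instead of disappearing.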
43. Color Me Happy
Need to map SKU's color to a simple set of colors for faceting.
Used a synonym file to map 'urobilin' to 'yellow'
Use a stopwords file to remove unknown colors
It works, but it's brittle; want to move to a solution based on color analysis of swatches
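The mapping can be approximated in plain code as a synonym lookup followed by a check against the facet palette. This is a sketch, not the Solr analysis chain: it uses an allow-list where the slide uses a stopwords file, and every entry other than 'urobilin' -> 'yellow' is an assumed example.

```python
# Only 'urobilin' -> 'yellow' comes from the slide; the rest are assumptions.
COLOR_SYNONYMS = {"urobilin": "yellow", "blk": "black", "nite": "black"}
FACET_COLORS = {"black", "white", "red", "blue", "green", "yellow"}

def facet_color(raw):
    """Normalise a raw SKU color to a facet color, or None if unknown."""
    token = raw.strip().lower()
    token = COLOR_SYNONYMS.get(token, token)
    return token if token in FACET_COLORS else None
```

For example, `facet_color("Urobilin")` gives "yellow" while `facet_color("100")` gives None. The brittleness is visible here too: every new vendor spelling needs another table entry.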
44. Hierarchical taxonomy faceting
Our taxonomy is hierarchical. Solr faceting is not :(
Encoded taxonomy hierarchy using 'paths'
"Gilt::Men::Clothing::Pants"
Our search API converts the paths into a tree for rendering purposes.
Works but (a) feels jenky (b) ordering is alphabetic
Prefer in future to use unique taxonomy key, and use that to filter through an ordered tree in our API
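Turning '::'-separated paths into a tree is a short fold over nested dictionaries. A sketch of the idea only; the second path in the example is hypothetical and the real search API's tree type is not described in the slides.

```python
def paths_to_tree(paths, separator="::"):
    """Fold facet paths into a nested dict keyed by taxonomy node name."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split(separator):
            # Create the child on first sight, then descend into it.
            node = node.setdefault(part, {})
    return tree

# First path is from the slide; the second is a hypothetical sibling.
tree = paths_to_tree(["Gilt::Men::Clothing::Pants", "Gilt::Men::Shoes"])
```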
45. Hierarchical size faceting
Size ordering is non-trivial: there is a two-level hierarchy & ordering is difficult:
00, 0, 2, 4, 6, 8, 10, 12 rather than 0, 00, 10, 12, 2, 4, 6, 8
XS, S, M, L, XL, XXL rather than M, L, S, XL, XS, XXL
End up encoding a size ordinal into the size facet label
"Women's Apparel::[[0000000003]]00"
"Women's Apparel::[[0000000004]]0"
"Women's Apparel::[[0000000005]]2"
OK, but again, feels hacky.
“If only there was a way to annotate facets with meaningful data”
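Decoding the labels on the client side might look like this. A sketch assuming the exact label layout shown above, group::[[ordinal]]size; the function names are hypothetical.

```python
import re

# Matches labels of the form  <group>::[[<zero-padded ordinal>]]<size>
LABEL_PATTERN = re.compile(r"^(?P<group>.*)::\[\[(?P<ordinal>\d+)\]\](?P<size>.*)$")

def parse_size_label(label):
    """Split a facet label into (group, ordinal, display size)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError("unrecognised size label: %r" % label)
    return match.group("group"), int(match.group("ordinal")), match.group("size")

def ordered_sizes(labels):
    """Sort by group, then ordinal, and return (group, size) pairs."""
    parsed = sorted(parse_size_label(label) for label in labels)
    return [(group, size) for group, _ordinal, size in parsed]

labels = [
    "Women's Apparel::[[0000000005]]2",
    "Women's Apparel::[[0000000003]]00",
    "Women's Apparel::[[0000000004]]0",
]
sizes = ordered_sizes(labels)
```

Because the ordinal is zero-padded, a plain lexicographic sort of the raw labels gives the same order, which is presumably why the padding is there.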
46. Solr Internals & Extensions (V)
[Diagram: Solr pipeline; labels: request, availability-component, INDEX, group, sort, inventory function, Inventory Service, targeting filter, Targeting Service, de-dupe filter, facet; output is search response: (lookId, skuId*)* as JSON.]
48. Lesson II: Solr makes the business Love You
Gilt Search released in A/B test to 30%.
Initial deployment can handle > 180,000 RPM (5:1 ratio of auto-complete to search)
Members whose visit included search are 4x more likely to buy.
Search generates 2-4% incremental revenue
Business pleads with us to end A/B test and release to 100%
We've iterated to drive more traffic (& succeeded) with no loss of conversion.
From subtle to suBtle.
49. Lesson III: Excelsior
Power all listings (keyword, category, sale, brand) via Solr
Faceting on SKU attributes (e.g. ‘age’, ‘gender’)
Solr 4
Hack Debridement:
Improve hierarchical faceting & facet ordering
‘Double facet’ feels wrong
50. CONFERENCE PARTY
The Tipsy Crow: 770 5th Ave
Starts after Stump The Chump
Your conference badge gets
you in the door
TOMORROW
Breakfast starts at 7:30
Keynotes start at 8:30
ME:
Adrian Trenaman
atrenaman@gilt.com
http://gilt.com/lucene-revolution-2013