Finding the right car is a complex decision for every buyer. To build a more engaging shopping experience for our customers, Cars.com chose Couchbase as our NoSQL database solution and delivered the first projects into production in 2016. We will share the story of our decision to adopt a NoSQL solution, our vendor selection process, the steps we took to deploy the platform and release the first projects, lessons learned, and how we continue to deliver great customer experiences.
Good morning! My name is Jason Williams, and I am an Enterprise Architect at Cars.com.
I’m here today to share some highlights of our Couchbase experience with you.
From an EA perspective, Cars.com is a data processing engine.
We acquire a variety of data from a variety of sources. A primary data entity in the system is our vehicle listings, but we acquire (or create) many other types of data, such as Service & Repair data or our Editorial Content.
All of the data we acquire must be stored somewhere, and from that storage platform it is distributed to consumers via a few different channels: Cars.com, mobile app, data syndication partners.
Couchbase plays an increasingly important role in our platform – because in order to achieve our goals, we needed an inherently more flexible data platform that was easier to scale than our existing traditional databases.
Transition to Bruce
I'm on the Architecture Team at Cars.com with Jason. I also participated in the vendor selection and then I led the rollout of Couchbase at Cars. I'm going to talk a little bit about our first use cases...
We have a number of use cases that are a good fit for NoSQL and Couchbase that we're planning to do in the future, but we had some specific goals for the first ones that we selected. Unless your company is a small startup, there's quite a bit of planning and work that comes with rolling out a new database platform.
We wanted our first use cases to give us the opportunity to operationalize the platform with minimal risk and we wanted to provide some visible business value.
We also wanted to establish some standards and patterns to accelerate future development, and we wanted to create enthusiasm among developers.
Our first use case came out of an idea for a hackathon competition at Cars submitted by my co-worker Krishna.
The idea was called Vehicle Listing Demand Metrics and it was developed by one of the hackathon teams over several Fridays and they won the competition.
The Cars.com website has a Vehicle Detail Page which we call the VDP.
The VDP shows price and all the other detail for a vehicle that's for sale.
The idea was to add activity metrics to our VDP page to present to shoppers.
Of course, Cars tracks daily website activity for analytics purposes but in this case we wanted to present the data to shoppers updated in real-time. So we added counts for page views, saves, and emails.
The hope is that this would create some urgency to encourage shoppers to select a vehicle and contact the dealer.
We have an internal Activity Logging API which streams data to Kafka, and from there it goes to our Data Warehouse for analytics.
Our Activity Logging process is able to flag activity that is suspected of coming from bots or scrapers. We created a new Spark process to aggregate the data that we needed and write it to Couchbase in mini batches.
Kafka and Spark give us the capability to buffer the data and also to reject it when it's been identified as coming from a bot or scraper.
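To make the aggregation step concrete, here is a minimal sketch in plain Python of what one mini batch might do: drop flagged events, then count the rest per listing and metric. It is not the actual Spark job; the `ActivityEvent` fields, the metric names, and the listing IDs are assumptions made up for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ActivityEvent(NamedTuple):
    listing_id: str      # hypothetical vehicle listing identifier
    metric: str          # assumed values: "view", "save", "email"
    suspected_bot: bool  # flag assumed to be set upstream by activity logging


def aggregate_mini_batch(events: Iterable[ActivityEvent]) -> Counter:
    """Count events per (listing_id, metric), skipping suspected bot traffic."""
    deltas: Counter = Counter()
    for event in events:
        if event.suspected_bot:
            continue  # reject bot or scraper activity
        deltas[(event.listing_id, event.metric)] += 1
    return deltas


batch = [
    ActivityEvent("listing-1", "view", False),
    ActivityEvent("listing-1", "view", False),
    ActivityEvent("listing-1", "save", False),
    ActivityEvent("listing-2", "view", True),  # flagged, so it is dropped
]
deltas = aggregate_mini_batch(batch)
```

Each entry in `deltas` is an increment to apply to a stored counter, rather than a new total.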
We used Couchbase atomic counters to store the metrics. This allowed us to store metrics for millions of vehicles using very little memory. It also allows us to resume the Spark process without losing our total counts if we need to stop it for some reason or if it crashes.
We set a time to live on our counter documents so they'll be deleted automatically after we no longer need them. This eliminates the need to have a separate batch process to purge the documents.
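The counter and time-to-live behavior can be illustrated with a small in-memory stand-in. This is only a sketch of the semantics described above, not the Couchbase SDK: the `CounterStore` class, its method names, and the key format are all hypothetical, and a real cluster performs the increment and the expiry on the server side.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class CounterStore:
    """In-memory stand-in for a key-value store with counters and TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (value, expires_at)

    def increment(self, key: str, delta: int, ttl_seconds: float) -> int:
        """Add delta to the counter; create it (with a TTL) if missing or expired."""
        with self._lock:
            now = self._clock()
            entry = self._data.get(key)
            if entry is None or entry[1] <= now:
                entry = (0, now + ttl_seconds)  # TTL is set when the counter is created
            value = entry[0] + delta
            self._data[key] = (value, entry[1])
            return value

    def get(self, key: str) -> Optional[int]:
        """Return the current count, or None if the counter is missing or expired."""
        with self._lock:
            entry = self._data.get(key)
            if entry is None or entry[1] <= self._clock():
                self._data.pop(key, None)  # expired counters disappear on their own
                return None
            return entry[0]


now = [0.0]  # fake clock so the example does not need to sleep
store = CounterStore(clock=lambda: now[0])
store.increment("listing-1::view", 2, ttl_seconds=60)
total = store.increment("listing-1::view", 3, ttl_seconds=60)  # adds to the stored total
now[0] = 61.0
expired = store.get("listing-1::view")
```

Because each write is an increment against the stored value, a process that restarts simply keeps adding to the existing total, and expired counters need no separate purge job.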
This was a fairly simple use case on Couchbase since it didn't require any indexes or N1QL queries, but it gave us the opportunity to get our feet wet and get Couchbase deployed.
Our next use case internally we called Unpacking the Price.
Our mission for this was to increase consumer confidence in the vehicle prices listed on Cars.com.
(The picture is from our confidence commercial.)
Our VDP page displays all of the features available on a vehicle listed for sale but there's no easy way for the shopper to determine how those features impact the price of the vehicle.
Vehicles that appear similar could have significant price differences because of the features included.
Dealers don't provide feature pricing information so we decided to purchase manufacturer pricing data and present that information as a benefit to our shoppers.
We wanted to present features to shoppers in a way that provides price transparency.
As part of our Vehicle Listing Acquisition Pipeline we retrieve the manufacturer data which is provided in the form of a complex SOAP XML document from our Partner’s API. We convert the XML to a JSON format to store it in Couchbase. The documents contain a lot of information about each vehicle and are about 45 KB in size.
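As a rough illustration of the XML-to-JSON step, the sketch below converts a tiny XML document into a nested dictionary and serializes it as JSON using only the Python standard library. The element names, attributes, and values are invented for the example; the real partner documents are far larger and their schema is not shown in this talk.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the partner's XML payload.
SAMPLE_XML = """
<vehicle vin="HYPOTHETICAL-VIN">
  <make>ExampleMake</make>
  <features>
    <feature code="F1"><name>Sunroof</name><msrp>900</msrp></feature>
    <feature code="F2"><name>Navigation</name><msrp>1200</msrp></feature>
  </features>
</vehicle>
"""


def element_to_dict(elem):
    """Convert an element to a dict; leaf text becomes a string, repeated tags a list."""
    node = dict(elem.attrib)
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        return text if not node else {**node, "value": text}
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = [existing]  # second occurrence: switch to a list
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


document = element_to_dict(ET.fromstring(SAMPLE_XML))
doc_json = json.dumps(document)  # the string that would be stored as the document body
```

All values stay as strings here; a real conversion would probably also map prices to numbers and handle XML namespaces.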
Using N1QL we’re able to query the JSON and retrieve only the information that we need. In this case we only needed pricing data for the vehicle features.
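To show what "retrieve only the information that we need" means in practice, here is a sketch with a query along these lines and a plain Python function that performs the same projection on a document already in memory. The bucket name `vehicles`, the field names, and the document shape are assumptions for illustration, not the actual Cars.com schema or query.

```python
# Hypothetical N1QL statement: look up one document by key and return
# only the name and price of each feature. All names are assumptions.
FEATURE_PRICE_QUERY = (
    "SELECT f.name, f.msrp "
    "FROM vehicles AS v USE KEYS $key "
    "UNNEST v.features.feature AS f"
)


def feature_prices(document: dict) -> list:
    """Local equivalent of the projection: keep only feature names and prices."""
    features = document.get("features", {}).get("feature", [])
    return [{"name": f["name"], "msrp": f["msrp"]} for f in features]


# Made-up document with extra fields that the projection should ignore.
sample_document = {
    "vin": "HYPOTHETICAL-VIN",
    "make": "ExampleMake",
    "features": {
        "feature": [
            {"code": "F1", "name": "Sunroof", "msrp": "900"},
            {"code": "F2", "name": "Navigation", "msrp": "1200"},
        ]
    },
}
prices = feature_prices(sample_document)
```

The point of the projection is that a page needing only feature prices does not have to pull the whole 45 KB document across the network.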
Prior to using Couchbase we would have stored the data in our database and then cached it in a separate distributed cache. Since Couchbase does both functions it allowed us to simplify our architecture.
Again this was a fairly simple use case which only required a key lookup but we have many more use cases for this data. Having the ability to index and query the data in different ways later is one of the things we like about Couchbase.
We've been using Couchbase live in production for a few months now and we have plans to do a lot more next year. We put together a short list of the things that we liked.
We found that learning the basics and getting up and running with Couchbase was easy. During our vendor evaluation we were able to get a small three-node cluster up and running and get data in and out of it in under an hour. We also found the Couchbase forums were helpful when we needed them.
The automatic in-memory caching simplifies our architecture and design.
The single node type architecture made it possible for us to create a single Chef recipe to automate the build out of the VMs for our clusters.
Many developers at Cars have experience with SQL so they liked that N1QL uses syntax that they're already familiar with. It also provides us with many of the capabilities of a relational database if we need them.
The Couchbase SDK allows us flexibility to configure various options at the client level so we can tune for performance or reliability as needed for each use case.
Couchbase allows us the flexibility to start simple with a key value store without losing the ability to index and query our data differently later using N1QL.
Building a better car buying experience using NoSQL at Cars.com – Couchbase Connect New York 2017