CAVE - an overview 
Val Dumitrescu 
Paweł Raszewski
Another monitoring solution? 
At GILT: 
● OpenTSDB + Nagios 
● DataDog 
● NewRelic 
● Notifications with PagerDuty
What is CAVE? 
Continuous 
Audit 
Vault 
Enterprise
What is CAVE? 
A monitoring system that is: 
● secure 
● independent 
● proprietary 
● open source
Requirements 
● horizontally scalable to millions of metrics, alerts 
● multi-tenant, multi-user 
● extensible HTTP-based API 
● flexible metric definition 
● data aggregation / multiple dimensions 
● flexible and extensible alert grammar 
● pluggable notification delivery system 
● clean user interface for graphing and dashboarding
Architecture
Alert Grammar 
Metric has name and tags (key-value pairs) 
e.g. 
orders [shipTo: US] 
response-time [svc: svc-important, env: prod]
Alert Grammar 
Aggregated Metric has metric, aggregator and 
period of aggregation, e.g. 
orders [shipTo: US].sum.5m 
response-time [svc: svc-important, env: prod].p99.5m 
Supported aggregators: 
count, min, max, mean, mode, median, sum 
stddev, p99, p999, p95, p90
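A minimal sketch of how a few of these aggregators could be computed over the values collected in one aggregation period. The object name, the method signatures and the nearest-rank percentile definition are assumptions made for illustration; the slides do not say how CAVE computes them.

```scala
// Hypothetical helper, not CAVE's implementation.
object Aggregators {
  def count(xs: Seq[Double]): Int = xs.size

  def sum(xs: Seq[Double]): Double = xs.sum

  // None when the period contains no data points.
  def mean(xs: Seq[Double]): Option[Double] =
    if (xs.isEmpty) None else Some(xs.sum / xs.size)

  // Nearest-rank percentile (an assumed definition): the smallest value
  // such that at least p percent of the data points are at or below it.
  def percentile(xs: Seq[Double], p: Double): Option[Double] =
    if (xs.isEmpty) None
    else {
      val sorted = xs.sorted
      val rank = math.ceil(p * sorted.size / 100.0).toInt
      // Clamp the rank into the valid 1..size range before indexing.
      Some(sorted(math.min(math.max(rank, 1), sorted.size) - 1))
    }

  def p99(xs: Seq[Double]): Option[Double] = percentile(xs, 99.0)
}
```

For the values 1 to 100, `Aggregators.p99` returns `Some(99.0)` under this definition.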
Alert Grammar 
Alert Condition contains one expression with 
two terms and an operator. Each term is a 
metric, an aggregated metric or a value. 
e.g. 
orders [shipTo: US].sum.5m < 10 
orders [shipTo: US].sum.5m < ordersPredictedLow [shipTo: US]
Alert Grammar 
An optional number of times the threshold is 
broken, e.g. 
response-time [svc: svc-team, env: prod].p99.5m > 3000 at least 3 times
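A small sketch of how the "at least N times" clause might be evaluated. It assumes the count refers to the most recent consecutive evaluations, which the slides do not specify; the `Operator` objects and the `fires` function are hypothetical names.

```scala
// Hypothetical evaluation of a repeated threshold breach, not CAVE's code.
object ThresholdCheck {
  sealed trait Operator { def apply(left: Double, right: Double): Boolean }
  case object LessThan extends Operator {
    def apply(left: Double, right: Double): Boolean = left < right
  }
  case object GreaterThan extends Operator {
    def apply(left: Double, right: Double): Boolean = left > right
  }

  // True when the last `times` samples all break the threshold.
  // Assumption: breaches must be consecutive and the most recent ones.
  def fires(samples: Seq[Double], op: Operator, threshold: Double, times: Int): Boolean = {
    val recent = samples.takeRight(times)
    times > 0 && recent.size == times && recent.forall(v => op(v, threshold))
  }
}
```

With p99 samples of 2500, 3100, 3200 and 3300, the condition `> 3000 at least 3 times` fires; a single sample at or below 3000 among the last three would reset it.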
Alert Grammar 
Special format for missing data 
e.g. 
orders [shipTo: US] missing for 5m 
heartbeat [svc: svc-important, env: prod] missing for 10m
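One way a "missing for" condition could be checked is to compare the timestamp of the newest data point with the current time. This is an illustrative sketch only: `isMissing`, its parameters and the choice to treat a never-seen metric as missing are assumptions, not behaviour described in the slides.

```scala
import scala.concurrent.duration._

// Hypothetical missing-data check, not CAVE's implementation.
object MissingData {
  // lastSeenMillis: epoch millis of the newest data point, None if the
  // metric has never reported. A never-seen metric counts as missing here.
  def isMissing(lastSeenMillis: Option[Long],
                nowMillis: Long,
                window: FiniteDuration): Boolean =
    lastSeenMillis.forall(ts => nowMillis - ts >= window.toMillis)
}
```

For `heartbeat [...] missing for 10m`, `window` would be `10.minutes` and the alert would fire once no data point has arrived for that long.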
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  sealed trait Source
  case class ValueSource(value: Double) extends Source
  case class MetricSource(
    metric: String, tags: Map[String, String]) extends Source
  case class AggregatedSource(
    metricSource: MetricSource,
    aggregator: Aggregator, duration: FiniteDuration) extends Source

  sealed trait AlertEntity
  case class SimpleAlert(
    sourceLeft: Source, operator: Operator,
    sourceRight: Source, times: Int) extends AlertEntity
  case class MissingDataAlert(
    metricSource: MetricSource, duration: FiniteDuration) extends AlertEntity
  …
}
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  …
  def valueSource: Parser[ValueSource] = decimalNumber ^^ {
    case num => ValueSource(num.toDouble)
  }
  def word: Parser[String] = """[a-zA-Z][a-zA-Z0-9.-]*""".r
  def metricTag: Parser[(String, String)] = (word <~ ":") ~ word ^^ {
    case key ~ value => key -> value
  }
  def metricTags: Parser[Map[String, String]] = repsep(metricTag, ",") ^^ {
    case list => list.toMap
  }
  …
}
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  …
  def metricSourceWithTags: Parser[MetricSource] =
    (word <~ "[") ~ (metricTags <~ "]") ^^ {
      case metric ~ tagMap => MetricSource(metric, tagMap)
    }
  def metricSourceWithoutTags: Parser[MetricSource] = word ^^ {
    case metric => MetricSource(metric, Map.empty[String, String])
  }
  def metricSource = metricSourceWithTags | metricSourceWithoutTags
  …
}
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  …
  def duration: Parser[FiniteDuration] = wholeNumber ~ ("s"|"m"|"h"|"d") ^^ {
    case time ~ "s" => time.toInt.seconds
    case time ~ "m" => time.toInt.minutes
    case time ~ "h" => time.toInt.hours
    case time ~ "d" => time.toInt.days
  }
  def aggregatedSource: Parser[AggregatedSource] =
    (metricSource <~ ".") ~ (aggregator <~ ".") ~ duration ^^ {
      case met ~ agg ~ dur => AggregatedSource(met, agg, dur)
    }
  def anySource: Parser[Source] = valueSource | aggregatedSource | metricSource
  …
}
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  …
  def missingDataAlert: Parser[MissingDataAlert] =
    metricSource ~ "missing for" ~ duration ^^ {
      case source ~ _ ~ d => MissingDataAlert(source, d)
    }
  def simpleAlert: Parser[SimpleAlert] = anySource ~ operator ~ anySource ^^ {
    case left ~ op ~ right => SimpleAlert(left, op, right, 1)
  }
  def repeater: Parser[Int] = "at least" ~ wholeNumber ~ "times" ^^ {
    case _ ~ num ~ _ => num.toInt
  }
  def simpleAlertWithRepeater: Parser[SimpleAlert] =
    anySource ~ operator ~ anySource ~ repeater ^^ {
      case left ~ op ~ right ~ num => SimpleAlert(left, op, right, num)
    }
}
Alert Grammar 
trait AlertParser extends JavaTokenParsers {
  …
  def anyAlert: Parser[AlertEntity] =
    missingDataAlert | simpleAlertWithRepeater | simpleAlert
}
Usage:
class Something(conditionString: String) extends AlertParser {
  …
  parseAll(anyAlert, conditionString) match {
    case Success(SimpleAlert(left, op, right, times), _) => …
    case Success(MissingDataAlert(metric, duration), _) => …
    case Failure(message, _) => …
  }
}
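To make the mapping from alert strings to the parsed model concrete, the sketch below builds by hand the values the grammar would be expected to produce for two of the earlier examples. It does not use the parser itself. The `Aggregator` and `Operator` definitions are elided in the slides, so the ones here are hypothetical stand-ins, as is the `describe` helper.

```scala
import scala.concurrent.duration._

object AlertModelExample {
  // Hypothetical stand-ins for types the slides do not show.
  sealed trait Aggregator
  case object Sum extends Aggregator
  sealed trait Operator
  case object LessThan extends Operator

  // Model as shown on the slides.
  sealed trait Source
  case class ValueSource(value: Double) extends Source
  case class MetricSource(metric: String, tags: Map[String, String]) extends Source
  case class AggregatedSource(
    metricSource: MetricSource,
    aggregator: Aggregator, duration: FiniteDuration) extends Source

  sealed trait AlertEntity
  case class SimpleAlert(
    sourceLeft: Source, operator: Operator,
    sourceRight: Source, times: Int) extends AlertEntity
  case class MissingDataAlert(
    metricSource: MetricSource, duration: FiniteDuration) extends AlertEntity

  // orders [shipTo: US].sum.5m < 10
  val lowOrders: AlertEntity = SimpleAlert(
    AggregatedSource(MetricSource("orders", Map("shipTo" -> "US")), Sum, 5.minutes),
    LessThan, ValueSource(10), 1)

  // heartbeat [svc: svc-important, env: prod] missing for 10m
  val noHeartbeat: AlertEntity = MissingDataAlert(
    MetricSource("heartbeat", Map("svc" -> "svc-important", "env" -> "prod")),
    10.minutes)

  // Illustrative consumer: the sealed trait lets the compiler check
  // that both alert kinds are handled.
  def describe(alert: AlertEntity): String = alert match {
    case SimpleAlert(_, _, _, times) =>
      s"threshold alert, fires after $times breach(es)"
    case MissingDataAlert(m, d) =>
      s"missing-data alert on ${m.metric} after ${d.toMinutes}m"
  }
}
```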
Slick 
Functional Relational Mapping (FRM) 
library for Scala 
Slick <> Hibernate
Slick 
compile-time safety 
no need to write SQL 
full control over what is going on
Scala Collections API 
case class Person(id: Int, name: String)
val list = List(Person(1, "Pawel"),
                Person(2, "Val"),
                Person(3, "Unknown Name"))
list.filter(_.id > 1).map(_.name)
SELECT name FROM list WHERE id > 1
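The same collection query can also be written as a for-comprehension, the form the Slick join example uses later. The sketch below is plain Scala with no database involved; the object and value names are illustrative.

```scala
object CollectionsExample {
  case class Person(id: Int, name: String)

  val list = List(Person(1, "Pawel"),
                  Person(2, "Val"),
                  Person(3, "Unknown Name"))

  // Method-chaining form, as on the slide.
  val names: List[String] = list.filter(_.id > 1).map(_.name)

  // Equivalent for-comprehension: the guard plays the role of WHERE,
  // the yield plays the role of the SELECT column list.
  val namesFor: List[String] = for (p <- list if p.id > 1) yield p.name
}
```

Both values evaluate to `List("Val", "Unknown Name")`.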
Schema 
[Diagram: ORGANIZATIONS and TEAMS tables]
Entity mapping 
/** Table description of table organizations. */
class OrganizationsTable(tag: Tag) extends Table[OrganizationsRow](tag, "organizations") {
  ...
  /** Database column id AutoInc, PrimaryKey */
  val id: Column[Long] = column[Long]("id", O.AutoInc, O.PrimaryKey)
  /** Database column name */
  val name: Column[String] = column[String]("name")
  /** Database column created_at */
  val createdAt: Column[java.sql.Timestamp] = column[java.sql.Timestamp]("created_at")
  …
  /** Foreign key referencing Organizations (database name token_organization_fk) */
  lazy val organizationsFk = foreignKey("token_organization_fk", organizationId, Organizations)(
    r => r.id, onUpdate = ForeignKeyAction.NoAction, onDelete = ForeignKeyAction.NoAction)
}
CRUD 
val organizationsTable = TableQuery[OrganizationsTable]
// SELECT * FROM ORGANIZATIONS
organizationsTable.list
// SELECT * FROM ORGANIZATIONS WHERE ID > 10 OFFSET 3 LIMIT 5
organizationsTable.filter(_.id > 10L).drop(3).take(5).list
// INSERT
organizationsTable += OrganizationsRow(1, "name", "email", "notificationUrl", ... , None, None)
// UPDATE ORGANIZATIONS SET name = 'new org name' WHERE ID = 10
organizationsTable.filter(_.id === 10L).map(_.name).update("new org name")
// DELETE FROM ORGANIZATIONS WHERE ID = 10
organizationsTable.filter(_.id === 10L).delete
Queries - JOINS 
val organizationsTable = TableQuery[OrganizationsTable]
val teamsTable = TableQuery[TeamsTable]
val name = "teamName"
val result = for {
  t <- teamsTable.sortBy(_.createdAt).filter(t => t.deletedAt.isEmpty)
  o <- t.organization.filter(o => o.deletedAt.isEmpty && o.name === name)
} yield (t.name, o.name)
SELECT t.name, o.name FROM TEAMS t
  LEFT JOIN ORGANIZATIONS o ON t.organization_id = o.id
  WHERE t.deleted_at IS NULL AND o.deleted_at IS NULL AND o.name = 'teamName'
  ORDER BY t.created_at
SELECT t.name, o.name FROM TEAMS t
  LEFT JOIN ORGANIZATIONS o ON t.organization_id = o.id
  WHERE t.deleted_at IS NULL AND o.deleted_at IS NULL AND o.name = 'teamName'
  ORDER BY t.created_at
val result: List[(String, String)]
Connection pool and transactions 
val ds = new BoneCPDataSource
val db = {
  ds.setDriverClass(rdsDriver)
  ds.setJdbcUrl(rdsJdbcConnectionString)
  ds.setPassword(rdsPassword)
  ds.setUser(rdsUser)
  Database.forDataSource(ds)
}
db.withTransaction { implicit session =>
  // SLICK CODE GOES HERE
}
