datastax talks sessions c* breakouts event presentations 2016 apache cassandra cassandra summit nosql datastax enterprise big data cassandra dse the last pickle spark datastax enterprise graph real time database enterprise database database cql data management cluster data modeling netflix apache spark open source solr "cassandra summit graph database data model nodes rdbms apache solr hadoop datastax opscenter real-time clusters instaclustr data opscenter distributed systems kafka fault tolerance high availability operations azure iot distributed database cassandra query language apache hadoop cx expero fraud aws 3.0 time series datascale graph nodetool pythian thrift duyhai doan stream processing jon haddad challenges machine learning financial services accenture scalability oracle patrick mcfadin analytics real time replication fraud detection customer experience big data application fault tolerant sony alexander dejanovski jdbc monitoring philip thompson alain rodriguez tombstones vinay chella apache tinkerpop developer tools brian hess data ingestion user experience iot data rob murphy security features data analytics data mining deployment dse search integration uptime quorum bootstrapping testing elasticsearch logs ben slater the cloud real time data sla aaron morton apache cassandra committer ananth ram dse spark acid google ben lackey docker microservices s3 ansible apis applications application development data centers designing data science christopher batey schema nate mccall 3.x achilles insights high-performance load microsoft office 365 scalable performance latency production search 3.4 at scale architecture building backup example read path sstables codecentric ag application protectwise improvements distributed computing spotify issues features use case cloud applications smack stack apache kafta time series data rachel pedreschi target google cloud platform spark shark python predictive analytics startup in-memory database cassandra data modeling 
mongodb cassandra internals ecommerce comcast database security dba database administrator cql3 vnodes virtual nodes java cloud black friday gdpr hybrid cloud contextual payments aci worldwide webinar banking finance digital transformation cloud computing linkurious datastax managed cloud softeam dsp2 virtual reality mesosphere psd2 payment services directive inventory adam hutson conceptual data model service scheduling application data modeler enterprise physical data model logical data model data storage on demand expiring columns live go90 datetieredcompactionstrategy videos scala mobile entertainment activity feed sstable 2.1 the pythian group john schulz transform extract ihs markit execution framework jobs jim hatcher etl protectwise " ip addresses approximate data data structures low cost ben kornmeier michaël figuière speculative retries 2.0 drivers cloud storage java driver 3.0 j.b. langston troubleshooting task automation cloud platform distributed architecture " carlos rolo subsystems release model james witschey tick-tock encryption ameesh divatia customer data data breaches baffle.io 3.6 javadocs lcs wei deng solutions (cassandra-10805) jiras dba-free rotating clusters mock data disk utilization dtcs security monitoring platform threat stack natural use case intersection bounding box stratio's cassandra lucene centroid union geospatial geometrical transformations exclusion search features convex hull complex polygons metrics 1.0 operational tooling chris lohfink software engineer cold storage joshua hollander parquet patch final approval local quorum consistency level performant systems keyspaces christopher bradford consistency rack networktopologystrategy node topology data enrichment responsive data platform sigmoid rahul kumar visualisation apache mesos version 2.1 anti-entropy deleted data nodetool repair primary range alexandra klimova pricing managers data visualisation allianz deutschland ag pipelines enterprise platforms flink db single 
installation scalable solution life-cycle management clustered solutions prepaid billing voucher management ericsson spark nodes centralised system playstation4 videos storage available platform alexander filipchik dustin pham mutation propagate streams support amazon kinesis replication lag norton symantec endorse collisions shu zhang no-sql pl/sql ilya sokolov proxy nodes simplereach decommissioning nodes eric lubow dc clusters multiple data centers antientropy data inconsistency cassandra-11206 000 cells per partition large partitions 100 100mb robert stupp out-of-memory failures aws regions datadog grafana dashboards isolation 2.2.5 report generation 2.1.13 2.0.14 reconciliation full-table scan atomicity distributed software oom errors batching parameters memory russell spitzer throughput matt stump large-scale software vorstella strong consistency lwt replicas semantics data centres " light weight transactions syntax buffer objects custom scripts knewton collection g1 garbage jvm g1gc carlos monroy java tuning cassandra-7019 tick-tock release line delete data soft delete ndbench open source tools iops heap pressure pool configurations 99th latency wide partition relational database avoid library component agent radovan zvoncek cluster topology replacing nodes hecuba2 bug free api expanding christos kalantzis distributed databases center of excellence db engineering rabbitmq users time-series pat patterson streamsets data collector cleanse user defined aggregates ingest collect devices sensors traversal language marko rodriguez graph structure olap-based vendor-agnostic dsegraph oltp- gremlin graph graph systems large-scale distributed alex popescu development teams large scale pivotal cloud foundry bosh cloud native applications platforms-as-a-service automated lifecycle manual deployment paas zero downtime developer experience usability system integration dsefs functional coverage rocco varela dse file system gradle open-source ci servers predrag knežević 
docker swarms production code dockerized hdfs speed level security integrated security ssl streaming applications widows dc kerberos human error manikandan srinivasan custom scripting mike lococo configuration deviations protecting cliff gilmore constrained deployment advanced replication replicating data multi-dc hub-and-spoke configurations auditing attribute based access control (abac) securing network communication secure deployments ldap/active directory authentication with kerberos key-management interoperability protocol (kmip) role-based authorization encrypting data files ldap role assignment batch analysis operational data dse analytics operational analytics ui tools strategies visualize meaningful audience graph data chris lacava user-centered gary stewart ing devops christopher middleware engineering distributed data smart meter zookeeper kafka brokers kafka-rest wei deng dse spark masters schema registry adversarial modeling detection data graph theory identity theft synthetic identities agile sql client applications jacek lewandowski cqlsh property graphs conceptual-logical-physical artem chebotko flexible graph data model always-on applications spark integrations data analysis services application design robbie strickland configuration enable linearly scalable academic network source code pragmatic problem bob briody inspection reproducibles real-world analysis concept network analysis techniques visualization 5.0 enhancements high level nick panahi ariel weisberg future features beyond recovery thomas valley multi-data center failure scenarios pagerduty " donny nadolny conflicting data clock skew rimas silkaitis data analysis postgres heroku massive data ingest http based api cassieq no dependencies building queues anton kropp authentication distributed data stores installation distributed queue is hard live coding rdd apache zeppelin hardware requirements secundary indexes materialized views carlos rolo cassandra single node compaction strategies 
udf small cluster jbod shrink ben bromhead token pinning ebs backed disks scale reduce costs design performant scalable data model software development techniques design session remi trouville kibana independant elassandra cassandra-stress tool load test scaling requirements vaibhav puranik reporting daily batched gumgum microbatching lambda architecture access data store data development steps optimizing diagnostics setups design patterns micro batch processing system grpc luke tillman falcor cql 3 language re-write storage engine emodb non-blocking conflict-free replicated datatype cross data center communication distributed compactions json json delta global writes restful crdt solution architecture cpus murali kannan node.js php client libraries ruby c/c++ cloud datacenters c# “always on” in-memory performance apache ignite gridgain sql-99 read latencies oltp supply full sql-99 slas graphframes gke gce google compute engine kubernetes ravi madasu google container engine open source tracing http consultant mick semb wever cassandra-10392 zipkin opentracing project single tracing cluster-wide metadata automated restoratiom distributed backups sstable files which again leverages ansible dba's knewton's restoration strategy vpcs system_traces keyspace diagnose issues open sourced tools jeffrey berge cassandra-tracing tool real-time output metadata processing feature extraction centric architecture kildane software technologies kerry koitzsch image formats high accuracy analysis image processing applications implementation technology types quality messaging services randy fradin scalable storage blackrock cross-wan consistency multi-region clusters investment management platform aladdin glob metric path queries display plug-in aggregate analyse graphite cyanite sasi indexes store markus hofer bucketing datamodeling tombstoneoverwhelmingexception optimizing queries maintain buckets microservice architectures it consultant hofer split partitions ajay upadhyay scalable 
persistence streaming services senior software engineer arun agrawal payment information bookmarks billing viewing history aaron ploetz regions providers lead technical architect crossfit relational compound keys high-volume database systems cassandra data adam hutson data architect composite keys clustering keys primary scalable data architecture validation metadata data exploration test tool perform under stress re-writes cassandra-stress best and worst production cluster functions cassandra operations advanced interface jmx internal structures process changes cql changes benjamin lerer cassandra 2.2 disaster avoidance customized scripts restore fast reads/writes backed up datacenter single datacenter deployment multi-datacenter clusters bootstrapping nodes repairs compactions low volume clusters high latency high volume video sharing usability enhancements killrvideo indexing changes developer cassandra 2.x common mistakes art streamlined achilles 4.x syntax mistake ds object mapping code base linearly scalable fleet jim peregord robust element corp stack pluggable platform benchmark database migrate effective approach yabin meng exposed client drivers not supported protocol moving away writes improve throughput adam zegelin client application reduce cpu micro-batching submitting real world insurance company iag eddie satterly leverage australia datanexus integrate silos anubhav kale running 400 node 300 tb job and talent data models carlos alonso core concepts parameters detailed look stability cassandra.yaml file configuration settings configuration files edward capriolo above and beyond data corruption cassandra tuning network splits matija gobec failure smartcat hardware sasi ride performant indices full text query data like '%term%' full text search columns accuracy view jason cacciatore monitoring cassandra health monitoring system false alarms reactive jeffrey carpenter international choice hotels microservice cloud-based reservation system distributed 
schema design andrew baker multiple datacenters mesos abhishek verma uber node repairs framework machine utilization automate statically partitioned wide row store cdm eventually consistent shortcut to awesome dataset manager partition messaging system high throughput airbnb massive real-time datasets apache kafka data integration confluent linkedin ewen cheslack-postava tyler hobbs disk driver internals coordinator reading failures selecting replicas paging problems stephan kepser lessons event sourcing cqrs matthias niehoff standard deletes without tombstones fail eric stevens limitations deleting data deletion options solutions ttls scaling instagram infrastructure use cases key-value storage cassandra at instagram patches high scalability facebook low latency core infra team dikang gu yahoo japan next generation infrastructure nosql team satoshi konno automation emilio del tessandoro terabytes parallelizable problem trivially exporting data tooling cassandra exports storage infrastructure messaging tips brooke jensen internet of things diagnosis methods vp technical operations ad tech streaming processing roopa tangirala infrastructure prasanna padmanabhan recommendations time machine personalization cloud application architecture master data management customer 360 degree bank fraud monitor risk intuit trenches availability charles rich jkool demands streaming data analyzing i20 coursera spark analytics cloud comput) clear capital cassandra training healthcare technology infographic shark databricks free spark driver apache 2.0 license apache 2.0 silicon valley headquarters employees culture x1 dvr data center distributed architecture performance tuning training tunable consistency eventual consistency bulk loading pci-dss security compliance gazzang cassandra dba cqlengine triggers transactions real ooyala real t mysql data mode apache hadoop (software) real-time database message bus redis open message bus database engineer distributed processing cep storm 
cassandra atomic batches leveled compaction databases sourceninja apache hadoop enterprise cap theorem data consistency data partitioning cloud database