Bathcamp 2010-riak

Riak is a document oriented database written in Erlang that is highly fault tolerant and based on Dynamo and the CAP theorem. It uses a similar data model to MongoDB by storing semi-structured data as documents but achieves high availability through quorum writes rather than MongoDB's in-place writes. Key features of Riak include configurable replica counts for reads and writes (N/R/W), masterless replication, and an integer keyspace that allows any node to service requests. Riak search also provides distributed, fault tolerant full-text search capabilities.

What is Riak?
• Document-oriented database
• Written in Erlang
• Based on Dynamo[1] and CAP Theorem[2]
• Highly fault tolerant
• HTTP and Protocol Buffers interfaces
• Write MapReduce in Erlang or JavaScript
1. http://goo.gl/r8Np
2. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

Same, Same but different
• Riak solves similar problems to MongoDB
• Semi-structured data modeled as “documents”
• Storage of non-document data in the database
• High write-availability
• Riak is intrinsically multi-node scalable
• Mongo in comparison is single system (+ sharding)
• Riak achieves availability via quorum writes
• Mongo uses performant in-place writes
• Riak uses “masterless” replication

N/R/W – Dynamo
N = Number of replicas to store
R = Number of replicas needed to read
W = Number of replicas needed to write
• These principles first appeared in an Amazon
research paper known as Dynamo

Number of replicas
• 160-bit integer key space. Each node that joins is assigned part of that space for consistent hashing
• Hashing means any node can service any request, making the cluster masterless and eventually consistent
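
As a rough illustration of the key space on this slide, the Python sketch below hashes a key onto a 160-bit ring and looks up the nodes responsible for it. SHA-1 is used only because it yields 160-bit values. The bucket/key encoding, the partition count of 64, the round-robin claim and all function names are assumptions made for the example, not Riak's actual implementation.

```python
import hashlib

RING_SIZE = 2 ** 160  # a 160-bit integer key space


def key_position(bucket, key):
    """Map a bucket/key pair to an integer position on the ring (assumed encoding)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, partitions=64):
    """Split the ring into equal partitions and hand them out round-robin."""
    return [nodes[i % len(nodes)] for i in range(partitions)]


def preference_list(ring, position, n):
    """Return the owners of the n consecutive partitions starting at the key's partition."""
    partitions = len(ring)
    width = RING_SIZE // partitions
    start = position // width
    return [ring[(start + i) % partitions] for i in range(n)]


ring = build_ring(["n0", "n1", "n2", "n3"])
position = key_position("users", "alice")
replicas = preference_list(ring, position, n=3)
```

Because every node can compute the same ring and the same hash, any node can work out which nodes hold a key without asking a master.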

Reads
• R is the number of replies required before Riak gives the client a successful reply.
• Tries to access all nodes, but a response is given as soon as R of the N replicas have replied

Writes
• Same as reads; W implies the number of successful nodes that must reply before the write is considered consistent by the client
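
The read and write rules above come down to the same check: count successful replies as they arrive and answer the client once the quorum is met. A minimal Python sketch, assuming replies are modeled as a simple sequence of booleans (the function name and this representation are hypothetical):

```python
def quorum_reached(replies, required):
    """Scan replies in arrival order and stop once `required` of them succeeded."""
    successes = 0
    for ok in replies:
        if ok:
            successes += 1
            if successes >= required:
                return True  # no need to wait for the remaining replicas
    return False


# N=3 replicas, one of them unreachable, R=2: the read still succeeds.
read_ok = quorum_reached([True, False, True], required=2)
```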

Extreme example
• Given N=10, R=W=2 we
could have 8 nodes
down and the cluster
would still be fully
available to all clients
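
The arithmetic behind this example can be sketched as follows: a request can still succeed as long as enough of the N replicas respond to satisfy the larger of R and W. This counts only the replicas of a single key, and the function name is hypothetical.

```python
def max_unavailable(n, r, w):
    """How many of the n replicas can be down while reads and writes still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return n - max(r, w)


extreme = max_unavailable(n=10, r=2, w=2)  # the example on this slide
typical = max_unavailable(n=3, r=2, w=2)
```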

What does this all mean?
• N/R/W specified at request time, so each
client can specify its own tolerance for
outages dynamically
• Despite any outages within the cluster, the whole
cluster can still appear available based on N/R/W
• Given N=3 and R=W=2, we can have 3-2=1 node
down/unreachable/laggy in the cluster
• Stupidly high availability complete with eventual
consistency controlled by dynamic clients
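
To make "specified at request time" concrete, here is a small Python sketch that builds request URLs carrying per-request quorum values. It assumes the HTTP interface exposes objects under a /riak/bucket/key path and accepts r and w as query parameters; the host, port, bucket and key are made up for illustration, and no request is actually sent.

```python
from urllib.parse import quote, urlencode


def riak_url(host, bucket, key, **quorum):
    """Build an object URL with optional per-request quorum parameters (assumed API shape)."""
    path = f"/riak/{quote(bucket, safe='')}/{quote(key, safe='')}"
    query = urlencode(quorum)
    return f"http://{host}{path}" + (f"?{query}" if query else "")


# A latency-sensitive client might read with R=1, a cautious one with R=3.
fast_read = riak_url("127.0.0.1:8098", "users", "alice", r=1)
safe_read = riak_url("127.0.0.1:8098", "users", "alice", r=3)
```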

Brewer’s CAP Theorem
• Consistency
• Availability
• Partition Tolerance
• You can't have all things, all the time…
• …but you can have some of each, all the time!
• Riak is about choosing your own levels of
each according to your use case

Consistency
• Start with document
version zero
• Things get redistributed
and n0 and n2 are
sitting in NYC and n1
and n3 are in London
• What if stuff changes??

Consistency
• Uh oh: inconsistency
• Both parts of the cluster
are still fully available
• NYC serves v1 whilst
London serves v0
• The network resumes
and Riak determines
the latest version by
using vector clocks

Consistency
• What if both sides of
the Atlantic changed?
• Riak is unable to
determine which is the
right document, both
are returned to the
client with an indication
of the inconsistency
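
The two outcomes on the Consistency slides (one side is newer, or both sides changed) can be sketched with a toy vector clock in Python. Clocks are modeled here as plain dictionaries mapping an actor name to a counter; that representation, the actor names and the function names are assumptions for illustration, not Riak's internal format.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two versions as equal, one newer than the other, or in conflict."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"


v0 = {}                           # document version zero
nyc = increment(v0, "nyc")        # only NYC changed the document
london = increment(v0, "london")  # London changed it too

one_side_changed = compare(nyc, v0)
both_changed = compare(nyc, london)
```

When only NYC has written, its clock descends from version zero and wins. When both sides have written, neither clock descends from the other, which is the case where both documents go back to the client.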

Riak Search
• Distributed, fault-tolerant full-text searching
• Lucene syntax for queries
• No need for index sharding
• Linear scaling
• Double the number of nodes to get double the
search capacity (awesome!)
• Search via:
• Fields, wildcards, fuzzy text or token proximity

Questions?
basho.com/riak.html
github.com/basho/riak
twitter.com/timperrett
github.com/timperrett
blog.getintheloop.eu


Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn

Tuning kafka pipelines

Sumant Tambe

Kafka is a high-throughput, fault-tolerant, scalable platform for building high-volume near-real-time data pipelines. This presentation is about tuning Kafka pipelines for high-performance. Select configuration parameters and deployment topologies essential to achieve higher throughput and low latency across the pipeline are discussed. Lessons learned in troubleshooting and optimizing a truly global data pipeline that replicates 100GB data under 25 minutes is discussed.

End-to-End Reactive Data Access Using R2DBC with RSocket and Proteus

VMware Tanzu

Lack of asynchronous relational database drivers in Java has been a barrier to writing scalable, data-driven applications for many. R2DBC is seeking to change this with a new API designed from the ground up for reactive programming against relational databases—its intent ito support reactive data access built on natively asynchronous, non-blocking SQL database drivers. How does this change the game for data access in the cloud? Used in conjunction with RSocket and Proteus, it is now possible to write applications benefiting from reactive streaming end-to-end, from the browser all the way to the database. No more fiddling with paging APIs, polling for updates, or writing complex logic to merge data from multiple sources--reactive streams can handle this all for you! RSocket is an open-source, reactive networking protocol that is a collaborative development initiative of Netifi with Pivotal, Facebook, and others. Proteus is a freely available broker for RSocket that is designed to handle the challenges of communication between complex networks of services—both within the data center and over the internet—extending to mobile devices and browsers. Attend this webinar to learn how to use Pivotal Cloud Foundry with R2DBC and Proteus to build reactive microservices that return large amounts of data in a streaming fashion over RSocket. Speakers: Ryland Degnan, co-founder and CTO of Netifi and Dan Baskette, Pivotal host

TDC2017 | São Paulo - Trilha Containers How we figured out we had a SRE team ...

tdc-globalcode

High performace network of Cloud Native Taiwan User Group

HungWei Chiu

The document discusses high performance networking and summarizes a presentation about improving network performance. It describes drawbacks of the current Linux network stack, including kernel overhead and data copying. It then discusses approaches like DPDK and RDMA that can help improve performance by reducing overhead and enabling zero-copy data transfers. A case study is presented on using RDMA to improve TensorFlow performance by eliminating unnecessary data copies between devices.

Building Distributed Systems With Riak and Riak Core

Andy Gross

Andy Gross from Basho discussed Riak Core, an open source distributed systems framework extracted from Riak. Riak Core provides abstractions like virtual nodes, preference lists, and event watchers to help developers build distributed applications. It is currently Erlang-only but will support other languages. Riak Core aims to allow developers to outsource complex distributed systems tasks and implement their own distributed systems more easily.

Incremental Export of Relational Database Contents into RDF Graphs

Nikolaos Konstantinou

In addition to tools offering RDF views over databases, a variety of tools exist that allow exporting database contents into RDF graphs; tools proven that in many cases demonstrate better performance than the former. However, in cases when database contents are exported into RDF, it is not always optimal or even necessary to dump the whole database contents every time. In this paper, the problem of incremental generation and storage of the resulting RDF graph is investigated. An implementation of the R2RML standard is used in order to express mappings that associate tuples from the source database to triples in the resulting RDF graph. Next, a methodology is proposed that enables incremental generation and storage of an RDF graph based on a source relational database, and it is evaluated through a set of performance measurements. Finally, a discussion is presented regarding the authors’ most important findings and conclusions.

Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...

confluent

BY Jun Rao From the Bay Area Apache Kafka September 2016 Meetup. Abstract: To manage the ever-increasing volume and velocity of data within your company you have successfully made the transition from single machines and one-off solutions to large, distributed stream infrastructures in your data center powered by Apache Kafka. But what needs to be done if one data center is not enough? In this session we describe building resilient data pipelines with Apache Kafka that span multiple data centers and points of presence. We provide an overview of best practices and common patterns while covering key areas such as architecture guidelines, data replication and mirroring as well as disaster scenarios and failure handling.

Scalable Web Apps

Piotr Pelczar

This document discusses various techniques for scaling web applications, including horizontal scaling by adding more servers behind a load balancer, using a session store like Redis for shared sessions, centralized logging, and continuous integration to deploy updates. It also covers load balancing with HAProxy, monitoring with Zabbix, caching with Varnish, database scaling with master-slave replication or sharding in MongoDB, and using queues like RabbitMQ. The key is to think of the application as independent workers that can run on multiple servers rather than a single instance.

Recently uploaded (20)

How to Get CNIC Information System with Paksim Ga.pptx

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf

“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...

RESUME BUILDER APPLICATION Project for students

Full-RAG: A modern architecture for hyper-personalization

Best 20 SEO Techniques To Improve Website Visibility In SERP

“I’m still / I’m still / Chaining from the Block”

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...

UiPath Test Automation using UiPath Test Suite series, part 6

GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024

Communications Mining Series - Zero to Hero - Session 1

20240609 QFM020 Irresponsible AI Reading List May 2024

TrustArc Webinar - 2024 Global Privacy Survey

Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack

GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...

HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU

Artificial Intelligence for XMLDevelopment

Driving Business Innovation: Latest Generative AI Advancements & Success Story

20240605 QFM017 Machine Intelligence Reading List May 2024

HCL Notes and Domino License Cost Reduction in the World of DLAU

Bathcamp 2010-riak

1. Timothy Perrett Bath Camp 2010

2. What is Riak?
• Document-oriented database
• Written in Erlang
• Based on Dynamo[1] and the CAP theorem[2]
• Highly fault tolerant
• HTTP and Protocol Buffers interfaces
• Write MapReduce in Erlang or JavaScript
[1] http://goo.gl/r8Np
[2] http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

3. Same, same but different
• Riak solves similar problems to MongoDB
• Semi-structured data modeled as "documents"
• Storage of non-document data in the database
• High write availability
• Riak is intrinsically multi-node scalable
• MongoDB, in comparison, is a single-system design (plus sharding)
• Riak achieves availability via quorum writes
• MongoDB uses performant in-place writes
• Riak uses "masterless" replication

4. N/R/W – Dynamo
N = Number of replicas to store
R = Number of replicas needed to read
W = Number of replicas needed to write
• These principles first appeared in an Amazon research paper known as Dynamo

5. Number of replicas
• 160-bit integer key space. Each node that joins is assigned part of that space for consistent hashing
• Hashing means any node can service any request, making the cluster masterless and eventually consistent
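The key-space idea on this slide can be sketched in a few lines of Python. This is an illustration only, not Riak's implementation: SHA-1 is used because it yields a 160-bit integer, while the partition count of 64, the round-robin ownership, and the names `key_position`, `build_ring` and `preference_list` are all assumptions made for the example.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the 160-bit integer key space


def key_position(bucket, key):
    """Hash a bucket/key pair to an integer position in the key space."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, partitions=64):
    """Split the key space into equal partitions and hand them out round-robin.

    The partition count and the round-robin assignment are assumptions
    for this sketch.
    """
    return [nodes[i % len(nodes)] for i in range(partitions)]


def preference_list(ring, position, n):
    """Return the owners of the n consecutive partitions starting at position."""
    partitions = len(ring)
    width = RING_SIZE // partitions
    start = (position // width) % partitions
    return [ring[(start + i) % partitions] for i in range(n)]


if __name__ == "__main__":
    ring = build_ring(["node-a", "node-b", "node-c", "node-d"])
    pos = key_position("users", "tim")
    print(preference_list(ring, pos, n=3))
```

Because every node can compute the same hash and look up the same ring, any node can work out which replicas hold a key, which is what lets the cluster run without a master.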

6. Reads
• R is the number of replies required before Riak gives the client a successful reply
• Riak tries to access all N replica nodes, but responds as soon as R is satisfied

7. Writes
• Same as reads; W is the number of nodes that must reply successfully before the client considers the write consistent
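The "respond as soon as the quorum is satisfied" behaviour described for reads and writes can be sketched as below. This is a toy model under stated assumptions: replica replies are represented as a plain list of booleans in arrival order, and `replies_needed` is a hypothetical helper, not part of any Riak client.

```python
def replies_needed(replies, quorum):
    """Return how many replies were consumed before the quorum was met.

    `replies` is an iterable of booleans in arrival order (True = success).
    Returns None if the quorum is never reached.
    """
    successes = 0
    for count, ok in enumerate(replies, start=1):
        if ok:
            successes += 1
        if successes >= quorum:
            # Quorum satisfied: answer the client without waiting for the rest.
            return count
    return None


if __name__ == "__main__":
    # N=4 replicas contacted, R=2: the third reply completes the quorum.
    print(replies_needed([True, False, True, True], quorum=2))
```

The same function models a write by passing W as the quorum.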

8. Extreme example
• Given N=10 and R=W=2, we could have 8 nodes down and the cluster would still be fully available to all clients

9. What does this all mean?
• N/R/W are specified at request time, so each client can specify its own tolerance for outages dynamically
• Despite any outages within the cluster, the whole cluster can still appear available based on N/R/W
• Given N=3 and R=W=2, we can have 3-2=1 node down/unreachable/laggy in the cluster
• Stupidly high availability, complete with eventual consistency controlled by dynamic clients
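The arithmetic behind both examples (N=10 with R=W=2, and N=3 with R=W=2) can be checked with a short sketch. The function names `tolerable_failures` and `is_available` are made up for this illustration, and the model assumes a request succeeds whenever enough of the N replicas are reachable.

```python
def tolerable_failures(n, r, w):
    """Number of replicas that can be down while reads and writes still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    # The stricter of the two quorums decides how many replicas must stay up.
    return n - max(r, w)


def is_available(n, quorum, nodes_down):
    """True if the replicas still reachable can satisfy the given quorum."""
    return n - nodes_down >= quorum


if __name__ == "__main__":
    print(tolerable_failures(10, 2, 2))  # the "extreme example"
    print(tolerable_failures(3, 2, 2))   # the N=3, R=W=2 case
```

Since R and W are chosen per request, a client that lowers its quorum trades consistency for a larger failure budget on that request alone.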

10. Brewer's CAP Theorem
• Consistency
• Availability
• Partition tolerance
• You can't have all things, all the time…
• …but you can have some of each, all the time!
• Riak is about choosing your own levels of each according to your use case

11. Consistency
• Start with document version zero
• Things get redistributed, and n0 and n2 are sitting in NYC while n1 and n3 are in London
• What if stuff changes?

12. Consistency
• Uh oh: inconsistency
• Both parts of the cluster are still fully available
• NYC serves v1 whilst London serves v0
• The network resumes and Riak determines the latest version by using vector clocks

13. Consistency
• What if both sides of the Atlantic changed?
• Riak is unable to determine which is the right document, so both are returned to the client with an indication of the inconsistency
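The two outcomes on the Consistency slides (one side is newer, or both sides changed) can be illustrated with a minimal vector clock. This is a sketch of the general technique, not Riak's data structures: clocks are modeled as plain dicts mapping an actor name to a counter, and `increment`, `descends` and `reconcile` are names chosen for the example.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def reconcile(a, b):
    """Return the winning clock, or both clocks if the updates conflict."""
    if descends(a, b):
        return [a]
    if descends(b, a):
        return [b]
    # Neither side has seen the other's update: hand both back to the client.
    return [a, b]


if __name__ == "__main__":
    v0 = {}
    nyc = increment(v0, "nyc")        # NYC writes v1 during the partition
    print(reconcile(nyc, v0))         # London still has v0, so NYC wins

    london = increment(v0, "london")  # both sides changed
    print(reconcile(nyc, london))     # conflict, both versions returned
```

When `reconcile` returns two clocks, the client is the one that has to decide how to merge the documents.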

14. Riak Search
• Distributed, fault-tolerant full-text searching
• Lucene syntax for queries
• No need for index sharding
• Linear scaling
• Double the number of nodes to get double the search capacity (awesome!)
• Search via fields, wildcards, fuzzy text or token proximity

15. Questions?
basho.com/riak.html
github.com/basho/riak
twitter.com/timperrett
github.com/timperrett
blog.getintheloop.eu
