This document provides an introduction and overview of NoSQL databases. It discusses what NoSQL means and the motivations behind it, such as big data, scalability, flexible data formats and manageability. It covers key-value stores, document databases, column-oriented databases and graph databases, and discusses when each type is most applicable. Specific NoSQL databases discussed include MongoDB, Cassandra, Redis, CouchDB, Neo4J and others. The document also covers concepts like the CAP theorem, BASE semantics, consistent hashing and more.
3. Use Cases
Massive write performance.
Fast key value look ups.
Flexible schema and data types.
No single point of failure.
Fast prototyping and development.
Out of the box scalability.
Easy maintenance.
6. Scalability
Scale up, Vertical scalability.
Increasing server capacity.
Adding more CPU, RAM.
Managing is hard.
Possible down times
7. Scalability
Scale out, Horizontal scalability.
Adding servers to an existing system with little effort, aka elastically scalable.
Shared nothing.
Use of commodity/cheap hardware.
Heterogeneous systems.
Controlled Concurrency (avoid locks).
Service Oriented Architecture. Local states.
Bugs, hardware errors, things fail all the time.
It should become cheaper. Cost efficiency.
Decentralized to reduce bottlenecks.
Avoid Single point of failures.
Asynchrony.
Symmetry, you don’t have to know what is happening. All nodes should be symmetric.
8. What is Wrong With RDBMS?
Nothing. One size fits all? Not really.
Impedance mismatch.
Object Relational Mapping doesn't work quite well.
Rigid schema design.
Harder to scale.
Replication.
Joins across multiple nodes? Hard.
How does RDBMS handle data growth? Hard.
Need for a DBA.
Many programmers are already familiar with it.
Transactions and ACID make development easy.
Lots of tools to use.
9. ACID Semantics
Atomicity: All or nothing.
Consistency: Consistent state of data and transactions.
Isolation: Transactions are isolated from each other.
Durability: When the transaction is committed, state will be durable.
Any data store can achieve Atomicity, Isolation and Durability, but do you always need consistency? No.
By giving up ACID properties, one can achieve higher performance and scalability.
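As an illustration of the "all or nothing" property listed above, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the overdraft rule are hypothetical names invented for this example; they are not part of the original slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically (hypothetical helper)."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Assumed business rule: no overdrafts. Raising here undoes
                # BOTH updates, not just the last one.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 150)` fails and leaves both balances untouched, while a transfer of 40 succeeds and changes both rows together.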
10. Enter CAP Theorem
Also known as Brewer’s Theorem, presented in 2000 by Prof. Eric Brewer of the University of California, Berkeley.
“Of three properties of a shared data system: data consistency, system availability and tolerance to network partitions, only two can be achieved at any given moment.”
Proven in 2002 by Seth Gilbert and Nancy Lynch of MIT.
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
11. CAP Semantics
Consistency: Clients should read the same data. There are many levels of consistency.
Strict Consistency – RDBMS.
Tunable Consistency – Cassandra.
Eventual Consistency – Amazon Dynamo.
Availability: Data to be available.
Partition Tolerance: Data may be partitioned across network segments due to network failures.
13. A Simple Proof
Available and partitioned
Not consistent, we get back old data.
[Diagram labels: App, Data, A, Old Data, B]
14. A Simple Proof
Consistent and partitioned
Not available, waiting…
[Diagram labels: App, New Data, Wait for new data, A, B]
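The two cases above can be played through in a few lines. This is only a toy model, assuming two nodes A and B and a single replicated value; the `ToyNode` class, the `Unavailable` exception and the "AP"/"CP" mode strings are names made up for the sketch.

```python
class Unavailable(Exception):
    """Raised when a node refuses to answer rather than risk stale data."""


class ToyNode:
    def __init__(self, name, value):
        self.name = name
        self.value = value


def write(primary, replica, value, partitioned):
    """Write to the primary; replication only happens if the link is up."""
    primary.value = value
    if not partitioned:
        replica.value = value


def read(replica, primary, partitioned, mode):
    """Read from the replica under an 'AP' or a 'CP' policy."""
    if not partitioned:
        return replica.value
    if mode == "AP":
        # Available and partitioned: we answer, possibly with old data.
        return replica.value
    # Consistent and partitioned: we would have to wait for the new data.
    raise Unavailable(f"{replica.name} cannot reach {primary.name}")


a, b = ToyNode("A", "old"), ToyNode("B", "old")
write(a, b, "new", partitioned=True)  # the update reaches A but not B
```

Reading from B in "AP" mode returns the old value, while "CP" mode raises `Unavailable`; neither policy gives both a fresh and an immediate answer during the partition.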
15. BASE, an ACID Alternative
Almost the opposite of ACID.
Basically available: Nodes in a distributed environment can go down, but the whole system shouldn’t be affected.
Soft State (scalable): The state of the system and data changes over time.
Eventual Consistency: Given enough time, data will be consistent across the distributed system.
16. A Clash of cultures
ACID:
• Strong consistency.
• Less availability.
• Pessimistic concurrency.
• Complex.
BASE:
• Availability is the most important thing. Willing to sacrifice for this (CAP).
• Weaker consistency (Eventual).
• Best effort.
• Simple and fast.
• Optimistic.
17. Distributed Transactions
Two phase commit.
“Starbucks Does Not Use Two-Phase Commit” by Gregor Hohpe.
Possible failures:
Network errors.
Node errors.
Database errors.
Problems:
Locking the entire cluster if one node is down.
Possible to implement timeouts.
Possible to use Quorum.
Quorum: in a distributed environment, if there is a partition, then the nodes vote to commit or rollback.
[Diagram labels: Coordinator, Commit, Rollback, Acknowledge, Complete operation, Release locks]
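A minimal in-process sketch of the two phase commit flow described above may help. The `Participant` class and its `can_commit` and `reachable` flags are hypothetical stand-ins for real databases and network calls; an unreachable participant is treated as a "no" vote, which is the timeout idea mentioned on the slide.

```python
class Participant:
    """Hypothetical resource taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote on whether the local work can be committed."""
        if not self.reachable:
            raise TimeoutError(self.name)
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant votes yes."""
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append(False)  # a timeout counts as a 'no' vote
    if all(votes):
        for p in participants:  # Phase 2: everybody commits
            p.commit()
        return "committed"
    for p in participants:  # Phase 2: everybody we can reach rolls back
        if p.reachable:
            p.rollback()
    return "rolled_back"
```

For example, `two_phase_commit([Participant("db1"), Participant("db2", can_commit=False)])` returns `"rolled_back"` and leaves both participants rolled back.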
18. Consistent Hashing
Solves Partitioning Problem.
Consistent Hashing, Memcached.
The naive approach picks a server by modulo, so changing the number of servers remaps most keys:
servers = [s1, s2, s3, s4, s5]
serverToSendData = servers[hash(data) % servers.length]
A New Hope
Continuum Approach.
Virtual Nodes in a cycle.
Hash both objects and caches.
Easy Replication.
Eventually Consistent.
What happens if nodes fail?
How do you add nodes?
http://www.akamai.com/dl/technical_publications/ConsistenHashingandRandomTreesDistributedCachingprotocolsforrelievingHotSpotsontheworldwideweb.pdf
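The continuum approach with virtual nodes can be sketched as below. This is an illustrative ring, not the code of Memcached or any particular client library; the `HashRing` class, the choice of MD5 and the figure of 100 virtual nodes per server are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a point on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place `vnodes` virtual nodes for this server on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        """Drop every virtual node that belongs to this server."""
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def lookup(self, key):
        """Walk clockwise from the key's point to the next virtual node."""
        idx = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["s1", "s2", "s3", "s4", "s5"])
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("s6")  # how do you add nodes?
moved = [k for k in keys if ring.lookup(k) != before[k]]
```

Every key in `moved` now belongs to `s6`; keys owned by the other servers stay where they were, which is the point of hashing both the objects and the caches onto the same ring. Removing a failed node works the same way in reverse: only that node's keys move to their next neighbours.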
20. Vector Clocks
Used for conflict detection of data.
Timestamp based resolution of conflicts is not enough.
[Diagram: Time 1; Time 2: Replicated; Time 3: Update; Time 4: Update; Time 5: Replicated, conflict detection]
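A small sketch of how vector clocks detect the conflict in the timeline above: each node keeps a counter per node, and two versions conflict when neither clock has seen everything the other has. The function names and the dict representation are choices made for this example.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock `a` has seen every event that clock `b` has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify `a` relative to `b`: equal, after, before or conflict."""
    a_sees_b, b_sees_a = descends(a, b), descends(b, a)
    if a_sees_b and b_sees_a:
        return "equal"
    if a_sees_b:
        return "after"
    if b_sees_a:
        return "before"
    return "conflict"  # concurrent updates: neither version supersedes


def merge(a, b):
    """Clock for a reconciled value: element-wise maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


v1 = increment({}, "A")      # written on node A, then replicated to B
on_a = increment(v1, "A")    # updated on A
on_b = increment(v1, "B")    # updated independently on B
```

Here `compare(on_a, on_b)` yields `"conflict"`, while either update compared with `v1` is simply `"after"`. Wall-clock timestamps would have silently picked one of the two updates instead.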
22. Read Repair
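[Diagram: the client issues GET (K, Q=2); one replica answers Value = Data.v2 and another Value = Data.v1; the client receives Value = Data.v2 and the stale replica gets Update K = Data.v2]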
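The read repair flow can be sketched as follows, assuming each replica is a plain dict and every value carries an integer version. Integer versions are a simplification chosen for brevity; a real store would compare vector clocks or similar metadata. The function name and the replica layout are hypothetical.

```python
def get_with_read_repair(replicas, key, quorum):
    """Read `key` from `quorum` replicas and repair any stale copies.

    replicas: list of dicts mapping key -> (version, value).
    """
    contacted = replicas[:quorum]
    answers = [r[key] for r in contacted if key in r]
    if not answers:
        raise KeyError(key)
    newest = max(answers, key=lambda pair: pair[0])  # highest version wins
    for replica in contacted:
        if replica.get(key) != newest:
            replica[key] = newest  # push the newest value to stale replicas
    return newest[1]


r1 = {"K": (2, "Data.v2")}
r2 = {"K": (1, "Data.v1")}
r3 = {"K": (1, "Data.v1")}
value = get_with_read_repair([r1, r2, r3], "K", quorum=2)
```

After the call, `value` is `"Data.v2"` and `r2` has been updated, while `r3` was not contacted and stays stale until a later read or another repair mechanism reaches it.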
23. Gossip Protocol & Hinted Handoffs
The most preferred communication protocol in a distributed environment is the Gossip Protocol.
• All the nodes talk to each other peer wise.
• There is no global state.
• No single point of coordination.
• If one node goes down and there is a Quorum, the load for that node is shared among the others.
• Self managing system.
• If a new node joins, load is also distributed.
[Diagram: nodes A, B, C, D, F, G, H]
Requests coming to F will be handled by the nodes that take over the load of F, let's say C, with the hint that it took the requests which were meant for F. When F becomes available, F will get this information from C. Self healing property.
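The hinted handoff between F and C described above can be sketched like this. The `StorageNode` class and the `write` and `hand_off` helpers are hypothetical, and failure detection (which gossip would provide in a real system) is reduced to a simple `up` flag.

```python
class StorageNode:
    """Hypothetical node holding its own data plus hints for other nodes."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended owner name, key, value)


def write(owner, fallback, key, value):
    """Write to the owner, or park the write on the fallback with a hint."""
    if owner.up:
        owner.data[key] = value
        return owner.name
    fallback.hints.append((owner.name, key, value))
    return fallback.name


def hand_off(fallback, recovered):
    """Replay hints held for `recovered`; return how many were delivered."""
    if not recovered.up:
        return 0
    kept, delivered = [], 0
    for target, key, value in fallback.hints:
        if target == recovered.name:
            recovered.data[key] = value
            delivered += 1
        else:
            kept.append((target, key, value))
    fallback.hints = kept
    return delivered


f, c = StorageNode("F"), StorageNode("C")
f.up = False
handled_by = write(f, c, "k", "v")  # F is down, so C takes the write
```

Once `f.up` is set back to `True`, `hand_off(c, f)` delivers the parked write to F and clears the hint on C.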
26. Key-Value Stores
Memcached – Key value stores.
Membase – Memcached with persistence and improved consistent hashing.
AppFabric Cache – Multi region Cache.
Redis – Data structure server.
Riak – Based on Amazon’s Dynamo.
Project Voldemort – eventually consistent key value stores, auto scaling.
27. Memcached
Very easy to setup and use.
Consistent hashing.
Scales very well.
In memory caching, no persistence.
LRU eviction policy.
O(1) to set/get/delete.
Atomic operations set/get/delete.
No iterators, or very difficult.
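To make the LRU eviction policy and the O(1) set/get/delete claims concrete, here is a minimal in-memory cache built on `collections.OrderedDict`. It illustrates the policy only and is not how Memcached itself is implemented; the `LRUCache` class and its `capacity` parameter (counted in items rather than bytes) are assumptions of this sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry

    def delete(self, key):
        if key in self._items:
            del self._items[key]
            return True
        return False


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.set("c", 3)  # over capacity, so "b" is evicted
```

Each operation touches only the hash table and the ends of the ordering, which is why set, get and delete stay constant time regardless of cache size.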
28. Membase
Easy to manage via web console.
Monitoring and management via Web console.
Consistency and Availability.
Dynamic/Linear Scalability: add a node, hit join to cluster and rebalance.
Low latency, high throughput.
Compatible with current Memcached Clients.
Data Durability, persistent to disk asynchronously.
Rebalancing (Peer to peer replication).
Fail over (Master/Slave).
vBuckets are used for consistent hashing.
O(1) to set/get/delete.
29. Redis
Distributed data structure server.
Consistent hashing at client.
Non-blocking I/O, single threaded.
Values are binary safe strings: byte strings.
String: Key/Value pair, set/get. Many string operations are O(1).
Lists: lpush, lpop, rpush, rpop. You can use it as a stack or queue. O(1). Publisher/Subscriber is available.
Set: Collection of unique elements. add, pop, union, intersection etc. set operations.
Sorted Set: Unique elements sorted by scores. O(log n). Supports range operations.
Hashes: Multiple Key/Value pairs.
HMSET user:1 username foo password bar age 30
HGET user:1 age
30. Microsoft AppFabric
Add a node to the cluster easily. Elastic scalability.
Namespaces to organize different caches.
LRU Eviction policy.
Timeout/Time to live defaults to 10 min.
No persistence.
O(1) to set/get/delete.
Optimistic and pessimistic concurrency.
Supports tagging.
32. MongoDB
Data types: bool, int, double, string, object (BSON), oid, array, null, date.
Database and collections are created automatically.
Lots of language drivers.
Capped collections are fixed size collections, buffers, very fast, FIFO, good for logs. No indexes.
Object ids are generated by the client, 12 bytes of packed data: 4 byte time, 3 byte machine, 2 byte pid, 3 byte counter.
Possible to refer to other documents in different collections, but more efficient to embed documents.
Replication is very easy to setup. You can read from
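The 12-byte object id layout listed on this slide (4 byte time, 3 byte machine, 2 byte pid, 3 byte counter) can be packed by hand as a sketch. This follows the layout as the slide states it and is not the code of any MongoDB driver; `make_object_id`, `timestamp_of` and the use of an MD5 hash of the hostname for the machine bytes are assumptions of the example.

```python
import hashlib
import itertools
import os
import socket
import struct
import time

# Counter starts at a random value, as an assumption for this sketch.
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))


def make_object_id(now=None):
    """Build a 12-byte id: 4 time + 3 machine + 2 pid + 3 counter."""
    seconds = int(time.time() if now is None else now)
    ts = struct.pack(">I", seconds)                       # 4 bytes, big-endian
    machine = hashlib.md5(socket.gethostname().encode()).digest()[:3]
    pid = struct.pack(">H", os.getpid() & 0xFFFF)         # 2 bytes
    counter = (next(_counter) & 0xFFFFFF).to_bytes(3, "big")
    return ts + machine + pid + counter


def timestamp_of(object_id):
    """Recover the creation time (seconds) from the first 4 bytes."""
    return struct.unpack(">I", object_id[:4])[0]


oid = make_object_id(now=1700000000)
```

Because the timestamp comes first and is big-endian, ids generated this way sort roughly by creation time, and the client can generate them without asking the server.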
33. MongoDB
Connection pooling is done for you. Sweet.
Supports aggregation.
Map Reduce with JavaScript.
You have indexes, B-Trees. Ids are always indexed.
Updates are atomic. Low contention locks.
Querying mongo is done with a document:
Lazy, returns a cursor.
Reducible to SQL: select, insert, update, limit, sort etc.
Several operators: $ne, $and, $or, $lt, $gt, $inc (a negative amount decrements) and so on.
There is more: upsert (either inserts or updates).
Repository Pattern makes development very easy.
35. CouchDB
Availability and Partition Tolerance.
Views are used to query. Map/Reduce.
MVCC – Multi-Version Concurrency Control. No locks.
A little overhead with this approach due to garbage collection.
Conflict resolution.
Very simple, REST based. Schema free.
Shared nothing, seamless peer based bi-directional replication.
Auto compaction. Manual with MongoDB.
Uses B-Trees.
Documents and indexes are kept in memory and flushed to disk periodically.
Documents have states; in case of a failure, recovery can continue from the state the documents were left in.
No built in auto-sharding, there are open source projects.
You can’t define your indexes.
N: Number of nodes with a replica of data.
W: Number of nodes that must acknowledge the update.
R: Minimum number of nodes that must succeed for a read operation.
W + R > N: Strong Consistency
W + R <= N: Weak Consistency
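The rule above holds because when W + R > N, every set of W written replicas must share at least one node with every set of R read replicas. The sketch below checks this by brute force for small clusters; the function names are chosen for this example only.

```python
from itertools import combinations


def consistency_level(n, w, r):
    """Classify an (N, W, R) configuration using the rule W + R > N."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return "strong" if w + r > n else "weak"


def always_overlap(n, w, r):
    """True if every write set of size W meets every read set of size R."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

With N = 3, the choice W = 2 and R = 2 is strong because any two readers include at least one node that acknowledged the write, while W = 1 and R = 1 is weak because the single reader may hit a replica that has not seen the update yet.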