This document discusses Apache Cassandra and its data model. It provides an example data model called Twissandra for modeling Twitter-like data in Cassandra. Twissandra uses Cassandra's data model of keyspaces, column families, row keys, and columns to store user profiles, followers, tweets, timelines and other Twitter data in a distributed manner across a Cassandra cluster. The document also covers Cassandra architecture topics like replication, partitioning, consistency levels, compaction, and more.
Apache Cassandra, part 2 – data model example, machinery
Andrey Lomakin
The aim of this presentation is to give an enterprise architect enough information to decide whether Cassandra should be the project's data store. The presentation describes the nuances of Cassandra's architecture and ways to design data and work with it.
4. 4 www.ExigenServices.com
Twissandra Use Cases
Get the friends of a username
Get the followers of a username
Get a timeline of a specific user’s tweets
Create a tweet
Create a user
Add friends to a user
14. 14 www.ExigenServices.com
Cassandra QL – User creation
BEGIN BATCH
INSERT INTO User (KEY, username, password) VALUES ('id', 'konstantin', '******')
INSERT INTO Username (KEY, userid) VALUES ('konstantin', 'id')
APPLY BATCH
15. 15 www.ExigenServices.com
Cassandra QL – following a friend
BEGIN BATCH
INSERT INTO Friends (KEY, friendid) VALUES ('userid', 123456)
INSERT INTO Followers (KEY, userid) VALUES ('friendid', 123456)
APPLY BATCH
16. 16 www.ExigenServices.com
Cassandra QL – Tweet creation
BEGIN BATCH
INSERT INTO Tweet (KEY, userid, body, timestamp) VALUES ('tweetid', 'userid', '@ericflo thanks for Twissandra, it helps!', 123656459847)
INSERT INTO Userline (KEY, 123656459847) VALUES ('userid', 'tweetid')
INSERT INTO Timeline (KEY, 123656459847) VALUES ('userid', 'tweetid')
……..
INSERT INTO Timeline (KEY, 123656459847) VALUES ('followerid', 'tweetid')
……
APPLY BATCH
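The batch above stores the tweet once and then fans its id out to the author's userline and to the timeline of every follower. A minimal in-memory Python sketch of that fan-out follows. Plain dictionaries stand in for column families, and the name post_tweet and the dictionary layout are illustrative assumptions, not Twissandra's actual code.

```python
from collections import defaultdict

# Plain dicts stand in for column families: row key -> {column name: value}.
tweet = {}
userline = defaultdict(dict)
timeline = defaultdict(dict)
followers = defaultdict(dict)  # userid -> {follower id: timestamp}


def post_tweet(tweet_id, user_id, body, timestamp):
    """Store the tweet once, then index its id under every affected row."""
    tweet[tweet_id] = {"userid": user_id, "body": body, "timestamp": timestamp}
    userline[user_id][timestamp] = tweet_id   # the author's own tweets
    timeline[user_id][timestamp] = tweet_id   # the author's timeline
    for follower_id in followers[user_id]:    # one extra write per follower
        timeline[follower_id][timestamp] = tweet_id
```

The cost of a write grows with the number of followers, which is the price paid for making timeline reads a single-row lookup.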
17. 17 www.ExigenServices.com
Cassandra QL – Getting user tweets
SELECT * FROM Userline WHERE KEY = 'userid'
SELECT * FROM Tweet WHERE KEY IN ('tweetid1', 'tweetid2', 'tweetid3', …., 'tweetidn')
18. 18 www.ExigenServices.com
Cassandra QL – Getting user timeline
SELECT * FROM Timeline WHERE KEY = 'userid'
SELECT * FROM Tweet WHERE KEY IN ('tweetid1', 'tweetid2', 'tweetid3', …., 'tweetidn')
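Both read paths follow the same two-step pattern: fetch the tweet ids from one Userline or Timeline row, then fetch the tweets by key. A small Python sketch of that pattern over in-memory dictionaries follows. The function name get_timeline and the limit parameter are hypothetical.

```python
def get_timeline(timeline_row, tweets, limit=20):
    """Return the newest tweets referenced by one Timeline (or Userline) row.

    timeline_row maps timestamp -> tweet id (column name -> column value);
    tweets maps tweet id -> tweet columns.
    """
    # Step 1: read the row and keep the newest column names (timestamps).
    newest_first = sorted(timeline_row, reverse=True)[:limit]
    tweet_ids = [timeline_row[ts] for ts in newest_first]
    # Step 2: the equivalent of SELECT * FROM Tweet WHERE KEY IN (...).
    return [tweets[tid] for tid in tweet_ids if tid in tweets]
```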
19. 19 www.ExigenServices.com
Design patterns
Materialized View
– create a second column family to serve an additional query
Valueless Column
– store the data in the column name and leave the value empty
Aggregate Key
– if you need to find a sub-item, use a composite key
23. 23 www.ExigenServices.com
Replication
Replication is controlled by the replication_factor
setting in the keyspace definition.
The actual placement of replicas in the cluster is
determined by the replica placement strategy.
25. 25 www.ExigenServices.com
Placement Strategies
OldNetworkTopologyStrategy - places one replica
in a different data center while placing the others
on different racks in the current data center.
27. 27 www.ExigenServices.com
Snitches
Give Cassandra information about the network
topology of the cluster
Endpoint snitch – gives information about the network
topology.
Dynamic snitch – monitors read latencies
28. 28 www.ExigenServices.com
Endpoint Snitch Implementations
SimpleSnitch (default) - can be efficient for
locating nodes in clusters limited to a single data
center.
29. 29 www.ExigenServices.com
Endpoint Snitch Implementations
RackInferringSnitch - extrapolates the topology
of the network by analyzing IP addresses.
192.168.191.71 and 192.168.191.21: in the same rack
192.168.191.71 and 192.168.171.21: in the same datacenter
192.78.19.71 and 192.18.11.21: in different datacenters
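The three address pairs above can be checked with a short Python sketch. It assumes the RackInferringSnitch convention that the second octet identifies the datacenter and the third octet identifies the rack. The function name infer_proximity is hypothetical.

```python
def infer_proximity(ip_a, ip_b):
    """Classify two IPv4 addresses the way a rack-inferring snitch would."""
    a = ip_a.split(".")
    b = ip_b.split(".")
    if a[1] != b[1]:          # second octet differs: different datacenters
        return "different datacenters"
    if a[2] != b[2]:          # same datacenter, third octet differs
        return "same datacenter"
    return "same rack"        # second and third octets both match
```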
30. 30 www.ExigenServices.com
Endpoint Snitch Implementations
PropertyFileSnitch - determines the location of
nodes by referring to a user-defined description of
the network details located in the property file
cassandra-topology.properties.
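A sketch of what such a file can look like and how it maps nodes to locations, written in Python. It assumes the node=datacenter:rack line format with an optional default entry. The addresses and the names DC1, RAC1 and RAC2 are made up for illustration.

```python
SAMPLE = """\
# node IP = datacenter : rack (illustrative values only)
192.168.191.71=DC1:RAC1
192.168.171.21=DC1:RAC2
default=DC1:RAC1
"""


def parse_topology(text):
    """Map each node (or 'default') to a (datacenter, rack) tuple."""
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        node, _, location = line.partition("=")
        datacenter, _, rack = location.partition(":")
        topology[node.strip()] = (datacenter.strip(), rack.strip())
    return topology
```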
31. 31 www.ExigenServices.com
Memtables, SSTables, Commit Logs
Commit Log
• Durability
• Sequential writes only
Memtable
• No disk access, batched writes
SSTable
• Becomes read-only once written
• Indexes
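How the three structures cooperate on a write can be sketched in Python. This is a toy model under stated assumptions: lists and dictionaries stand in for on-disk files, the class name Node and the memtable_limit threshold are hypothetical, and real flush triggers are more involved.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only, stands in for the on-disk log
        self.memtable = {}     # key -> (value, timestamp), held in memory
        self.sstables = []     # each flushed SSTable is an immutable, key-sorted tuple
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))  # durability first
        current = self.memtable.get(key)
        if current is None or timestamp >= current[1]:   # newest timestamp wins
            self.memtable[key] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # One batched, sequential write; the result is never modified again.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replay
```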
33. 33 www.ExigenServices.com
Read properties
Read multiple SSTables
Slower than writes (but still fast)
Seeks can be mitigated with more RAM
Amortized loss of scalability
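Why reads touch several SSTables can be shown with a short Python sketch. It assumes each structure maps a key to a (value, timestamp) pair and that the newest timestamp wins. The function name read is hypothetical.

```python
def read(key, memtable, sstables):
    """Return the value with the highest timestamp for key, or None."""
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for sstable in sstables:      # one lookup (a potential disk seek) per SSTable
        if key in sstable:
            candidates.append(sstable[key])
    if not candidates:
        return None
    value, _timestamp = max(candidates, key=lambda pair: pair[1])
    return value
```

A write only appends, while a read may have to consult every SSTable that could hold the key, which is why reads are the slower path.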
34. 34 www.ExigenServices.com
Commit Log durability
Periodic sync of the commit log: writes are acknowledged
before the log is synced, so there is a window for potential data loss.
Batch sync of the commit log: a write is acknowledged
only after the commit log is flushed to disk. A separate device
for the commit log is strongly recommended in this case.
36. 36 www.ExigenServices.com
Gossip protocol
org.apache.cassandra.gms.Gossiper
– Keeps the lists of nodes that are alive and dead
– Chooses a random node and starts a “chat” with
it. One gossip round requires three messages
Failure detection uses a suspicion level to decide
whether a node is alive or dead
38. 38 www.ExigenServices.com
Consistency level
Consistency level                         Write                                   Read
ANY                                       1 replica (including hinted handoff)    -
ONE                                       1                                       1
QUORUM                                    N/2 + 1                                 N/2 + 1
LOCAL_QUORUM (to avoid latency issues)    (dc_replicas)/2 + 1 (local datacenter)  (dc_replicas)/2 + 1 (local datacenter)
EACH_QUORUM (useful in backup scenarios)  (dc_replicas)/2 + 1 (each datacenter)   (dc_replicas)/2 + 1 (each datacenter)
ALL                                       N                                       N
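The quorum arithmetic in the table uses integer division. A short Python sketch makes it concrete. The overlap rule R + W > N is a standard property of quorum systems and is not stated on the slide. Both function names are hypothetical.

```python
def quorum(replicas):
    """N/2 + 1 with integer division, as in the table above."""
    return replicas // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set overlaps every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor
```

With a replication factor of 3, QUORUM reads and QUORUM writes each touch 2 replicas, so any read overlaps the latest write in at least one replica. Reading and writing at ONE gives no such guarantee.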
39. 39 www.ExigenServices.com
Tombstones
The data is not immediately deleted
Deleted values are marked with a tombstone
Tombstones are purged during the next
compaction after the grace period expires
GCGraceSeconds – number of seconds the
server waits before garbage-collecting a tombstone
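The life cycle of a tombstone can be sketched in Python. The cell layout (a dictionary with a tombstone flag and a timestamp in seconds) and the function names delete and compact are assumptions made for illustration.

```python
def delete(row, column, now):
    """Mark the column as deleted instead of removing it."""
    row[column] = {"tombstone": True, "timestamp": now}


def compact(row, now, gc_grace_seconds):
    """Drop tombstones older than gc_grace_seconds; keep everything else."""
    return {
        name: cell
        for name, cell in row.items()
        if not (cell.get("tombstone") and now - cell["timestamp"] >= gc_grace_seconds)
    }
```

Keeping the marker for the grace period gives replicas that missed the delete time to learn about it before the marker disappears.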
41. 41 www.ExigenServices.com
Compaction
Minor:
– Triggered when at least N SSTables have been
flushed to disk (N is tunable, 4 by default)
– Merges SSTables of similar size
Major:
– Merges all SSTables
– Run manually through nodetool compact
– Discards tombstones
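A simplified Python sketch of the merge step follows. It assumes each SSTable maps a key to a (value, timestamp) pair, ignores the grouping of SSTables by similar size, and uses the default threshold of 4 from the slide. Both function names are hypothetical.

```python
def merge_sstables(sstables):
    """Merge SSTables into one, keeping the newest version of each key."""
    merged = {}
    for sstable in sstables:
        for key, (value, timestamp) in sstable.items():
            if key not in merged or timestamp > merged[key][1]:
                merged[key] = (value, timestamp)
    return dict(sorted(merged.items()))  # output stays sorted by key


def maybe_minor_compact(sstables, threshold=4):
    """Compact only once at least `threshold` SSTables have accumulated."""
    if len(sstables) < threshold:
        return sstables
    return [merge_sstables(sstables)]
```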
43. 43 www.ExigenServices.com
Anti-entropy
During major compaction the node exchanges
Merkle trees (hashes of its data) with other nodes
If the trees don’t match, the data is repaired
Nodes maintain a timestamp index and exchange
only the most recent updates
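The Merkle tree comparison can be sketched in Python with hashlib. This is a simplification: it hashes individual key/value pairs, whereas a real tree covers ranges of the ring. The function names merkle_root and needs_repair are hypothetical.

```python
import hashlib


def _digest(text):
    return hashlib.sha256(text.encode()).hexdigest()


def merkle_root(rows):
    """Hash every row, then hash pairs of hashes until one root remains."""
    level = [_digest(f"{key}={value}") for key, value in sorted(rows.items())]
    if not level:
        return _digest("")
    while len(level) > 1:
        level = [_digest("".join(level[i:i + 2])) for i in range(0, len(level), 2)]
    return level[0]


def needs_repair(local_rows, remote_rows):
    """Replicas only need to exchange data when their roots differ."""
    return merkle_root(local_rows) != merkle_root(remote_rows)
```

Comparing roots lets two replicas agree that their data matches by exchanging a single hash rather than the data itself.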
44. 44 www.ExigenServices.com
Read repair
During a read operation, replicas with stale values
are brought up to date
– Weak consistency level (ONE):
after the data is returned
– Strong consistency level (ALL):
before the data is returned
– Eventual consistency - QUORUM
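The repair step itself can be sketched in Python. The sketch assumes each replica maps a key to a (value, timestamp) pair and does not model whether the repair happens before or after the response is sent. The function name read_with_repair is hypothetical.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and update stale replicas in place."""
    answers = [replica[key] for replica in replicas if key in replica]
    if not answers:
        return None
    newest = max(answers, key=lambda pair: pair[1])  # highest timestamp wins
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # bring a stale or missing copy up to date
    return newest[0]
```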
46. 46 www.ExigenServices.com
Bloom filters
On write:
– several hashes are generated per key
– the bit for each hash is set
On read:
– hashes are generated for the key
– if all bits for these hashes are set, then the
key may exist in the SSTable
– if at least one bit is unset, then the key has
never been written to the SSTable
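The write and read steps above fit in a few lines of Python. This is a minimal sketch: the filter size, the number of hashes and the use of seeded SHA-256 are arbitrary choices for illustration, not Cassandra's actual parameters or hash function.

```python
import hashlib


class BloomFilter:
    def __init__(self, size=1024, hash_count=3):
        self.size = size
        self.hash_count = hash_count
        self.bits = [False] * size

    def _positions(self, key):
        # Derive several bit positions per key by seeding one hash function.
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for position in self._positions(key):
            self.bits[position] = True

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[position] for position in self._positions(key))
```

A negative answer lets a read skip the SSTable entirely, while a positive answer still requires checking the SSTable because false positives are possible.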
An endpoint snitch can be wrapped with a dynamic snitch, which monitors read latencies and avoids reading from hosts that have slowed down (due to compaction, for instance).