The NoSQL movement has rekindled interest in data storage solutions. A few years ago, in limited-scale systems, storage choices for programmers and architects were simple: relational databases were almost always the answer. However, the advent of the cloud and ever-increasing user bases have given rise to larger-scale systems. Relational databases cannot always scale to meet the needs of those systems, and the NoSQL movement has proposed many alternatives.
A programmer who wants to select a data model now has to choose from a wide variety of options: local memory, relational databases, files, distributed caches, column-family storage, document storage, name-value pairs, graph databases, service registries, queues, tuple spaces, and so on. Furthermore, there are different layering and access choices, such as accessing data directly, using an object-relational mapping layer like Hibernate/JPA, or using data services. Users also need to worry about scaling storage along multiple dimensions: the number of databases, the number of tables, the amount of data per table, request frequency, and request types (read/write ratio).
Consequently, choosing the right data model for a given problem is no longer trivial; such a choice needs a clear understanding of the different storage offerings, their similarities and differences, and the associated tradeoffs. We faced the same problem while designing the data interfaces for the Stratos Platform as a Service (PaaS) offering, and in this talk we would like to share our findings and experiences from that work. We will present a survey of different data models, their differences and similarities, tradeoffs, and killer apps for each model. We believe participants will walk away with a broader understanding of data models and guidelines on which model to use when.
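To make the contrast between data models concrete, here is a toy sketch (not from the talk; all names and fields are invented) showing the same "user" record in three of the models the abstract lists: a name-value pair, a document store, and relational tables with a join.

```python
# Name-value (key-value) pair: an opaque value looked up by a single key.
kv_store = {"user:42": '{"name": "Ann", "city": "Colombo"}'}

# Document model: the store understands the structure, so fields are queryable.
doc_store = {"users": [{"id": 42, "name": "Ann", "city": "Colombo"}]}

# Relational model: rows with a fixed schema, joined across tables.
users_table = [(42, "Ann", 7)]   # (id, name, city_id)
cities_table = [(7, "Colombo")]  # (id, name)

def user_city_relational(user_id):
    """Join users to cities, as SQL would: SELECT c.name ... JOIN cities c ..."""
    for uid, _name, cid in users_table:
        if uid == user_id:
            return next(name for c_id, name in cities_table if c_id == cid)
    return None

print(user_city_relational(42))  # Colombo
```

The key-value form is the cheapest to scale out but pushes all interpretation onto the application; the relational form supports joins at the cost of a fixed schema, which is one of the tradeoffs the talk surveys.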
Scalable Persistent Message Brokering with WSO2 Message Broker - Srinath Perera
A highly available and fast message broker ensures high-volume message delivery and supports reliable business operations. While most open source message broker projects don't match commercial offerings, WSO2 Message Broker 2.0 offers broker domains, a distributed architecture, and extreme scalability. In this session, Srinath will describe the new distributed message broker architecture, message sharing across brokers via a Cassandra cluster, and an AMQP-based, JMS-style API. He will discuss the WSO2 Message Broker 2.0 design and explore real-world use cases.
Finding the Right Data Solution for your Application in the Data Storage Hays... - DATAVERSITY
This is a presentation from Open Source Bridge that Thomas used for his session. It explains why redundant power and backup power are so critical, and why you should always back up your data.
Storage Systems for Highly Scalable Systems Presentation - andyman3000
Presentation from http://www.hfadeel.com/Blog/?p=151 on what kinds of storage systems players like Facebook and Google use to meet their extreme scalability requirements.
Creating a RAD Authoritative Data Environment - anicewick
Sharing data in agencies can be a burden: with users placing data in numerous desktop packages, sharing becomes impossible. However, new RAD tools allow quick web applications to be developed that replace Excel, MS Access, and FileMaker data stores with real, controlled, authoritative database integration.
This presentation defines both the problem space, and the proposed solution.
See www.data4USA.com for more information
As relational and NoSQL databases continue to adopt characteristics of each other, it becomes more important to understand that ACID-BASE is a spectrum. Instead of making a binary choice between ACID and BASE, developers and designers choose a combination of varying levels of data consistency, availability, and network partition tolerance. This presentation briefly describes the ACID-BASE spectrum, the CAP theorem, and how to find the right balance of trade-offs for your application.
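One classic way to see the ACID-BASE spectrum as a dial rather than a switch is quorum-based replication. The sketch below is a toy illustration (not taken from the presentation): with N replicas, W write acknowledgements, and R read acknowledgements, the condition R + W > N guarantees every read quorum overlaps the latest write quorum; relaxing it trades consistency for availability, i.e. moves toward BASE.

```python
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """True when every read quorum must intersect every write quorum,
    so a read is guaranteed to see the most recent acknowledged write."""
    return r + w > n

# Cassandra-style settings with N = 3 replicas:
assert is_strongly_consistent(3, 2, 2)      # QUORUM reads + QUORUM writes
assert not is_strongly_consistent(3, 1, 1)  # ONE/ONE: eventually consistent
```

Choosing R and W per operation is exactly the kind of graded trade-off the abstract describes: the same store can serve some requests with ACID-like guarantees and others with BASE-like availability.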
Understanding Metadata: Why it's essential to your big data solution and how ... - Zaloni
In this O'Reilly webcast, Ben Sharma (cofounder and CEO of Zaloni) and Vikram Sreekanti (software engineer in the AMPLab at UC Berkeley) discuss the value of collecting and analyzing metadata, and its potential to impact your big data solution and your business.
Watch the replay here: http://oreil.ly/28LO7IW
Book: Software Architecture and Decision-Making - Srinath Perera
Uncertainty is the leading cause of mistakes made by practicing software architects. The primary goal of architecture is to handle uncertainty arising from use cases as well as architectural techniques. The book discusses how to make architectural decisions and manage uncertainty. From the book, you will learn common problems encountered while designing a system, a default solution for each, more complex alternatives, and the 5Q & 7P (Five Questions and Seven Principles) that help you choose.
Book: https://amzn.to/3v1MfZX
Blog: http://tinyurl.com/swdmblog
Six-minute video: https://youtu.be/jtnuHvPWlYU
We have critically evaluated how AI will shape integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study.
We observe that AI can significantly impact integration use cases, and we identify 13 AI-based use case classes for integration. Points to note include:
Enabling AI in an enterprise involves collecting, cleaning up, and creating a single representation of data as well as enforcing decisions and exposing data outside, each of which leads to many integration use cases. Hence, AI indirectly creates demand for integration.
AI needs data, which in some cases leads to significant competitive advantages. The need to collect data will drive vendors to offer most AI products in the cloud through APIs.
Due to the lack of expertise and data, custom AI model building will be limited to large organizations. It is hard for small and medium-size organizations to build and maintain custom models.
The Role of Blockchain in Future Integrations - Srinath Perera
We have critically evaluated blockchain-based integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study. Based on our analysis, we observe that blockchain can significantly impact integration use cases.
In our paper, we identify 30-plus blockchain-based use cases for integration and four architecture patterns. Notably, each use case we identified can be implemented using one of the architecture patterns. Furthermore, we also discuss challenges and risks posed by blockchains that would affect these architecture patterns.
Our webinar presents a critical analysis of serverless technology and our thoughts about its future. We use Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, as the methodology of our study. Based on our analysis, we believe that serverless can significantly impact applications and software development workflows.
We’ve also made two further observations:
Limitations, such as tail latencies and cold starts, are not deal breakers for adoption. There are significant use cases that can work with existing serverless technologies despite these limitations.
We see a significant gap in required tooling and IDE support, best practices, and architecture blueprints. With proper tooling, it is possible to train existing enterprise developers to program with serverless. If proper tools are forthcoming, we believe serverless can cross the chasm in 3-5 years.
A detailed analysis can be found here: A Survey of Serverless: Status Quo and Future Directions. Join our webinar as we discuss this study, our conclusions, and evidence in detail.
1. Blockchain potential impact is real. If successful, Blockchain technologies can transform the way we live our day to day lives.
2. We believe the technology is ready for limited applications in digital currency, lightweight financial systems, ledgers (of identity, ownership, status, and authority), provenance (e.g. supply chains and other B2B scenarios), and disintermediation, which we believe will happen in the next three years.
3. However, with other use cases, blockchain faces significant challenges such as performance, irrevocability, the need for regulation, and the lack of consensus mechanisms. These are hard problems, and it will likely take years to find answers to them.
4. It is not clear whether blockchain can sustain the current level of effort for an extended period of 5+ years. There are many startups, and they run the risk of running out of money before markets are ready. Failure of startups can inhibit further funding and investment.
5. The value and necessity of decentralization compared to centralized and semi-centralized alternatives are not clear.
A Visual Canvas for Judging New Technologies - Srinath Perera
In the fast-changing technology world, the landscape shifts faster and faster. The agents of these changes are new emerging technologies, which sometimes create, destroy, or transform entire segments. In a shifting world, prevailing advantages are fleeting. Organizations that can master change and ride technology waves own the future.
Not all emerging technologies live up to their promise. Every year, as part of annual planning, most organizations need to decide the relevance, impact, and probability of success of emerging technologies and pick their bets. Although it is a regular decision, there is no widely accepted framework for evaluating emerging technologies.
As a solution to this problem, we present the Emerging Technology Analysis Canvas (ETAC), a framework to assess an individual emerging technology. Inspired by the Business Model Canvas, it represents different aspects of a technology visually on a single page. The approach includes a set of questions that probe the technology, arranged around a logical narrative. The visual representation is concise, compact, and comprehensible at a glance.
The talk discusses how analytics can attack privacy and what we can do about it. It covers legal responses (e.g. GDPR) as well as technical responses (differential privacy and homomorphic encryption).
The video is at https://www.facebook.com/eduscopelive/videos/314847475765297/ starting from 1.18.
Blockchain is often cited as one of the most impactful technologies, along with AI. It has attracted many startups, venture investments, and academic research. If successful, blockchain technologies can transform the way we live our day-to-day lives.
However, blockchain faces significant challenges, such as performance, irrevocability, the need for regulation, and the lack of consensus mechanisms. These are hard problems, and it will likely take at least 5-10 years to find answers to them.
Given the risk involved as well as the significant potential returns, we recommend a cautiously optimistic approach for blockchain with the focus on concrete use cases.
Today's Technology and Emerging Technology Landscape - Srinath Perera
We have seen the rise and fall of many technologies, some disappearing without a trace while others redefine the world. Collectively they have shaped our world beyond recognition. In this talk, Srinath will start with past technologies, exploring their behavior. Then he will explore the current middleware landscape, its composition, and the relationships between different segments. He will discuss significant developments and their future. Further, he will discuss emerging technologies, the forces that shape them, and the promise of each, and finally speculate about their evolution. You will walk away with knowledge of the evolution of middleware, the status quo, and how, at WSO2, we think those technologies will evolve.
Some died, some get by, but some have woven themselves into today's middleware so deeply that we do not notice them. The point I want to make is that not all emerging technologies are fads. Some are, and some are too early, like AI. But some are lasting.
The Rise of Streaming SQL and Evolution of Streaming Applications - Srinath Perera
First-generation stream processors, such as Apache Storm, wanted us to write code. It was a great start. However, when building real-world apps, which are used for a long time and evolve, writing code gets us into trouble.
If we want to query a database, or query data stored in storage with Hadoop, we use SQL. Why can't we query streaming data using SQL? We can. Almost all open source stream processors, including Storm, Flink, and Kafka, have switched to SQL.
In this webinar, Srinath will talk about the evolution of stream processing, streaming SQL, the status quo, and what this means to stream applications. He will also dissect the experience of building streaming applications by exploring common patterns and pitfalls.
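To give a feel for what a streaming SQL query computes, here is a toy sketch (not WSO2's implementation; the query text is an invented example) of the per-event result of a sliding-window aggregation such as "SELECT avg(temp) FROM SensorStream WINDOW of length 3".

```python
from collections import deque

def sliding_avg(events, length=3):
    """Emit the running average over the last `length` events,
    one output per input event, as a length-bounded sliding window would."""
    window = deque(maxlen=length)  # the oldest event falls out automatically
    for value in events:
        window.append(value)
        yield sum(window) / len(window)

print(list(sliding_avg([10, 20, 30, 40])))  # [10.0, 15.0, 20.0, 30.0]
```

The point of streaming SQL is that this windowing, expiry, and aggregation logic is declared in the query rather than hand-coded as above, which is exactly the trouble with first-generation code-centric stream processors that the abstract describes.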
Analytics and AI: The Good, the Bad and the Ugly - Srinath Perera
Analytics lets us question the data, which in effect questions the world around us. This lets us understand, monitor, and shape the world. AI lets us discover connections, predict possible futures, and automate tasks.
These twin technologies can change the world around us. On one hand, they make us efficient, connected, and fulfilled. At the same time, the change in the status quo can replace jobs, affect lives, and build biases into our systems that can marginalize millions.
In this talk, we will discuss core ideas behind analytics and AI, their possible impact, both good and bad outcomes, and challenges.
The dawn of digital businesses is upon us, with reimagined business models that make the best use of digital technologies such as automation, analytics, integration and cloud. Digital businesses are efficient, continuously optimizing, proactive, flexible and are able to fully understand their customers. Analytics is a key technology that helps in doing so. It acts as the eyes and ears of the system and provides a holistic view on the past and present so that decision-makers can predict what will happen in the future. This webinar will explore
Why becoming a digital business is not a choice
The role of analytics in digital transformation with examples
How best to leverage state of the art analytics technology
SoC Keynote: The State of the Art in Integration Technology - Srinath Perera
This talk outlines the state of the art of enterprise software and how we got here, as I see it. The second part describes Ballerina, a new programming language WSO2 has built for enterprise computing.
It was presented as a keynote at the 11th Symposium and Summer School on Service-Oriented Computing.
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
PHP Frameworks: I want to break free (IPC Berlin 2024) - Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35 Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly how long it takes to uncover interesting behavior in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security-analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean, optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW 2022).
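The core idea can be illustrated with a toy sketch (a simplification, not the paper's actual algorithm): call a seed byte "uninteresting" if no mutation of that byte ever changes the target's observable behavior, so a fuzzer need not spend mutations on it. The `parse` target below is an invented stand-in for a real program under test.

```python
def parse(data: bytes) -> str:
    """Stand-in target program: only the first two bytes affect behavior."""
    return "valid" if data[:2] == b"OK" else "invalid"

def uninteresting_bytes(seed: bytes, target) -> set:
    """Return seed positions where every possible byte value leaves the
    target's behavior unchanged -- candidates for removal before fuzzing."""
    baseline = target(seed)
    boring = set()
    for i in range(len(seed)):
        if all(target(seed[:i] + bytes([b]) + seed[i + 1:]) == baseline
               for b in range(256)):
            boring.add(i)
    return boring

seed = b"OK__padding__"
print(sorted(uninteresting_bytes(seed, parse)))  # every index except 0 and 1
```

A real implementation would use coverage feedback rather than output equality, and sampled rather than exhaustive mutation, but the payoff is the same: a leaner seed concentrates the fuzzer's budget on bytes that can actually steer execution.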
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf - Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that lead to closing the deal.
Elevating Tactical DDD Patterns Through Object Calisthenics - Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Finding the Right Data Solution for Your Application in the Data Storage Haystack
1. Finding the Right Data Solution for Your Application in the Data Storage Haystack
Srinath Perera, Ph.D.
Senior Software Architect, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Research Scientist, Lanka Software Foundation
2. In Search of the Right Data Model
§ There have been many data models proposed (read Stonebraker's "What Goes Around Comes Around" for more details):
o Hierarchical (IMS): late 1960s and 1970s
o Directed graph (CODASYL): 1970s
o Relational: 1970s and early 1980s
o Entity-Relationship: 1970s
o Extended Relational: 1980s
o Semantic: late 1970s and 1980s
§ Database systems (SQL) together with transactions have been the de facto data solution.
3. For many years, the choice of data storage was an easy one (use an RDBMS)
4. Increasing Scale of Systems
§ However, the scale of systems is changing due to:
o Increasing user bases of systems
o Mobile devices and online presence
o Cloud computing and multicore systems
§ Options for scaling up an RDBMS:
o Put it on a bigger machine
o Replicate (cluster) the database to 2-3 more nodes. But this approach does not scale further.
o Partition the data across many nodes (distribute, a.k.a. sharding). However, JOIN queries across many nodes are hard, and sometimes too slow. This often needs custom code and configuration. Transactions also do not scale well.
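The sharding trade-off above can be sketched in a few lines: hashing the key routes a single-key lookup to exactly one node, while any cross-shard JOIN has to touch all of them. This is a minimal illustration; the node names and key format are hypothetical.

```python
# Hash-based sharding sketch: route a record to one of several database
# nodes by hashing its key. Node names are hypothetical.
import hashlib

NODES = ["db-node-0", "db-node-1", "db-node-2", "db-node-3"]

def shard_for(key: str, nodes=NODES) -> str:
    """Pick the node that stores the row for `key`."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# A lookup by key touches exactly one node...
node = shard_for("user:42")
# ...but a JOIN across all users would have to gather rows from every
# node, which is why cross-shard JOINs need custom code and are slow.
```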
5. CAP Theorem, Transactions, and Storage
§ The RDBMS model provides two things:
o The relational model with SQL
o ACID transactions (Atomicity, Consistency, Isolation, Durability)
§ It was a classic one-size-fits-all solution, and it worked for quite some time.
§ However, the CAP theorem says that you cannot have it all:
o Consistency, Availability, and Partition Tolerance: pick two!
§ But there are many use cases that do not need all RDBMS features; when those are dropped, systems can scale (e.g. Google BigTable).
§ However, to use such systems, one has to understand and exploit application-specific behavior.
6. NoSQL and Other Storage Systems
§ Large internet companies hit the problem first. They built systems specific to their own problems, and those systems did scale:
o Google BigTable
o Amazon Dynamo
§ Soon many others followed, and most of them are free and open source. There are now a couple of dozen such systems.
§ Among the advantages of NoSQL are:
o Scalability
o Flexible schema
o Designed to scale and support fault tolerance out of the box
7. However, with NoSQL solutions, choosing a data storage is no longer simple.
8. Selecting the Right Data Solution
§ What are the right questions to ask?
§ Categorize the answers to each question
§ Take different cases based on the different answers and make recommendations!
9. What Are the Right Questions?
o Types of data
- Structured, semi-structured, unstructured
o Need for scalability
- Number of users
- Number of data items
- Size of files
- Read/write ratio
o Types of queries
- Retrieve by key
- WHERE clauses
- JOIN queries
- Offline queries
o Consistency
- Loose consistency
- Single-operation consistency
- Transactions
10. 4Q > Types of Data > Unstructured Data
§ The data do not have a particular structure and are often retrieved through a key (name).
o E.g. file systems
§ Humans are good at processing unstructured data, but computers are not.
§ Such data are often stored in storage but consumed by humans at the end of the pipeline (e.g. a document repository).
§ One common use case is building structured data from unstructured data.
§ Metadata is often associated with the data to help searching.
11. 4Q > Types of Data > Structured Data
§ Have a structure, often described through a schema
§ Often a table-like 2D structure is used, but other structures are also possible
§ The main advantage of the structure is search
§ The schema can be provided at deployment time or at runtime (dynamic schema)
§ The schema can be used to:
o Validate data
o Support user-friendly search
o Optimize storage and queries
12. 4Q > Types of Data > Semi-structured Data
§ The structure is not fully defined, but there is some inherent structure.
§ For example:
o XML documents, where data are stored in a tree-like structure
o Graph data
o Data structures like lists and arrays
§ Support queries based on the structure
§ But processing the data often needs custom code
13. 4Q > Search
§ Unstructured data: no structure to support search
o Search based on an inverted index
o Search through properties
§ Semi-structured data
o To search XML (or any tree-like structure), use XPath or XQuery
o Tuple spaces can be queried through tuple-space templates
o Data registries can be searched for entries that match given metadata descriptions (search by properties)
o Graphs can be queried based on connectivity
§ Structured data
o Retrieve by key
o WHERE clauses
o Queries with JOINs
o Offline queries
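Searching tree-like data by structure, as described above for XML, can be shown with a path expression. Python's standard library supports a limited XPath subset; the document below is made up for illustration.

```python
# Query semi-structured (tree-like) data by structure using XPath-style
# path expressions, via Python's standard library ElementTree module.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<catalog>
  <book genre="db"><title>Readings in Database Systems</title></book>
  <book genre="os"><title>Operating Systems</title></book>
</catalog>
""")

# Select titles of books whose genre attribute is "db"
titles = [t.text for t in doc.findall("./book[@genre='db']/title")]
print(titles)  # ['Readings in Database Systems']
```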
14. 4Q > Consistency and Scalability
§ Scalability: the ability to handle more users, more data, or larger files by adding more nodes. We will use 3 categories:
o Small systems (can be handled with 1-3 nodes)
o Scalable systems (can be handled with about 10 nodes)
o Highly scalable systems (anything larger; can be 100s or 1000s of nodes)
§ Consistency: how to keep the replicas of the same data on many nodes synced up, and how they can be updated without data corruption. We will use 3 categories:
o Transactional: a series of operations updated in an ACID manner
o Atomic operation: a single operation, updated in all replicas
o Eventual consistency: data will be eventually consistent
16. Data Storage Implementations
§ Expectations from data storage:
o Reliably store the data
o Efficient search and retrieval of data whenever needed
o Data management: delete and update data
17. Challenges of Data Storage
§ Reliability
o Replicating data
o Creating backups and recovering using backups
§ Security
§ Scaling and parallel access
o Distribution and replication
o ACID transactions
§ Availability
o Data replication
§ Vendor lock-in
o Interoperability, standard query languages
§ Simple user experience
o Hide the physical location of data
o Provide a simple API and security model
o Expressive query languages
18. Data Storage Choices

Storage                 | Type       | Advantages                         | Disadvantages                                  | Key | Where                 | Joins | Transactions   | Scale    | Flexible schema
Local memory            | Structured | Very fast                          | Not durable                                    | Yes | No                    | No    | No unless STMs | No       | Yes
Relational/SQL          | Structured | Standardized                       | Rigid schema; good for read-oriented use cases | Yes | Yes                   | Yes   | Yes            | Moderate | No
Column families (NoSQL) | Structured | High write performance, replicated | Not transactional, no online joins             | Yes | Yes (secondary index) | No    | No             | High     | Yes
Document DBs            | Structured | High write performance, replicated | Not transactional, no online joins             | Yes | Yes (views)           | No    | No             | Yes      | Yes
Object databases        | Structured | Easy to integrate with programming languages |                                      | Yes | Yes                   | Yes   | Yes            | No       | No
19. Data Storage Choices (continued)

Storage                            | Type            | Advantages                                         | Disadvantages                                  | Key | Search                        | Transactions | Scale    | Flexible schema
Files                              | Unstructured    | Save big files whose format is not understood      | No structured search on content                | Yes | Indexing                      | No           | Moderate | Yes
Data registries/Metadata catalogs  | Unstructured    | Metadata search                                    |                                                | Yes | Property-based search (Where) | No           | Moderate | Yes
Queues                             |                 | Representation of a flow of messages/tasks over time |                                              | Yes | N/A                           | No           | Yes      | Yes
Triple stores                      |                 | Used for inference; very fast relationship processing |                                             | Yes | Relationship search           | No           | No       | Yes
XML databases                      |                 | XML native                                         |                                                |     | XPath/XQuery                  |              |          |
Distributed cache                  |                 | Fast, replicated                                   | No search                                      | Yes | No                            | No           | Yes      | Yes
Key-value pairs                    |                 | High write performance, replicated                 | Model is too simple in some cases; not transactional | Yes | No                       | No           | Yes      | Yes
Graph DBs                          | Semi-structured | Very fast joins; natural to represent relationships | Not very scalable                             | Yes | Graph search                  | Yes          | Low      | N/A
21. How Should We Do This?
§ Consider structured, semi-structured, and unstructured data separately.
o Then drill down based on the other 3 properties: scale, consistency, and search.
§ The structured case is more complicated; the other two are a bit simpler.
§ Start by giving a de facto choice for each case.
22. Handling Structured Data
§ There are three main considerations: scale, consistency, and queries.

             Small (1-3 nodes)                 Scalable (~10 nodes)                   Highly Scalable (1000s of nodes)
             Loose      Operation  ACID        Loose      Operation  ACID             Loose   Operation  ACID
Primary Key  DB/KV/CF   DB/KV/CF   DB          KV/CF      KV/CF      Partitioned DB?  KV/CF   KV/CF      No
Where        DB/CF/Doc  DB/CF/Doc  DB          CF/Doc(?)  CF/Doc(?)  Partitioned DB?  CF/Doc  CF/Doc     No
JOIN         DB         DB         DB          ??         ??         ??               No      No         No
Offline      DB/CF/Doc  DB/CF/Doc  DB/CF/Doc   CF/Doc     CF/Doc     No               CF/Doc  CF/Doc     No

*KV: key-value systems, CF: column families, Doc: document-based systems
23. Handling Small-Scale Systems (1-3 nodes)

             Loose      Operation  ACID
Primary Key  DB/KV/CF   DB/KV/CF   DB
Where        DB/CF/Doc  DB/CF/Doc  DB
JOIN         DB         DB         DB
Offline      DB/CF/Doc  DB/CF/Doc  DB/CF/Doc

§ In general, using a DB for every case here might work.
§ Reasons for using options other than a DB:
o When there is a potential need to scale later
o High write throughput
§ KV is 1-D, whereas the other two are 2-D.

*KV: key-value systems, CF: column families, Doc: document-based systems
24. Handling Scalable Systems (~10 nodes)

             Loose   Operation  ACID
Primary Key  KV/CF   KV/CF      Partitioned DB?
Where        CF/Doc  CF/Doc     Partitioned DB?
JOIN         ??      ??         Partitioned DB??
Offline      CF/Doc  CF/Doc     No

§ KV, CF, and Doc can easily handle this case.
§ If DBs are used with data sharded across many nodes:
o Transactions might work, given that the participants in any one transaction are not too many.
o JOINs might need to transfer too much data between nodes.
o Also consider in-memory DBs like VoltDB.
§ Offline mode will work.
§ Most systems let users choose the consistency level, and loose consistency can scale more (e.g. Cassandra).

*KV: key-value systems, CF: column families, Doc: document-based systems
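The tunable consistency mentioned above (as in Cassandra) comes down to choosing how many replicas must acknowledge a write (W) and a read (R) out of N copies. The rule R + W > N guarantees every read quorum overlaps the latest write quorum. A minimal sketch, with illustrative numbers rather than any client API:

```python
# Quorum arithmetic behind tunable consistency: if the read and write
# quorums overlap (R + W > N), a read always sees the latest write.

def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """True when a read quorum must overlap the latest write quorum."""
    return r + w > n

# With 3 replicas, quorum writes (2) plus quorum reads (2) overlap:
print(is_strongly_consistent(n=3, w=2, r=2))  # True
# Loosely consistent, but faster and more available: W=1, R=1
print(is_strongly_consistent(n=3, w=1, r=1))  # False
```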
25. Highly Scalable Systems (1000s of nodes)

             Loose   Operation  ACID
Primary Key  KV/CF   KV/CF      No
Where        CF/Doc  CF/Doc     No
JOIN         No      No         No
Offline      CF/Doc  CF/Doc     No

§ Transactions do not work at this scale (CAP theorem).
§ The same goes for JOINs: the problem is that sometimes too much data needs to be transferred between nodes to perform the JOIN.
§ The offline case is handled through MapReduce. Even the JOIN case is OK offline, since there is time.

*KV: key-value systems, CF: column families, Doc: document-based systems
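The offline JOIN mentioned above can be done MapReduce-style: map both datasets to (join_key, record) pairs, shuffle by key, then join each key's group in the reducer. A single-process sketch with made-up data:

```python
# MapReduce-style reduce-side join sketch: group both sides by the join
# key (the "shuffle"), then emit the cross product per key (the "reduce").
from collections import defaultdict
from itertools import product

users = [(1, "alice"), (2, "bob")]                 # (user_id, name)
orders = [(1, "book"), (1, "pen"), (2, "lamp")]    # (user_id, item)

# Map + shuffle: group records from both sides by the join key
groups = defaultdict(lambda: ([], []))
for uid, name in users:
    groups[uid][0].append(name)
for uid, item in orders:
    groups[uid][1].append(item)

# Reduce: for each key, emit every (name, item) pair
joined = [(uid, n, i) for uid, (ns, its) in groups.items()
          for n, i in product(ns, its)]
print(sorted(joined))
# [(1, 'alice', 'book'), (1, 'alice', 'pen'), (2, 'bob', 'lamp')]
```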
26. Highly Scalable Systems + Primary-Key Retrieval

             Loose      Operation  ACID
Primary Key  KV/CF      KV/CF      No
Where        CF/Doc(?)  CF/Doc(?)  No
JOIN         No         No         No
Offline      CF/Doc     CF/Doc     No

§ This is (comparatively) the easy one.
§ It can be solved through DHT (distributed hash table) based solutions or architectures like OceanStore.
§ Both key-value storage (KV) and column families (CF) can be used, but the key-value model is preferred as it is more scalable.

*KV: key-value systems, CF: column families, Doc: document-based systems
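The DHT-based lookup described above is typically built on consistent hashing: nodes and keys are placed on a hash ring, and each key is owned by the first node at or after its hash. A minimal sketch (node names are hypothetical):

```python
# Consistent-hashing ring sketch: any client can locate the node that
# owns a key without a central index, and adding or removing a node
# only remaps the keys in its neighborhood.
import hashlib
from bisect import bisect

def _h(s: str) -> int:
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.ring = sorted((_h(n), n) for n in nodes)
        self.hashes = [h for h, _ in self.ring]

    def owner(self, key: str) -> str:
        """First node clockwise from the key's position on the ring."""
        idx = bisect(self.hashes, _h(key)) % len(self.ring)
        return self.ring[idx][1]

ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.owner("user:42")   # deterministic, no coordinator needed
```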
27. Highly Scalable Systems + WHERE

             Loose      Operation  Transactions
Primary Key  KV/CF      KV/CF      No
Where        CF/Doc(?)  CF/Doc(?)  No
JOIN         No         No         No
Offline      CF/Doc     CF/Doc     No

§ This is generally OK, but tricky.
§ CFs work through a secondary index that does scatter-gather (e.g. Cassandra).
§ Docs work through MapReduce views (e.g. CouchDB).
§ There is also Bissa, which builds an index for all possible queries (no range queries).
§ If you are doing this, you should do pilot runs and make sure things work.

*KV: key-value systems, CF: column families, Doc: document-based systems
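The scatter-gather pattern mentioned above can be sketched as follows: a coordinator sends the WHERE predicate to every shard, each shard filters locally, and the coordinator merges the partial results. The shards and records here are made up for illustration.

```python
# Scatter-gather sketch for a WHERE query over sharded data: every
# shard is queried (scatter), then partial results are merged (gather).
# This is why such queries cost more than a single-key lookup.

shards = [
    [{"id": 1, "city": "Colombo"}, {"id": 2, "city": "Genoa"}],
    [{"id": 3, "city": "Colombo"}],
    [{"id": 4, "city": "Austin"}],
]

def where(predicate):
    hits = []
    for shard in shards:                        # scatter: one query per shard
        hits.extend(r for r in shard if predicate(r))
    return sorted(hits, key=lambda r: r["id"])  # gather and merge

rows = where(lambda r: r["city"] == "Colombo")
print([r["id"] for r in rows])  # [1, 3]
```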
28. Handling Unstructured Data
§ Storage options:
o Distributed file systems are generally scalable (e.g. NFS), and HDFS (Hadoop) and Lustre are highly scalable versions.
o Metadata registries (e.g. Nirvana, SDSC Storage Resource Broker)
29. Handling Semi-Structured Data

                          Small scale (1-3 nodes)            Scalable (~10 nodes)                    Highly scalable
XML (queried via XPath)   XML DB, or convert to a            XML DB, or convert to a                 ??
                          structured model                   structured model
Graphs                    Graph DBs                          Graph DBs, if the graph can be          ??
                                                             partitioned
Data structures           Data structure servers,
                          object databases
Queues                    Distributed queues                 Distributed queues                      Distributed queues

§ Storage options
o The answer depends on the type of structure. If there is a server optimized for a given type, it is often much more efficient than using a DB (e.g. graph databases can support fast relationship search).
§ Search
o Very much custom, e.g. XML or any tree = XPath; graphs can support very fast relationship search.
30. Hybrid Approaches
§ Some solutions have many types of data and hence need more than one data solution (hybrid architectures).
§ For example:
o Using a DB for transactional data and CF for other data
o Keeping metadata and actual data separate for large data archives
o Using a graph DB to store relationship data while other data is in column-family storage
§ However, if transactions are needed, they have to be handled outside the storage (e.g. using Atomikos or ZooKeeper).
31. Other Parameters
§ The above list is not exhaustive, and there are other parameters:
o Read/write ratio: when it is high, scaling is easier
o High write throughput
o Very large data products: you will need a file system; perhaps keep the metadata in a data registry and store the data in a file system
o Flexible schema
o Archival use cases
o Analytical use cases
o Others …
32. Take-Home Message
There is no silver bullet.
You have to use the right tool for the job.
33. Sample Polyglot Architectures

PaaS          Structured (Relational)        Structured (NoSQL)      Unstructured
WSO2 Stratos  MySQL-based RDB as a Service   Cassandra as a Service  HDFS as a Service
Azure         MSSQL as a Service             MS NoSQL storage
AppEngine     Hosted MySQL                   BigTable

Our work on data solutions for WSO2 Stratos motivated this work.
You can try out the WSO2 Stratos data offerings at https://data.stratoslive.wso2.com/home/index.html
34. Conclusion
§ For the last 20 years or so, DBMSs were the de facto storage solution.
§ However, DBMSs could not always scale well, and many NoSQL solutions have been proposed instead.
§ As a result, it is no longer easy to find the best data solution for your problem.
§ We discussed many dimensions (types of data, scalability, queries, and consistency) and provided guidelines on when to use which data solution.
§ Your feedback and thoughts are most welcome. Contact me through srinath@wso2.com