This deck covers common architectural choices for building large analytic systems in response to new hardware, new software, and growing data volumes. It outlines storage principles (a massively parallel processing (MPP) cluster with no shared disk, sorted data, data distributed by value rather than in chunks, column-oriented storage, and immutable write-once storage) and processing techniques (large sequential I/O, trading CPU for I/O bandwidth, and bringing the processing to the data). It also introduces Vertica's Community Edition.
Xldb2011 wed 1415_andrew_lamb-buildingblocks
1. Building Blocks for Large Analytic Systems
Andrew Lamb (alamb@vertica.com)
Vertica Systems, an HP Company
Oct 19, 2011
VERTICA CONFIDENTIAL DO NOT DISTRIBUTE
2. Outline
• Emergence of new hardware, software and data volume is driving the wave of new analytic systems in spite of mature existing systems.
• Common architectural choices when building these new systems
• Choices we made when building the Vertica Analytic Database
• Broadly applicable (and widely used)
3. Architectural Changes
• Driven by changing
– Hardware: (really) cheap x86_64 servers, high capacity disk, high speed networking
– Software: Linux, Open Source
– Requirements: Data Deluge
• Hard for Legacy Software
– Compatibility is brutal
– Linux x86_64 today
– Solaris, HPUX, IRIX, etc. 10 years ago
• Storage Organization and Processing Principles
[Figure: "Better-ness" plotted against Time]
4. [Storage + Processing] MPP (no shared disk)
• Any modern system should run on a cluster of nodes, scale up/down
• Mid range servers are really cheap (to rent or own)
• Aggregate available resources are enormous and scalable:
– I/O Bandwidth (Disk and Network)
– Cores, Memory, etc.
[Figure: an SMP server (CPUs, system bus, main memory/cache, shared disk) compared with MPP servers (network, local disk)]
5. [Storage] Keep Your Data Sorted
• For large data sets, extra indexes are expensive to maintain
• Much better to index the data itself by sorting it
• Easy to find what you are looking for, reduces seeks
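A minimal sketch of the "index the data itself by sorting it" idea, assuming a hypothetical table of (customer_id, amount) rows kept sorted by customer_id. The names rows, keys and range_scan are illustrative and not from the talk. Because the data is sorted, a range lookup is two binary searches followed by one contiguous slice, with no separate index to maintain.

```python
from bisect import bisect_left, bisect_right

# Hypothetical table: (customer_id, amount) rows, stored sorted by customer_id.
rows = sorted([(7, 10.0), (3, 5.5), (7, 2.5), (1, 9.0), (5, 4.0), (3, 1.5)])

# The sort-key column on its own; it is already in order because rows is sorted.
keys = [customer_id for customer_id, _ in rows]


def range_scan(lo, hi):
    """Return all rows with lo <= customer_id <= hi as one contiguous slice."""
    start = bisect_left(keys, lo)   # first position with key >= lo
    end = bisect_right(keys, hi)    # first position with key > hi
    return rows[start:end]


matches = range_scan(3, 5)
```

On disk, the contiguous slice corresponds to one seek followed by a sequential read, which is where the "reduces seeks" benefit would come from.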
6. [Storage] Distribute Data by Value (not chunks)
• “Sharding” -- Distribute data so you can easily find it again (not round robin)
• Segmenting in analytics layer simplifies app layer
• Round Robin computations won't scale: need to swizzle data around the cluster to do most useful things
• Replication for high availability: not logs
[Figure: layout of data segments A, B and C (numbered 1 to 3) across the cluster nodes]
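A minimal sketch of distributing rows by value, assuming a three-node cluster and a simple hash-modulo placement. NUM_NODES, node_for, replica_for and distribute are hypothetical names, and the "next node" replica rule is only one possible choice, not something the talk specifies. The point is that every row with the same key lands on the same node, so a per-key computation needs no data movement.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # assumed cluster size


def node_for(key: str) -> int:
    """Pick the owning node from the key's value (stable across processes)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES


def replica_for(key: str) -> int:
    """Assumed replica placement: the node after the owner, for availability."""
    return (node_for(key) + 1) % NUM_NODES


def distribute(rows):
    """Group (key, value) rows by owning node."""
    nodes = defaultdict(list)
    for key, value in rows:
        nodes[node_for(key)].append((key, value))
    return nodes


rows = [("alice", 1), ("bob", 2), ("alice", 3), ("carol", 4), ("bob", 5)]
placement = distribute(rows)

# Every "alice" row sits on exactly one node, so it can be found again directly.
nodes_holding_alice = {
    node for node, part in placement.items() if any(k == "alice" for k, _ in part)
}
```

With round-robin placement the "alice" rows could be spread over all nodes, and a per-customer aggregate would first have to reshuffle them.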
7. [Storage] Store the Data in Columns
• Rarely do all fields of a data set appear in analytic queries
• Really don't want to waste I/O for data you don't need
• Columns let you be clever about applying predicates individually, further reducing I/O
• Not appropriate for row lookup or transaction processing systems
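A minimal sketch of why column storage saves I/O, assuming a hypothetical three-column table held as one list per column. The read_column helper and the columns_read log are illustrative only; they stand in for reading a column file from disk. The query applies its predicate to one column, then fetches only the matching positions from a second column, and never touches the third.

```python
# Hypothetical table stored column by column: one list per field.
columns = {
    "region": ["east", "west", "east", "north"],
    "amount": [10, 20, 30, 40],
    "comment": ["a", "b", "c", "d"],  # never touched by the query below
}

columns_read = []  # records which columns the query had to read


def read_column(name):
    """Stand-in for reading one column from disk."""
    columns_read.append(name)
    return columns[name]


def sum_amount_for_region(region):
    """SUM(amount) WHERE region = ?, touching only the two columns involved."""
    # Apply the predicate to the region column alone to get matching positions.
    positions = [i for i, r in enumerate(read_column("region")) if r == region]
    # Fetch only those positions from the amount column.
    amounts = read_column("amount")
    return sum(amounts[i] for i in positions)


total = sum_amount_for_region("east")
```

In a row-oriented layout the same query would read every field of every row, including the unused comment field.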
8. [Storage] Write it Once, Don't Modify
• Physical use of disk objects should be write once (append only)
• System should present the illusion of mutability
• Immutable storage drastically simplifies coherence in a distributed system
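A minimal sketch of append-only storage that still presents the illusion of mutability. The AppendOnlyTable class and its tombstone records are assumptions made for illustration, not a description of how Vertica implements this. Updates and deletes only ever append new records; reads resolve the most recent record for a key.

```python
class AppendOnlyTable:
    """Key/value table whose physical log is never modified, only appended to."""

    def __init__(self):
        self._log = []  # list of (key, value, deleted) records

    def put(self, key, value):
        # An "update" is just a newer record for the same key.
        self._log.append((key, value, False))

    def delete(self, key):
        # A "delete" is a tombstone record; older records stay untouched.
        self._log.append((key, None, True))

    def get(self, key):
        # Scan from newest to oldest and return the latest visible value.
        for k, value, deleted in reversed(self._log):
            if k == key:
                return None if deleted else value
        return None

    def physical_size(self):
        return len(self._log)


table = AppendOnlyTable()
table.put("a", 1)
table.put("a", 2)        # logical update, physical append
value_after_update = table.get("a")
table.delete("a")        # logical delete, physical append
value_after_delete = table.get("a")
```

Because existing records never change, copies of them on other nodes cannot become stale, which is one way to read the slide's point about coherence.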
9. [Processing] Use Large Sequential IOs
• Spinning disks are very good at large sequential IOs
• You really don't want to whipsaw the read head (another reason why secondary indexes are bad)
• Especially useful with sorted data
[Figure: bar chart "Random vs Sequential Reads", throughput in MB/s on a 0 to 40 axis, with Random and Sequential series]
10. [Processing] Trade CPU for I/O Bandwidth
• Use very aggressive compression, even if it seems/is wasteful (keep getting more cores)
• Example: data type specific encoding, then LZO before actually writing to disk
• Hide additional latency with execution pipeline parallelism
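A minimal sketch of the two-stage idea in the example bullet: a data-specific encoding first, then a general-purpose compressor before writing. Two assumptions are made here. Run-length encoding stands in for the "data type specific encoding", and zlib stands in for LZO, which is not in the Python standard library. The function names are hypothetical.

```python
import json
import zlib


def rle_encode(values):
    """Run-length encode a column: [[value, run_length], ...]."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    return [v for v, count in runs for _ in range(count)]


def compress_column(values):
    """Stage 1: RLE. Stage 2: general compression (zlib here, LZO in the talk)."""
    encoded = json.dumps(rle_encode(values)).encode("utf-8")
    return zlib.compress(encoded)


def decompress_column(blob):
    return rle_decode(json.loads(zlib.decompress(blob)))


# A sorted, low-cardinality column compresses extremely well under RLE.
column = ["east"] * 1000 + ["north"] * 1000 + ["west"] * 1000
raw = json.dumps(column).encode("utf-8")
blob = compress_column(column)
```

The extra CPU spent encoding and decoding buys a much smaller number of bytes to move through the disk and network, which is the trade the slide argues for.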
11. [Processing] Mess with the Data where it Lives
• Bring your processing to the data, not data to the processing
• System should push computation down close to data (even if calculation turns out to be redundant)
• Example: multi-phase aggregation, each phase tries to aggregate intermediates before passing up the memory / node hierarchy
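A minimal sketch of multi-phase aggregation, assuming three nodes that each hold some (region, amount) rows. The names local_aggregate, merge_partials and node_rows are hypothetical. Each node first sums its own rows, and only the small per-node partial results are passed up and merged.

```python
from collections import Counter


def local_aggregate(rows):
    """Phase 1, run where the data lives: sum amounts per key on one node."""
    partial = Counter()
    for key, amount in rows:
        partial[key] += amount
    return partial


def merge_partials(partials):
    """Phase 2, run higher up: combine the per-node partial sums."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts rather than replacing
    return total


# Hypothetical rows held on three different nodes.
node_rows = [
    [("east", 10), ("west", 5), ("east", 1)],
    [("west", 7), ("east", 2)],
    [("north", 4), ("north", 6)],
]

partials = [local_aggregate(rows) for rows in node_rows]
totals = merge_partials(partials)

rows_on_nodes = sum(len(rows) for rows in node_rows)
records_shipped = sum(len(partial) for partial in partials)
```

Here seven raw rows shrink to five partial records before anything crosses the network; with realistic data volumes and few distinct keys the reduction would be far larger.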
12. [Processing] Declarative & Extendable
• Give users a declarative query language for most tasks
• Writing procedural code for simple queries is wasteful
• Provide procedural extensions for complex analysis
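A minimal sketch contrasting a declarative query with the equivalent procedural code. SQLite (via Python's standard sqlite3 module) is used purely as a stand-in for an analytic database, and the sales table is hypothetical. Both versions compute the same per-region totals; the declarative one states what is wanted and leaves the execution strategy to the engine.

```python
import sqlite3

rows = [("east", 10), ("west", 5), ("east", 1), ("west", 7)]

# Declarative: describe the result, let the engine decide how to compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
declarative = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

# Procedural: spell out every step of the same computation by hand.
procedural_totals = {}
for region, amount in rows:
    procedural_totals[region] = procedural_totals.get(region, 0) + amount
procedural = sorted(procedural_totals.items())
```

The procedural loop is fine at this size, but it hard-codes one execution plan; the query form can be parallelized or pushed down to the data by the system without any change to the user's code.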
13. Conclusion
• Emergence of new hardware, software and data volumes implies certain architectural choices in modern big data analytic systems.
• Commonly observed in new systems
• Vertica (unsurprisingly) features all of them
14. Introducing Community Edition
• Free version of Vertica
– help steward better analytics and the democratization of data - closed source, but open access!
• All of the features and advanced analytics of Vertica Enterprise Edition
• Seamless upgrade to Enterprise Edition
• Limited to 1 TB raw data on 3-node hardware cluster
• Revamping Community area of Vertica’s website for knowledge sharing, third party tools, and code sharing
• Launching academic and non-profit research use program as well
• Sign up at: www.vertica.com/community