CallFire is a cloud telephony provider that has experienced exponential growth, resulting in big data challenges. They considered various solutions like NFS, Gluster, HDFS, and Cassandra for scalable storage. Cassandra was selected because it met their requirements for fault tolerance, easy datacenter replication, scalable storage, and no single point of failure. CallFire uses a sharded Cassandra cluster with Tungsten Replicator to enable both NoSQL and SQL access to their universal data. While big data is commonly seen as a problem only for large companies, CallFire's experience shows how even small startups can rapidly accumulate large amounts of data.
Big data at CallFire
1. Big Data at CallFire
Vijesh Mehta (Co-Founder and CTO)
2. Agenda
• A little about CallFire
• CallFire’s technical challenges
• How CallFire deals with data
• Summary
3. Some background about myself
• I am one of the founders of CallFire.
– Started in 2005 in a small apartment
– Now 28 people
– Bootstrapped and profitable
• I’ve been writing software primarily in the Java space for 12 years. CallFire is all Java.
– We use: Wicket, Guice, Hibernate, MySQL, Cassandra, ActiveMQ, Xen, Puppet
4. About CallFire
• We are a cloud telephony provider.
– Outbound Phone calls
– Phone Numbers
– SMS through long and short codes
– IVR – Interactive Voice Response
– Power Dialing
• CallFire’s call volume can get large very quickly.
– Hurricane Sandy: 1.9 million emergency calls
• 4 engineers and 1 system admin managing operations and new features.
• We just hired 7 more engineers this year, and still hiring!
5. Technical Challenges by Numbers
• 1.4 billion calls and texts
– Growing exponentially
• Over 50,000 accounts
• Over 6 million campaigns
• 80 million sound files
• 14 TB in storage (NFS)
• MySQL: Over 10,000 qps at peak
Big data isn’t always a big-company problem!
6. Growing faster each day
[Chart: Campaigns over Time; vertical axis from 0 to 7,000,000]
7. The first challenge
• Problem: We outgrew our datacenter. New systems need access to central storage. Replication across a 1 Gb/s interconnect.
• Needed Solution:
– Must work across datacenters
– Must scale as demand increases
– Must be fault tolerant
– Must deal with over 80 million sound files
– The cheaper the better
8. Solutions Considered (2010)
| | NFS | Gluster | HDFS | Cassandra |
|---|---|---|---|---|
| Fault tolerant | Yes, if configured | Yes | Yes | Yes |
| Datacenter replication | Maybe. Rsync isn’t fun with lots of files. | Not at the time | Yes | Yes |
| Easy to add storage | No | Not at the time | Yes | Yes |
| No single point of failure | No | Yes | Not exactly, NameNode. | Yes |
| Data always accessible | No, hard to sort easily through file systems. | No, same as a file system | Yes | Yes |
| Notes | Not working for us. Too much management and downtime. | Looks good, tried it for a while. Easy at first because it was a file system. | Didn’t like the name node issue. May have been a good way to go. | Everything we need, quick to learn. We went all in! |

* Only LAN solutions considered. Calls had too much latency in the cloud, or even across datacenters.
9. Cassandra
• Storage isn’t the best use of Cassandra.
• Do not exceed 50% of drive space.
– Compaction needs the space. Hard lesson learned.
• Fault Tolerance: Replication factor of 3.
• Result: 1 TB of data = 6 TB of storage needed (3 replicas × 2 for the 50% free-space rule)!
• CallFire has a 74 TB Cassandra cluster
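The sizing rule on this slide can be written out as a small calculation. This is only a sketch of the arithmetic: the function and parameter names are made up for illustration, and it assumes the 74 TB figure refers to raw disk across the cluster, which the slide does not state.

```python
def raw_storage_tb(data_tb, replication_factor=3, max_disk_fill=0.5):
    """Raw disk needed to hold `data_tb` of unique data.

    replication_factor: copies kept of every row (3 on the slide).
    max_disk_fill: fraction of each drive allowed to fill up (0.5 on the
    slide, leaving the rest free for compaction).
    """
    if not 0 < max_disk_fill <= 1:
        raise ValueError("max_disk_fill must be in (0, 1]")
    return data_tb * replication_factor / max_disk_fill


def unique_data_capacity_tb(raw_tb, replication_factor=3, max_disk_fill=0.5):
    """Inverse of raw_storage_tb: unique data that fits on `raw_tb` of disk."""
    return raw_tb * max_disk_fill / replication_factor


print(raw_storage_tb(1))                        # 6.0
print(round(unique_data_capacity_tb(74), 1))    # 12.3
```

Under these assumptions, a 74 TB cluster would hold roughly 12 TB of unique data before crossing the 50% line.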
10. Extending the scope
• We like SQL and Hibernate.
– Pros: Easy, Flexible, Ad-Hoc Queries, Locks
– Cons: Scaling
• Solution: Sharding with Cassandra for universal data
[Diagram: Shard 1, Shard 2, Shard 3; Cassandra Cluster]
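One way to picture this layout is sketched below. It is not CallFire's code: the shard names, the choice of account id as the shard key, and the dict standing in for the Cassandra cluster are all assumptions made for illustration.

```python
import hashlib

# Hypothetical names for the three SQL shards shown on the slide.
SHARDS = ["shard1", "shard2", "shard3"]

# Stand-in for the shared Cassandra cluster: universal data keyed by username.
universal_store = {}


def shard_for(account_id):
    """Pick a shard deterministically from the account id (assumed shard key)."""
    digest = hashlib.sha256(str(account_id).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def register(username, account_id):
    """Record universal data once, outside any single shard."""
    universal_store[username] = {
        "account_id": account_id,
        "shard": shard_for(account_id),
    }


def locate(username):
    """Resolve a user through the universal store, then return their shard."""
    return universal_store[username]["shard"]


register("alice", 42)
print(locate("alice") == shard_for(42))  # True
```

Storing the shard name alongside the universal record means an account could later be moved to another shard by updating one entry, rather than relying on the hash alone.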
11. Sharding + Big Data
• Cassandra makes sharding easier
– Easy to store universal data. (Authentication)
– Performs very well
• Tungsten Replicator (Big Data with SQL)
– Sharding makes joins impossible, so fan your data into central places.
– NoSQL can’t handle ad-hoc queries. No worries, you can still have SQL.
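The fan-in idea can be illustrated with a toy example. This sketch does not use Tungsten Replicator or its API; it copies rows once from three in-memory SQLite databases (standing in for the shards) into a central database, whereas a real replicator would stream changes continuously. The table layout and sample rows are invented for the example.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database standing in for one SQL shard."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE calls (account_id INTEGER, duration_s INTEGER)")
    db.executemany("INSERT INTO calls VALUES (?, ?)", rows)
    return db


def fan_in(shards):
    """Copy every shard's rows into one central table, tagged by shard."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE calls (shard TEXT, account_id INTEGER, duration_s INTEGER)"
    )
    for name, db in shards.items():
        rows = db.execute("SELECT account_id, duration_s FROM calls").fetchall()
        central.executemany(
            "INSERT INTO calls VALUES (?, ?, ?)",
            [(name, account_id, duration) for account_id, duration in rows],
        )
    central.commit()
    return central


# Invented sample data: each account lives on exactly one shard.
shards = {
    "shard1": make_shard([(1, 30), (1, 45)]),
    "shard2": make_shard([(2, 10)]),
    "shard3": make_shard([(3, 60), (3, 5)]),
}
central = fan_in(shards)

# An ad-hoc query spanning all shards, which no single shard could answer.
totals = central.execute(
    "SELECT account_id, SUM(duration_s) FROM calls "
    "GROUP BY account_id ORDER BY account_id"
).fetchall()
print(totals)  # [(1, 75), (2, 10), (3, 65)]
```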
12. Big Data Summary
• Not just for big companies: data grows rapidly in today’s environment.
– Nice article about Obama’s Data Crunchers:
– http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/
• NoSQL systems have easier scaling and fault tolerance mechanisms.
– Not uncommon to see small teams with 10-20 node clusters.
• SQL is still a big part of the equation. (Tungsten)
– Fan in information across partitions
– Replicate across datacenters
– Keep your ad-hoc dreams alive!