High Performance with Java
malduarte@gmail.com
Foreword
In the beginning was the Tao. The Tao gave birth
to Space and Time. Therefore Space and Time
are Yin and Yang of programming.
Programmers that do not comprehend the Tao are
always running out of time and space for their
programs. Programmers that comprehend the
Tao always have enough time and space to
accomplish their goals.
How could it be otherwise?
From www.canonical.org/~kragen/tao-of-programming.htm
What is High Performance?
Lego Mindstorms:
• Hitachi H8 8-bit CPU, 16 MHz
• 32 KB RAM
2 x Sun SPARC Enterprise M5000:
• 6 quad-core 2.4 GHz SPARC VII CPUs, 6 MB L2 cache,
48 hardware threads, 32 GB RAM
Sources:
Sun Microsystems: www.sun.com/servers/midrange/m5000/
Wikipedia: en.wikipedia.org/wiki/Lego_Mindstorms
Aad van der Steen HPC Page - www.phys.uu.nl/~steen/web08/sparc.html
High Performance is all about
“Delivering solutions which meet
requirements within time and space
constraints using available resources
rationally”
The most important resource: brain time.
Hardware performance increases with time; brain
performance decreases with time.
Why Java?
• Mature technology
• Speedy and stable VMs (those who were
burned in the early days still loathe it,
though)
• Lots of high quality tools
• Lots of high quality available libraries
• Large ecosystem
• NOT the language itself 
GSM 101
Source: en.wikipedia.org/wiki/GSM
A small case study
• Goal: analyse 17 GB (gzipped) of
MSC Call Detail Records (CDRs, in mobile
operator lingo)
Snippet:
04|001|26806XXXXXXXXXX|3519XXXXXXXX|3519800049344611||||||
081105|002559|||00062|00|000-076|015-113||||MALM1
|0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|268061011305482|B
AL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305482||3e3212003
4df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|3519XXXXXXXX||25||||||0|01|
9XXXXXXXX|002559|081105|00062||2||5||||||||||||3|||||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||
Note: Sensitive information was hidden
A bit more info
• Approximately 170 GB uncompressed
• Exactly 359 014 695 CDRs
• Trivia: about 3 days' worth of GSM call
logs.
• Correlate CDRs with customer information
• Performance goal: running time must be
below one hour.
Performance Budget
• Network bandwidth and latency
• Disk bandwidth and latency
• Memory
• CPU
If you don’t take a temperature you
can’t find a fever
• Measure the progress as the system is
implemented
• Make *honest* measurements. Prove
yourself wrong.
• Avoid premature optimization – How can
you know? If you’re within your
performance budget don’t worry
(*) Fat Man’s Law X – “House of God”
Samuel Shem - http://en.wikipedia.org/wiki/The_House_of_God
"The journey of a thousand miles starts
with a single step." Lao Tse
• Line read performance
1811229 Line Sample
Sample timings:
real 0m13.872s
user 0m13.366s
sys 0m4.056s
ETA: ~45 minutes
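The ETA above is a straight-line extrapolation from the sample to the full data set. A minimal sketch of that arithmetic, assuming throughput stays constant over the whole run; the class and method names are illustrative and not part of the original deck, and the record count is the one quoted earlier.

```java
class EtaEstimate {
    /** Extrapolates a sample timing to the full data set, assuming constant throughput. */
    static double etaSeconds(long sampleLines, double sampleSeconds, long totalLines) {
        double linesPerSecond = sampleLines / sampleSeconds;
        return totalLines / linesPerSecond;
    }

    /** Formats a duration in seconds as "Hh Mm Ss". */
    static String format(double seconds) {
        long s = Math.round(seconds);
        return (s / 3600) + "h " + (s % 3600) / 60 + "m " + (s % 60) + "s";
    }

    public static void main(String[] args) {
        // 1 811 229 sample lines read in 13.872 s, 359 014 695 CDRs in total
        double eta = etaSeconds(1811229L, 13.872, 359014695L);
        System.out.println("ETA: " + format(eta));
    }
}
```

Re-running the same calculation after every change keeps the performance budget visible while the system grows.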
I/O Tips
• Use memory-mapped files (see the
FileChannel.map and MappedByteBuffer
APIs)
• Use buffered I/O - BufferedInputStream
• Optimal buffer size is a multiple of the OS page
size (8 KB on SPARC Solaris, 4 KB on most x86 systems)
• If the process is I/O bound and you have fast
CPUs, consider processing compressed
files
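The tips above can be sketched as two ways of counting lines: through a read-only memory mapping, and through buffered streams that decompress gzip on the fly. This is a rough illustration that assumes Java 7 or later (for java.nio.file); the class name, the 64 KB buffer size, the 256 MB mapping chunk and the writeSample helper are all choices made for the example, not values from the deck.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class IoTips {
    // Assumed buffer size: a multiple of the common 4 KB and 8 KB page sizes.
    static final int BUFFER_SIZE = 64 * 1024;

    /** Counts '\n' bytes through read-only memory mappings, one chunk at a time. */
    static long countLinesMapped(Path file) {
        final long chunk = 1L << 28; // 256 MB per mapping; one mapping cannot exceed 2 GB
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long lines = 0;
            for (long pos = 0; pos < size; pos += chunk) {
                MappedByteBuffer buf =
                        ch.map(FileChannel.MapMode.READ_ONLY, pos, Math.min(chunk, size - pos));
                while (buf.hasRemaining()) {
                    if (buf.get() == '\n') lines++;
                }
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts lines of a gzip file, trading CPU time for less disk I/O. */
    static long countLinesGzip(Path file) {
        try (InputStream raw = Files.newInputStream(file);
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     new GZIPInputStream(raw, BUFFER_SIZE), StandardCharsets.ISO_8859_1),
                     BUFFER_SIZE)) {
            long lines = 0;
            while (in.readLine() != null) lines++;
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes text to a temporary file, gzip-compressed if requested. */
    static Path writeSample(String text, boolean gzip) {
        try {
            Path p = Files.createTempFile("cdr-sample", gzip ? ".gz" : ".txt");
            p.toFile().deleteOnExit();
            try (OutputStream out = gzip
                    ? new GZIPOutputStream(Files.newOutputStream(p))
                    : Files.newOutputStream(p)) {
                out.write(text.getBytes(StandardCharsets.ISO_8859_1));
            }
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Which variant wins depends on the machine, so both are worth timing against the same sample before committing to one.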
One more step
• Extract date of call and customer phone
number
04|001|268061100021547|3519XXXXXXXX|3519800049344611||||||
081105|002559|||00062|00|000-076|015-113||||MALM1
|0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|2680610113
05482|BAL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305
482||3e32120034df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|351
9XXXXXXXX||25||||||0|01|9XXXXXXXX|002559|081105|00062||2||5|||||||
|||||3|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Censored numbers to protect the innocent 
Split lines by columns
// String.split takes a regular expression, so the pipe must be escaped
String fields[] = line.split("\\|");
Sample timings:
real 1m0.670s
user 1m1.252s
sys 0m6.015s
ETA: 3 hours, 18 minutes
~4.4x slower! Exceeded the performance budget
When in doubt, profile
~85% spent splitting fields!
Tune
String fields[] = split(line, '|', 3,10,11);
Sample timings:
real 0m13.450s
user 0m13.425s
sys 0m3.965s
ETA: 44 minutes and 35 seconds
14 extra lines of Java code and we’re back on track
Must get SIM card data
• SIM card Type (prepaid, postpaid, ...)
• ~ 15 million record table
• Database constantly under load
• 4000 queries/s (0.25 ms/q) spare capacity
Database Tips (JDBC)
– Reuse connections!
– Read only? setReadOnly(true)
– Always use PreparedStatements
– Always explicitly close ResultSets (GC
friendly)
– Turn off autocommit
– Use batched operations and transactions in
CRUD-type accesses
– Large ResultSets? Increase the fetch size!
rs.setFetchSize(XXX)
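A minimal sketch of how these tips might look together for the SIM card lookup. The table and column names (sim_cards, msisdn, card_type) and the fetch size of 1000 are assumptions made for illustration, since the deck does not show the schema; only standard java.sql calls are used, and the caller supplies the Connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class SimCardLookup {
    // Hypothetical table and column names; the real schema is not shown in the deck.
    static final String SQL = "SELECT card_type FROM sim_cards WHERE msisdn = ?";

    private final Connection connection;    // one connection, reused for every query
    private final PreparedStatement lookup; // prepared once, executed many times

    SimCardLookup(Connection connection) throws SQLException {
        this.connection = connection;
        connection.setReadOnly(true);    // hint to the driver: this session only reads
        connection.setAutoCommit(false); // no implicit commit after each statement
        this.lookup = connection.prepareStatement(SQL);
    }

    /** Returns the card type for a phone number, or null if it is unknown. */
    String cardType(String msisdn) throws SQLException {
        lookup.setString(1, msisdn);
        try (ResultSet rs = lookup.executeQuery()) { // closed explicitly, even on error
            return rs.next() ? rs.getString(1) : null;
        }
    }

    /** Scans a large result set; a larger fetch size means fewer round trips. */
    long countCards() throws SQLException {
        try (PreparedStatement all = connection.prepareStatement("SELECT msisdn FROM sim_cards")) {
            all.setFetchSize(1000); // assumed value; tune it against measurements
            try (ResultSet rs = all.executeQuery()) {
                long n = 0;
                while (rs.next()) n++;
                return n;
            }
        }
    }

    void close() throws SQLException {
        lookup.close();
        connection.close();
    }
}
```

Even with all of these in place, each CDR still costs one round trip to the database, which is what the next slide runs into.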
Ooops
• Too slow!
• Assuming an average rate of 4000 q/s:
ETA: ~ 1 day, 56 minutes
Alternatives
In-memory databases
• TimesTen
• solidDB
Embedded relational
• H2
• HSQLDB
• Derby
Other embedded
• Berkeley DB
• InfinityDB
Must keep a balance
Performance
vs.
Cost, complexity, learning curve (aka neuron time), maintenance
Remembering old times
• In C/C++ you could map structs to
memory
• The amount of information needed is 16
bytes per SIM card (phone number, start
date, end date, type of card – 4 * 4 bytes)
• ~343 MB if stored in a compact form (int[])
• Sort the data and wrap the array in a List
• Use Collections.binarySearch to do the
heavy lifting
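The idea in the bullets above can be sketched in a self-contained form: records packed four ints apiece into one int[], exposed as a read-only List, and searched with Collections.binarySearch. The class name, the field order (number, start, end, type) and the assumption that validity intervals for one number never overlap are choices made for this illustration; the deck's own ClientFile snippet appears near the end.

```java
import java.util.AbstractList;
import java.util.Collections;
import java.util.Comparator;
import java.util.RandomAccess;

class SimCardTable extends AbstractList<int[]> implements RandomAccess {
    static final int FIELDS = 4; // number, start, end, type: 4 ints = 16 bytes per record
    private final int[] data;    // records packed back to back, sorted by (number, start)

    SimCardTable(int[] sortedRecords) {
        this.data = sortedRecords;
    }

    @Override
    public int size() {
        return data.length / FIELDS;
    }

    @Override
    public int[] get(int index) {
        int o = index * FIELDS;
        return new int[] { data[o], data[o + 1], data[o + 2], data[o + 3] };
    }

    // Orders a stored record against a probe {number, time, time, unused}.
    // Returns 0 when the record belongs to the number and its interval covers the time.
    private static final Comparator<int[]> COVERS = (rec, probe) -> {
        if (rec[0] != probe[0]) return rec[0] < probe[0] ? -1 : 1;
        if (rec[2] < probe[1]) return -1; // record ended before the probe time
        if (rec[1] > probe[1]) return 1;  // record starts after the probe time
        return 0;
    };

    /** Returns the card type valid for this number at this time, or -1 if there is none. */
    int typeAt(int number, int time) {
        int idx = Collections.binarySearch(this, new int[] { number, time, time, 0 }, COVERS);
        return idx < 0 ? -1 : data[idx * FIELDS + 3];
    }
}
```

Because the class implements RandomAccess, binarySearch uses index-based probing, so a lookup touches roughly log2(n) records: about 24 for 15 million entries.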
Way faster!
• No extra libraries, 40 lines of simple Java
code
ETA: 1 hour, 30 minutes and 35
seconds
Above the budget 
Put those extra cores to work
• 6 quad-core 2.4 GHz SPARC VII CPUs, 6 MB L2
cache, 48 hardware threads,
32 GB RAM
• Split the data in work units
• Split the work units among the threads
• Collect the results when the threads finish
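The three steps above can be sketched with java.util.concurrent. This is only an illustration: the class name, the prefix-matching "work" and the way units are cut from an in-memory list are assumptions; the real job would hand each thread a file or a file range instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkUnits {
    /** Counts lines starting with prefix, splitting the list into `units` chunks run on `threads` workers. */
    static long countMatching(List<String> lines, String prefix, int threads, int units) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            int chunk = (lines.size() + units - 1) / units; // ceiling division
            for (int from = 0; from < lines.size(); from += chunk) {
                List<String> unit = lines.subList(from, Math.min(from + chunk, lines.size()));
                Callable<Long> task = () -> {
                    long local = 0; // private to this task: nothing shared, nothing locked
                    for (String line : unit) {
                        if (line.startsWith(prefix)) local++;
                    }
                    return local;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // blocks until that work unit has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task only reads its own slice and returns a private counter, so the only coordination point is the final collection loop.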
Concurrent tips
• Concurrent programming is really hard!
• But you’re not going to be able to avoid it
(per-core CPU speed has stalled, while the
number of cores keeps increasing)
• Don’t share R/W data among threads
• Locking will kill performance
• Be aware of memory architecture
java.sun.com/javase/6/docs/technotes/guides/concurrency/index.html
Mission Accomplished
• With 8 threads of the 48 possible
Real running time: 10 minutes,
23 seconds
Near linear scaling!
There’s no point in optimizing more. We’ve
just entered the Law of Diminishing returns
en.wikipedia.org/wiki/Diminishing_returns
What about Network I/O?
• 1 thread per client using blocking I/O does
not scale
• Use Nonblocking I/O
• VM implementors will (probably) use the
best API in the host OS (epoll in
Linux kernel 2.6, for example)
• NBIO is hard. Don’t reinvent the wheel.
See Apache Mina - mina.apache.org
• Scales to over 10,000 (10k) connections easily!
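To show what nonblocking I/O looks like at the java.nio level that a framework such as Apache MINA wraps, here is a minimal single-threaded echo server built on a Selector. It is a sketch under simplifying assumptions: it binds to a free loopback port, writes replies in a simple loop rather than registering for OP_WRITE, and includes a small blocking roundTrip client purely for demonstration. All class and method names are illustrative.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorEchoServer implements Runnable {
    private Selector selector;
    private ServerSocketChannel server;

    SelectorEchoServer() {
        try {
            selector = Selector.open();
            server = ServerSocketChannel.open();
            server.configureBlocking(false);
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            server.register(selector, SelectionKey.OP_ACCEPT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return server.socket().getLocalPort();
    }

    /** One thread serves every client: it only touches channels that are ready. */
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try {
            while (true) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buf);
                    }
                }
            }
        } catch (ClosedSelectorException e) {
            // stop() closed the selector: leave the loop
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void echo(SocketChannel client, ByteBuffer buf) {
        try {
            buf.clear();
            if (client.read(buf) < 0) { // peer closed the connection
                client.close();
                return;
            }
            buf.flip();
            while (buf.hasRemaining()) client.write(buf); // acceptable for tiny messages only
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    void stop() {
        try {
            selector.close();
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Blocking demo client: sends msg and reads back the same number of bytes. */
    static String roundTrip(int port, String msg) {
        try (Socket s = new Socket("127.0.0.1", port)) {
            s.setSoTimeout(5000);
            byte[] out = msg.getBytes(StandardCharsets.US_ASCII);
            s.getOutputStream().write(out);
            s.getOutputStream().flush();
            byte[] in = new byte[out.length];
            InputStream is = s.getInputStream();
            int off = 0;
            while (off < in.length) {
                int n = is.read(in, off, in.length - off);
                if (n < 0) break;
                off += n;
            }
            return new String(in, 0, off, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Partial writes, back-pressure, timeouts and error handling are all missing here, which is exactly why the slide recommends an existing framework over hand-rolled code.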
A few extra tips
• Know your VM
• Not all VMs are created equal
• Even without changing a line of code you
can improve things, if you know what
you’re doing
• If you’re using the Sun VM, try the Server
VM (default is Client VM)
• Plenty of options to fiddle with:
blogs.sun.com/watt/resource/jvm-options-list.html
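A first step towards knowing your VM is asking it what it is. The sketch below prints standard system properties and Runtime values; the class name is illustrative. On the Sun HotSpot VM the java.vm.name property includes "Client VM" or "Server VM", selected with the -client and -server launcher flags.

```java
class VmInfo {
    /** Summarises the running VM: name, version, processors and maximum heap. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + ", " + rt.availableProcessors() + " processors"
                + ", max heap " + rt.maxMemory() / (1024 * 1024) + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this line at startup makes it much easier to explain why the same code performs differently on two machines.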
What about designing and maintaining
complex systems?
• Implement a feature complete solution in
small scale
• Learn the performance characteristics.
Implement benchmarks.
• Change the architecture if needed
• How much does it cost? It’s all about
€€€€€ (licensing, hardware, human
resources, rack space, energy, cooling
requirements, maintenance,...)
Keep measuring after the system
goes live
“The only man I know who behaves sensibly
is my tailor; he takes my measurements
anew each time he sees me. The rest go
on with their old measurements and
expect me to fit them.”
George Bernard Shaw -
en.wikiquote.org/wiki/George_Bernard_Shaw
• Especially if you keep adding features
Code snippets – A (way) faster split
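// Returns only the requested columns of a separator-delimited line.
// `columns` holds zero-based indexes in ascending order (at least one);
// requested columns that lie past the end of the line are left null.
public static String[] split(String l, char sep, int... columns) {
    String[] fields = new String[columns.length];
    int start = 0, column = 0, end, i = 0;
    while ((end = l.indexOf(sep, start)) != -1) {
        if (column++ == columns[i]) {
            fields[i] = l.substring(start, end);
            if (++i == columns.length)
                return fields;   // stop scanning once the last requested column is found
        }
        start = end + 1;
    }
    // the last field of the line has no trailing separator
    if (column == columns[i])
        fields[i] = l.substring(start);
    return fields;
}

String fields[] = split(line, '|', 3,10,11);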
Static in-memory “database”: Poor
man’s solution (but as fast as it gets)
public class ClientFile implements List<CardInfo>, RandomAccess {
    static final int CLIENT_SIZE = 16;   // bytes per record: 4 ints of 4 bytes
    int[] clients;                       // all records, packed 4 ints each

    public ClientFile() throws FileNotFoundException, IOException {
        File f = new File("clientes.db");
        int client_count = (int) (f.length() / CLIENT_SIZE);
        clients = new int[client_count * 4];
        byte b[] = new byte[client_count * CLIENT_SIZE];
        // readFully keeps reading until b is full; a single read(b) may stop early
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        try {
            in.readFully(b);
        } finally {
            in.close();
        }
        for (int i = 0; i != client_count; ++i) {
            clients[i * 4] = toi(b, i * CLIENT_SIZE);
            clients[i * 4 + 1] = toi(b, i * CLIENT_SIZE + 4);
            clients[i * 4 + 2] = toi(b, i * CLIENT_SIZE + 8);
            clients[i * 4 + 3] = toi(b, i * CLIENT_SIZE + 12);
        }
    }

    // map 4 big-endian bytes to an int
    public int toi(byte[] b, int offset) {
        return ((0xFF & b[offset]) << 24) +
               ((0xFF & b[offset + 1]) << 16) +
               ((0xFF & b[offset + 2]) << 8) +
               (0xFF & b[offset + 3]);
    }
(…)
Static in-memory “database”:
(continued)
(…)
    // Builds a CardInfo view over the 4 ints of record `index`.
    public CardInfo get(int index) {
        return new CardInfo(clients[index * 4],
                            clients[index * 4 + 1],
                            clients[index * 4 + 2],
                            clients[index * 4 + 3]);
    }

    // Looks up the record for a phone number at the call's date and time.
    public CardInfo getCardInfo(String msisdn, String yymmdd, String hhmmss) {
        Calendar cal = Calendar.getInstance();
        // Calendar months are zero-based, hence the "- 1"
        cal.set(i(yymmdd, 0, 1) + 2000, i(yymmdd, 2, 3) - 1, i(yymmdd, 4, 5),
                i(hhmmss, 0, 1), i(hhmmss, 2, 3), i(hhmmss, 4, 5));
        int idx = Collections.binarySearch(this,
                new Key(i(msisdn),
                        (int) (cal.getTimeInMillis() / 1000)));
        if (idx < 0) {
            return null;   // no matching record
        }
        return get(idx);
    }
Questions?
• Answers: 1 €
• Answers that require thought: 5 €
• Correct answers: 20 €
• Dumb looks: for free!

Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

High Performance With Java

  • 1. High Performance with Java malduarte@gmail.com
  • 2. Foreword In the beginning was the Tao. The Tao gave birth to Space and Time. Therefore Space and Time are Yin and Yang of programming. Programmers that do not comprehend the Tao are always running out of time and space for their programs. Programmers that comprehend the Tao always have enough time and space to accomplish their goals. How could it be otherwise? From www.canonical.org/~kragen/tao-of-programming.htm
  • 3. What is High Performance? • Hitachi H8 8-bit CPU, 16 MHz • 32 KB RAM • 2 x Sun SPARC Enterprise M5000: 6 quad-core 2.4 GHz SPARC VII CPUs, 6 MB L2 cache, 48 hw threads, 32 GB RAM • Sources: Sun Microsystems: www.sun.com/servers/midrange/m5000/ Wikipedia: en.wikipedia.org/wiki/Lego_Mindstorms Aad van der Steen HPC Page - www.phys.uu.nl/~steen/web08/sparc.html
  • 4. High Performance is all about “Delivering solutions which meet requirements within time and space constraints using available resources rationally” The most important resource: brain time. HW increases performance with time, brain decreases performance with time.
  • 5. Why Java? • Mature technology • Speedy and stable VMs (those who were burned in the early days still loathe it, though) • Lots of high quality tools • Lots of high quality available libraries • Large ecosystem • NOT the language itself :)
  • 7. A small case study • Goal: Analyse 17 G (gzip’ed) worth of MSC Call Detail Records (CDRs in Mobile Operator Lingo) Snippet: 04|001|26806XXXXXXXXXX|3519XXXXXXXX|3519800049344611|||||| 081105|002559|||00062|00|000-076|015-113||||MALM1 |0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|268061011305482|B AL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305482||3e3212003 4df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|3519XXXXXXXX||25||||||0|01| 9XXXXXXXX|002559|081105|00062||2||5||||||||||||3||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||| Note: Sensitive information was hidden
  • 8. A bit more info • Approximately 170 GB uncompressed • Exactly 359 014 695 CDRs • Trivia: about 3 days worth of GSM call logs. • Correlate CDRs with customer information • Performance goal: running time must be below one hour.
  • 10. If you don’t take a temperature you can’t find a fever • Measure the progress as the system is implemented • Make *honest* measurements. Prove yourself wrong. • Avoid premature optimization – How can you know? If you’re within your performance budget, don’t worry (*) Fat Man’s Law X – “The House of God”, Samuel Shem - http://en.wikipedia.org/wiki/The_House_of_God
  • 11. "The journey of a thousand miles starts with a single step." Lao Tse • Line read performance on a 1 811 229 line sample • Sample timings: real 0m13.872s user 0m13.366s sys 0m4.056s ETA: ~45 minutes
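The ETA on this slide can be reproduced by scaling the sample's wall-clock time linearly to the full data set. The sketch below is only an illustration of that arithmetic: the class and method names are made up here, and it assumes throughput stays constant over the whole run.

```java
import java.time.Duration;

final class EtaEstimate {
    /** Linearly extrapolates a sample's wall-clock time to the full data set. */
    static Duration extrapolate(long totalRecords, long sampleRecords, double sampleSeconds) {
        double seconds = sampleSeconds * ((double) totalRecords / sampleRecords);
        return Duration.ofMillis(Math.round(seconds * 1000));
    }

    public static void main(String[] args) {
        // Figures from the slides: 359 014 695 CDRs in total,
        // 1 811 229 lines read in 13.872 s of wall-clock ("real") time.
        Duration eta = extrapolate(359_014_695L, 1_811_229L, 13.872);
        System.out.printf("ETA: ~%d minutes%n", eta.toMinutes()); // prints "ETA: ~45 minutes"
    }
}
```

The same function applied to later sample timings shows directly whether a change keeps the run inside the one-hour budget.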
  • 12. I/O Tips • Use memory mapped files (see the FileChannel.map and MappedByteBuffer APIs) • Use buffered I/O - BufferedInputStream • Optimal buffer size is a multiple of the OS page size (usually 8k) • If the process is I/O bound and you have fast CPUs, consider processing compressed files
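As a rough sketch of the buffered I/O and compressed-file tips, the class below counts the lines of a gzip'ed file without unpacking it to disk first. The class name, the 64 KB buffer size and the ISO-8859-1 charset are assumptions chosen for illustration; the slides do not say what the original program used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

final class GzipLineCounter {
    // Assumed buffer size: a multiple of the common 4 KB and 8 KB page sizes.
    private static final int BUFFER_SIZE = 64 * 1024;

    /** Decompresses the file on the fly and counts its lines. */
    static long countLines(Path gzFile) throws IOException {
        try (InputStream raw = Files.newInputStream(gzFile);
             GZIPInputStream gz = new GZIPInputStream(raw, BUFFER_SIZE);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(gz, StandardCharsets.ISO_8859_1), BUFFER_SIZE)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(Paths.get(args[0])));
    }
}
```

Reading the compressed file trades CPU time for less disk traffic, which only pays off when the disk, not the CPU, is the bottleneck.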
  • 13. One more step • Extract date of call and customer phone number 04|001|268061100021547|3519XXXXXXXX|3519800049344611|||||| 081105|002559|||00062|00|000-076|015-113||||MALM1 |0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|2680610113 05482|BAL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305 482||3e32120034df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|351 9XXXXXXXX||25||||||0|01|9XXXXXXXX|002559|081105|00062||2||5||||||| |||||3||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Censored numbers to protect the innocent :)
  • 14. Split lines by columns String fields[] = line.split("\\|"); Sample timings: real 1m0.670s user 1m1.252s sys 0m6.015s ETA: 3 hours, 18 minutes ~ 6 x slower!!! Exceeded the performance budget
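One detail worth keeping in mind when splitting on pipes: String.split takes a regular expression, and an unescaped | is the regex alternation operator, so it has to be escaped or quoted to match a literal pipe. The sketch below shows one way to do that with a precompiled pattern; the class name, the method name and the shortened record are made up for illustration.

```java
import java.util.regex.Pattern;

final class PipeSplitDemo {
    // Compiled once and reused; Pattern.quote turns "|" into a literal pipe.
    static final Pattern PIPE = Pattern.compile(Pattern.quote("|"));

    /** Splits a record on literal pipes, keeping trailing empty fields (limit -1). */
    static String[] splitFields(String line) {
        return PIPE.split(line, -1);
    }

    public static void main(String[] args) {
        String line = "04|001|268060000000000|351900000000||"; // made-up, shortened record
        // Unescaped "|" is a regex that matches the empty string, so this
        // breaks the line apart between characters instead of at the pipes.
        System.out.println(String.join(",", line.split("|")));
        // Literal pipe: six fields, the last two empty.
        System.out.println(splitFields(line).length); // prints 6
    }
}
```

Even the correct form still allocates a String for every field of every record, which is the cost a column-selective split avoids.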
  • 15. When in doubt, profile • ~85% of the time is spent splitting fields!
  • 16. Tune String fields[] = split(line, '|', 3,10,11); Sample timings: real 0m13.450s user 0m13.425s sys 0m3.965s ETA: 44 minutes and 35 seconds • 14 extra lines of Java code and we’re back on track
  • 17. Must get SIM card data • SIM card type (prepaid, postpaid, ...) • ~ 15 million record table • Database constantly under load • 4000 queries/s (0.25 ms/q) spare capacity
  • 18. Database Tips (JDBC) – Reuse connections! – Read-only? setReadOnly(true) – Always use PreparedStatements – Always explicitly close ResultSets (GC friendly) – Turn off autocommit – Use batched operations and transactions in CRUD type accesses – Large ResultSets? Increase fetch size! rs.setFetchSize(XXX)
  • 19. Ooops • Too slow! • Assuming an average rate of 4000 q/s and one lookup per CDR: 359 014 695 / 4000 ≈ 89 754 s • ETA: ~ 1 day, 56 minutes
  • 20. Alternatives • In-memory databases: TimesTen, solidDB • Embedded relational: H2, HSQLDB, Derby • Other embedded: Berkeley DB, InfinityDB
  • 21. Must keep a balance • Performance vs. cost, complexity, learning curve (aka neuron time), maintenance
  • 22. Remembering old times • In C/C++ you could map structs to memory • The amount of information needed is 16 bytes per SIM card (phone number, start date, end date, type of card – 4 * 4 bytes) • ~ 343 MB if stored in a compact form (int[]) • Sort the data and wrap the array in a List • Use Collections.binarySearch to do the heavy lifting
  • 23. Way faster! • No extra libraries, 40 lines of simple Java code ETA: 1 hour, 30 minutes and 35 seconds • Above the budget :(
  • 24. Put those extra cores to work • 6 quad-core 2.4 GHz SPARC VII CPUs, 6 MB L2 cache, 48 hw threads, 32 GB RAM • Split the data in work units • Split the work units among the threads • Collect the results when the threads finish
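The split / distribute / collect steps on this slide map naturally onto a fixed thread pool. The following is a minimal sketch under assumed names (ParallelCount, countMatching) with a toy task, counting records of a given type; the slides do not show how the original program divided its work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelCount {
    /** Counts, in parallel, the lines whose first field equals the given record type. */
    static long countMatching(List<List<String>> workUnits, String recordType, int threads)
            throws InterruptedException, ExecutionException {
        final String prefix = recordType + "|";
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (List<String> unit : workUnits) {
                // Each task only reads its own unit and keeps a private tally,
                // so no mutable state is shared and no locking is needed.
                Callable<Long> task = () -> {
                    long n = 0;
                    for (String line : unit) {
                        if (line.startsWith(prefix)) n++;
                    }
                    return n;
                };
                results.add(pool.submit(task));
            }
            // Collect the partial results once the tasks finish.
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Keeping per-task tallies and merging them at the end is one simple way to avoid sharing writable data between threads.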
  • 25. Concurrent tips • Concurrent programming is really hard! • But you’re not going to be able to avoid it (per-core CPU speed has stalled while the number of cores keeps increasing) • Don’t share R/W data among threads • Locking will kill performance • Be aware of memory architecture • java.sun.com/javase/6/docs/technotes/guides/concurrency/index.html
  • 26. Mission Accomplished • With 8 threads of the 48 possible • Real running time: 10 minutes, 23 seconds • Near linear scaling! • There’s no point in optimizing more. We’ve just entered the Law of Diminishing Returns en.wikipedia.org/wiki/Diminishing_returns
  • 27. What about Network I/O? • 1 thread per client using blocking I/O does not scale • Use non-blocking I/O • VM implementors will (probably) use the best API in the host OS (/dev/epoll in Linux Kernel 2.6 for example) • NBIO is hard. Don’t reinvent the wheel. See Apache MINA - mina.apache.org • Scales to over 10,000 connections easily!
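To give an idea of what frameworks such as Apache MINA wrap, here is a bare-bones echo server built directly on java.nio, where a single thread multiplexes every connection through one Selector. It is only a sketch: the class name is invented, error handling is minimal, and the write loop simply spins where a real server would register interest in OP_WRITE.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

final class NioEchoServer implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    NioEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(4096);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buffer);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or fatal I/O error: stop serving.
        }
    }

    /** Reads whatever is available from the client and writes it straight back. */
    private static void echo(SocketChannel client, ByteBuffer buffer) throws IOException {
        try {
            buffer.clear();
            if (client.read(buffer) < 0) { // peer closed the connection
                client.close();
                return;
            }
            buffer.flip();
            while (buffer.hasRemaining()) {
                client.write(buffer); // simplification: spins if the send buffer is full
            }
        } catch (IOException e) {
            client.close(); // drop the misbehaving client, keep the server alive
        }
    }

    void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

Even this toy version has to deal with partial reads, partial writes and per-client failures, which is the reason the slide recommends an existing framework.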
  • 28. A few extra tips • Know your VM • Not all VMs are created equal • Even without changing a line of code you can improve things, if you know what you’re doing • If you’re using the Sun VM try the Server VM (default is Client VM) • Plenty of options to fiddle with: blogs.sun.com/watt/resource/jvm-options-list.html
  • 29. What about designing and maintaining complex systems? • Implement a feature complete solution at small scale • Learn the performance characteristics. Implement benchmarks. • Change the architecture if needed • How much does it cost? It’s all about €€€€€ (licensing, hardware, human resources, rack space, energy, cooling requirements, maintenance,...)
  • 30. Keep measuring after the system goes live “The only man I know who behaves sensibly is my tailor; he takes my measurements anew each time he sees me. The rest go on with their old measurements and expect me to fit them.” George Bernard Shaw - en.wikiquote.org/wiki/George_Bernard_Shaw • Especially if you keep adding features
  • 31. Code snippets – A (way) faster split
        // Extracts only the requested columns (zero-based, in ascending order);
        // stops scanning the line as soon as the last requested column is found.
        public static String[] split(String l, char sep, int... columns) {
            String[] fields = new String[columns.length];
            int start = 0, column = 0, end, i = 0;
            while ((end = l.indexOf(sep, start)) != -1) {
                if (column++ == columns[i]) {
                    fields[i] = l.substring(start, end);
                    if (++i == columns.length)
                        return fields;
                }
                start = end + 1;
            }
            // last field of the line (no trailing separator)
            if (column == columns[i])
                fields[i] = l.substring(start);
            return fields;
        }

        String fields[] = split(line, '|', 3,10,11);
  • 32. Static in-memory “database”: Poor man’s solution (but as fast as it gets)
        public class ClientFile implements List<CardInfo>, RandomAccess {
            static final int CLIENT_SIZE = 16; // bytes per record (4 ints)
            int[] clients;

            public ClientFile() throws FileNotFoundException, IOException {
                File f = new File("clientes.db");
                int client_count = (int)f.length() / CLIENT_SIZE;
                clients = new int[client_count * 4];
                byte b[] = new byte[(int) f.length()];
                // readFully (java.io.DataInputStream): a single read() call
                // may return before the whole file has been read
                DataInputStream in = new DataInputStream(new FileInputStream(f));
                try {
                    in.readFully(b);
                } finally {
                    in.close();
                }
                for(int i = 0; i != client_count; ++i) {
                    clients[i * 4] = toi(b, i * CLIENT_SIZE);
                    clients[i * 4 + 1] = toi(b, i * CLIENT_SIZE + 4);
                    clients[i * 4 + 2] = toi(b, i * CLIENT_SIZE + 8);
                    clients[i * 4 + 3] = toi(b, i * CLIENT_SIZE + 12);
                }
            }

            // map 4 big-endian bytes to an integer
            public int toi(byte[] b, int offset) {
                return ((0xFF & b[offset]) << 24)
                     + ((0xFF & b[offset + 1]) << 16)
                     + ((0xFF & b[offset + 2]) << 8)
                     + (0xFF & b[offset + 3]);
            }
            (…)
  • 33. Static in-memory “database”: (continued)
            (…)
            public CardInfo get(int index) {
                return new CardInfo(clients[index * 4], clients[index * 4 + 1],
                                    clients[index * 4 + 2], clients[index * 4 + 3]);
            }

            public CardInfo getCardInfo(String msisdn, String yymmdd, String hhmmss) {
                Calendar cal = Calendar.getInstance();
                cal.set(i(yymmdd, 0, 1) + 2000, i(yymmdd, 2, 3) - 1, i(yymmdd, 4, 5),
                        i(hhmmss, 0, 1), i(hhmmss, 2, 3), i(hhmmss, 4, 5));
                int idx = Collections.binarySearch(this,
                        new Key(i(msisdn), (int)(cal.getTimeInMillis() / 1000)));
                if (idx < 0) {
                    return null;
                }
                return get(idx);
            }
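The two snippets above leave out the Key and CardInfo classes and the remaining List methods. As a self-contained approximation of the same idea, the sketch below wraps a sorted int[] in a read-only AbstractList and lets Collections.binarySearch find the record whose validity period contains a given time. The names CardTable and find, the Comparator (used here in place of the Key class) and the field layout are assumptions, not the author's code; it also assumes the validity periods of one phone number never overlap.

```java
import java.util.AbstractList;
import java.util.Collections;
import java.util.Comparator;
import java.util.RandomAccess;

/** One SIM card validity period; every field fits in an int, as on the slides. */
final class CardInfo {
    final int msisdn, start, end, type;

    CardInfo(int msisdn, int start, int end, int type) {
        this.msisdn = msisdn;
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

/** Read-only List view over a flat int[] with 4 ints per record, sorted by (msisdn, start). */
final class CardTable extends AbstractList<CardInfo> implements RandomAccess {
    private static final int INTS_PER_RECORD = 4;

    // Orders a stored record against a probe whose start == end == call time:
    // equal when the number matches and the time falls inside [start, end].
    private static final Comparator<CardInfo> BY_MSISDN_AND_TIME = (record, probe) -> {
        if (record.msisdn != probe.msisdn) return Integer.compare(record.msisdn, probe.msisdn);
        if (record.end < probe.start) return -1;
        if (record.start > probe.end) return 1;
        return 0;
    };

    private final int[] data;

    CardTable(int[] sortedData) {
        this.data = sortedData;
    }

    @Override
    public CardInfo get(int index) {
        int base = index * INTS_PER_RECORD;
        return new CardInfo(data[base], data[base + 1], data[base + 2], data[base + 3]);
    }

    @Override
    public int size() {
        return data.length / INTS_PER_RECORD;
    }

    /** Returns the card valid for this number at this time, or null if there is none. */
    CardInfo find(int msisdn, int epochSeconds) {
        CardInfo probe = new CardInfo(msisdn, epochSeconds, epochSeconds, 0);
        int idx = Collections.binarySearch(this, probe, BY_MSISDN_AND_TIME);
        return idx < 0 ? null : get(idx);
    }
}
```

Implementing RandomAccess matters here: it tells Collections.binarySearch to index into the list directly instead of walking it with an iterator.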
  • 34. Questions? • Answers: 1 € • Answers that require thought: 5 € • Correct answers: 20 € • Dumb looks: for free!