Assessing Market Risk of an Investment Portfolio involving 15 billion calculations
This presentation is the result of benchmarks run at Intel Innovation Labs in
Bangalore, India, in 2013.
We would like to thank all the technical staff at the labs for providing their
facilities and guidance.
Big Join in Hadoop
3 million positions x 5000 risk models
Each model consists of 2M products.
To achieve => 6 months of historical
data readily available while
calculating risk.
Current status => Only 5 days of
prior data is immediately available;
the rest is in archives.
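The "15 billion calculations" in the title appears to be the size of the join, positions times risk models. A minimal arithmetic sketch, assuming that reading and the four-machine cluster described later in the deck (the variable names are illustrative, not from the slides):

```python
# Figures copied from the slides; names are illustrative only.
POSITIONS = 3_000_000      # portfolio positions
RISK_MODELS = 5_000        # risk models joined against each position
MACHINES = 4               # cluster size given in the infrastructure table

# Size of the full join: every position evaluated under every model.
total_calculations = POSITIONS * RISK_MODELS

# Assumed even split of models across machines (matches the
# "1250 Models / Machine" figure in the run setups).
models_per_machine = RISK_MODELS // MACHINES
calculations_per_machine = POSITIONS * models_per_machine

print(f"total position-model pairs: {total_calculations:,}")
print(f"models per machine:         {models_per_machine:,}")
print(f"pairs per machine:          {calculations_per_machine:,}")
```

Under these assumptions each machine handles a quarter of the models against all 3 million positions.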
Business Benefit
• Allows broader time-based
risk assessment
• Solution avoids costly architectures
such as in-memory JVM-cache-based
computing.
© 2012 Bizosys Technologies Pvt Ltd.
The Cluster
Infrastructure - Metals

Sl. No.  Description        Machine 1                         Machine 2                         Machine 3                         Machine 4
1        Platform           S4600SDP                          S4600SDP                          S4600SDP                          S4600SDP
2        Processor details  Xeon E5-4650, 2.7 GHz             Xeon E5-4650, 2.7 GHz             Xeon E5-4650, 2.7 GHz             Xeon E5-4650, 2.7 GHz
                            20 MB L3 cache, 8 cores           20 MB L3 cache, 8 cores           20 MB L3 cache, 8 cores           20 MB L3 cache, 8 cores
4        Memory             16 x 8GB PC3L-10600R              16 x 8GB PC3L-10600R              16 x 8GB PC3L-10600R              16 x 8GB PC3L-10600R
5        Hard disk          300GB SAS                         300GB SAS                         300GB SAS                         300GB SAS
6                           250GB SSD                         250GB SSD                         250GB SSD                         238.5GB SSD (4 x 60GB SSD in LVM)
7        OS details         Red Hat Enterprise Linux 6.3 x64  Red Hat Enterprise Linux 6.3 x64  Red Hat Enterprise Linux 6.3 x64  Red Hat Enterprise Linux 6.3 x64
                            /boot = 1GB                       /boot = 1GB                       /boot = 1GB                       /boot = 1GB
                            swap = 32GB                       swap = 32GB                       swap = 32GB                       swap = 32GB
                            /root = 100GB                     /root = 100GB                     /root = 100GB                     /root = 100GB
                            /data = 167GB                     /data = 167GB                     /data = 167GB                     /data = 167GB
                            /ssd = 250GB                      /ssd = 250GB                      /ssd = 250GB                      /ssd = 238.5GB
Hadoop

Sl. No.  Component            Version / Setting
1        Hadoop               Hadoop 1.2
2        Dataswft             Dataswft 0.94.4.41
3        JDK                  JDK 1.6.0_45
4        HDFS JDK Memory      4 GB
5        Dataswft JDK Memory  4 GB
Learning
First Run: 120 Sec (No-Cache), 98 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD / Machine.
1 Dataswft instance/machine and max 64 threads/instance.
Results
120 Sec with OS Cache Disabled. 98 Sec with OS Cache
Enabled.
Observation
High I/O wait and low CPU usage.
Software bottleneck: I/O reads were sequential.
Action Taken
Code modified to parallelize I/O reads.
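The move from sequential to parallel reads could look roughly like the sketch below. It is written in Python for brevity and is not Dataswft's code: `read_file`, `read_all_parallel`, and the use of local files are illustrative assumptions, and only the cap of 64 threads comes from the setup above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file fully and return its bytes."""
    with open(path, "rb") as f:
        return f.read()


def read_all_parallel(paths, max_threads=64):
    """Read many files concurrently; results are returned in input order."""
    if not paths:
        return []
    # Never start more threads than there are files to read.
    workers = min(max_threads, len(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Blocking reads overlap across threads instead of queuing
        # one behind another.
        return list(pool.map(read_file, paths))


if __name__ == "__main__":
    # Hypothetical stand-in for model files: a few small temp files.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(8):
            path = os.path.join(tmp, f"model_{i}.bin")
            with open(path, "wb") as f:
                f.write(b"x" * 1024)
            paths.append(path)
        blobs = read_all_parallel(paths, max_threads=64)
        print(len(blobs), "files read,", sum(len(b) for b in blobs), "bytes")
```

A thread pool fits here because the threads spend most of their time waiting on the device, which matches the high I/O wait and low CPU usage observed in this run.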
Second Run: 115 Sec (No-Cache), 90 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD / Machine.
1 Dataswft instance/machine and max 64 threads/instance.
Results
115 Sec with OS Cache Disabled. 90 Sec with OS Cache
Enabled.
Observation
Application log analysis pointed to a DFSClient bottleneck.
Action Taken
Introduced 2 Dataswft instances per machine.
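One way to read this action: the per-machine thread budget stays at 64, but it is split across two processes so that the reads no longer funnel through a single client. The sketch below only illustrates the bookkeeping of such a split. The `partition` helper and the even division of models between instances are assumptions, not something the slides state.

```python
def partition(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


MODELS_PER_MACHINE = 1250      # from the run setup
INSTANCES_PER_MACHINE = 2      # action taken after the second run
THREADS_PER_INSTANCE = 32      # third-run setup

model_ids = list(range(MODELS_PER_MACHINE))   # hypothetical model identifiers
per_instance = partition(model_ids, INSTANCES_PER_MACHINE)
total_threads = INSTANCES_PER_MACHINE * THREADS_PER_INSTANCE

for n, chunk in enumerate(per_instance, start=1):
    print(f"instance {n}: {len(chunk)} models, up to {THREADS_PER_INSTANCE} threads")
print("threads per machine:", total_threads)
```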
Third Run: 70 Sec (No-Cache), 34 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD / Machine.
2 Dataswft instances/machine and max 32 threads/instance.
Results
70 Sec with OS Cache Disabled. 33.8 Sec with OS Cache
Enabled.
Observation (No Cache)
Average CPU usage 32%, max 43%; avg interrupts 17245,
avg context switches 6365, and avg I/O wait 9.16.
Action Taken
Used 4 SSD drives in a single machine.
Fourth Run: 32.8 Sec (No-Cache), 30.3 Sec (Cache)
Setup
1250 Models / Machine with 2 instances/machine.
4 SSDs/Machine. Max 32 threads/instance.
40 ms delay on parallel thread launch.
Results
32.8 Sec with OS Cache Disabled. 30.3 Sec with OS Cache Enabled.
Observation (No Cache)
Average CPU usage 75%, max 97%; avg interrupts 48921,
avg context switches 23376, and avg I/O wait 2.5.
Action Taken
Increased the launch delay to reduce contention.
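A delay on parallel thread launch can be sketched as below: each worker thread starts a fixed interval after the previous one so that they do not all hit the disks at the same instant. This is a Python illustration, not the benchmark's code. `launch_staggered` is a hypothetical helper, and applying the delay between every pair of consecutive launches is an assumption.

```python
import threading
import time


def launch_staggered(tasks, delay_s=0.040):
    """Run each task in its own thread, pausing delay_s between launches.

    Returns the task results in the same order as `tasks`.
    """
    results = [None] * len(tasks)

    def run(index, task):
        results[index] = task()

    threads = []
    for index, task in enumerate(tasks):
        thread = threading.Thread(target=run, args=(index, task))
        thread.start()
        threads.append(thread)
        # Pause before launching the next thread (no pause after the last).
        if index < len(tasks) - 1:
            time.sleep(delay_s)

    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Hypothetical workload: four trivial tasks launched 40 ms apart.
    squares = launch_staggered([lambda i=i: i * i for i in range(4)])
    print(squares)
```

Under that assumption, 32 threads launched 40 ms apart take about 1.24 s to ramp up fully, and about 1.4 s at 45 ms.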
Fifth Run: 32.5 Sec (No-Cache), 32 Sec (Cache)
Setup
1250 Models / Machine with 2 instances/machine.
4 SSDs/Machine. Max 32 threads/instance.
45 ms delay on parallel thread launch.
Results
32.504 Sec with OS Cache Disabled. 32.060 Sec with OS
Cache Enabled.
Observation (No Cache)
Average CPU usage 55%, max 82%; avg interrupts 37564,
avg context switches 9419, and avg I/O wait 1.0.
Action Taken
None
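The five runs can be compared with a small calculation. The timings below are copied from the Results lines of each run. The `speedup` helper and the table layout are illustrative additions.

```python
# (no-cache seconds, cache seconds) per run, copied from the Results lines.
RUNS = {
    "First": (120.0, 98.0),
    "Second": (115.0, 90.0),
    "Third": (70.0, 33.8),
    "Fourth": (32.8, 30.3),
    "Fifth": (32.504, 32.060),
}


def speedup(baseline, current):
    """How many times faster `current` is than `baseline`."""
    return baseline / current


base_no_cache, base_cache = RUNS["First"]

print(f"{'Run':<8}{'No-cache':>10}{'Cache':>10}{'Speedup (no-cache)':>20}{'Speedup (cache)':>17}")
for name, (no_cache, cache) in RUNS.items():
    print(
        f"{name:<8}{no_cache:>10.3f}{cache:>10.3f}"
        f"{speedup(base_no_cache, no_cache):>20.2f}"
        f"{speedup(base_cache, cache):>17.2f}"
    )

# Gap between cold and warm OS cache in the final configuration.
final_no_cache, final_cache = RUNS["Fifth"]
print(f"final cache benefit: {final_no_cache - final_cache:.3f} s")
```

From first to fifth run the no-cache time improves by roughly 3.7x and the cached time by roughly 3.1x, and the final configuration runs almost as fast with the OS cache disabled as with it enabled.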

Dataswft Intel benchmark 2013
