Assessing Market Risk
of an Investment Portfolio involving
15 billion
calculations
This presentation is the result of benchmarks run at Intel Innovation Labs in
Bangalore, India in 2013.
We would like to thank the technical staff at the labs for providing their
facilities and guidance.
Big Join in Hadoop
3 million positions x 5000 risk models
Each model consists of 2M products.
To achieve => 6 months of historical
data readily available while
calculating risk.
Current status => Only 5 days of
prior data are immediately available;
the rest is in archives.
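The 15 billion figure in the title can be checked from the join size above. This is a minimal sketch, assuming every position is evaluated against every risk model (a full cross join); the variable names are illustrative, and the machine count comes from the four-node cluster used in the benchmark.

```python
# Figures taken from the slides; variable names are illustrative only.
positions = 3_000_000   # portfolio positions
risk_models = 5_000     # risk models
machines = 4            # nodes in the benchmark cluster

# Assumption: each position is evaluated against each model (cross join).
total_calculations = positions * risk_models
models_per_machine = risk_models // machines

print(f"{total_calculations:,} calculations")       # 15,000,000,000
print(f"{models_per_machine} models per machine")   # 1250
```

The per-machine count matches the "1250 Models / Machine" setup reported for each run.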
Business Benefit
• Allows broader time-based
risk assessment
• Solution avoids costly architectures
such as in-memory JVM cache-based
computing.
© 2012 Bizosys Technologies Pvt Ltd.
The Cluster
Sl. No.  Description        Machine 1              Machine 2              Machine 3              Machine 4
1        Platform           S4600SDP               S4600SDP               S4600SDP               S4600SDP
2        Processor Details  Xeon E5-4650, 2.7 GHz  Xeon E5-4650, 2.7 GHz  Xeon E5-4650, 2.7 GHz  Xeon E5-4650, 2.7 GHz
                            20M L3 cache, 8 Core   20M L3 cache, 8 Core   20M L3 cache, 8 Core   20M L3 cache, 8 Core
4        Memory             16 x 8GB-PC3L-10600R   16 x 8GB-PC3L-10600R   16 x 8GB-PC3L-10600R   16 x 8GB-PC3L-10600R
5        Hard disk          300GB SAS              300GB SAS              300GB SAS              300GB SAS
6        SSD                250GB SSD              250GB SSD              250GB SSD              238.5GB SSD (4 x 60GB SSD in LVM)
7        OS Details         Redhat Enterprise      Redhat Enterprise      Redhat Enterprise      Redhat Enterprise
                            Linux 6.3 x64          Linux 6.3 x64          Linux 6.3 x64          Linux 6.3 x64
                            /boot = 1GB            /boot = 1GB            /boot = 1GB            /boot = 1GB
                            swap = 32GB            swap = 32GB            swap = 32GB            swap = 32GB
                            /root = 100GB          /root = 100GB          /root = 100GB          /root = 100GB
                            /data = 167GB          /data = 167GB          /data = 167GB          /data = 167GB
                            /ssd = 250GB           /ssd = 250GB           /ssd = 250GB           /ssd = 238.5GB
Infrastructure - Metals
1  Hadoop               Hadoop 1.2
2  Dataswft             Dataswft 0.94.4.41
3  JDK                  JDK 1.6.0_45
4  HDFS JDK Memory      4 GB
5  Dataswft JDK Memory  4 GB
Hadoop
Learning
First Run: 120 Sec (No-Cache), 98 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD /Machine.
1 Dataswft instance/machine and max 64 threads/instance
Results
120 Sec with OS Cache Disabled. 98 Sec with OS Cache
Enabled.
Observation
High I/O wait and Low CPU usage.
Software bottleneck with sequential I/O reads.
Action Taken
Code modified to parallelize I/O reads.
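The change from sequential to parallel reads can be illustrated with a small sketch. This is not the Dataswft code, which the slides do not show; it assumes plain local files as a stand-in for the model data, and the function names and file names are hypothetical. The thread limit of 64 mirrors the "max 64 threads/instance" setting above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file fully and return its bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def read_sequential(paths):
    """Baseline: read the files one after another."""
    return [read_file(path) for path in paths]


def read_parallel(paths, max_threads=64):
    """Read the files concurrently; results keep the order of `paths`."""
    # 64 mirrors the "max 64 threads/instance" setting from the slide.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(read_file, paths))


# Small demo with throwaway files standing in for model data (hypothetical).
with tempfile.TemporaryDirectory() as workdir:
    demo_paths = []
    for i in range(8):
        path = os.path.join(workdir, f"model_{i}.bin")
        with open(path, "wb") as handle:
            handle.write(bytes([i]) * 1024)
        demo_paths.append(path)
    same = read_parallel(demo_paths) == read_sequential(demo_paths)
    print("parallel and sequential reads match:", same)
```

Issuing many reads at once lets the storage device work on several requests while the CPU would otherwise sit in I/O wait, which is the symptom noted in the observation.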
Second Run: 115 Sec (No-Cache), 90 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD /Machine.
1 Dataswft instance/machine and max 64 threads/instance
Results
115 Sec with OS Cache Disabled. 90 Sec with OS Cache
Enabled.
Observation
Application log analysis revealed a DFSClient bottleneck.
Action Taken
Introduced 2 Dataswft instances per machine.
Third Run: 70 Sec (No-Cache), 34 Sec (Cache)
Setup
1250 Models / Machine with 1 SSD /Machine.
2 Dataswft instances/machine and max 32 threads/instance
Results
70 Sec with OS Cache Disabled. 33.8 Sec with OS Cache
Enabled.
Observation (No Cache)
Average CPU Usage 32%, max 43%, Avg interrupt 17245 and
avg context switch 6365 and avg I/O wait 9.16.
Action Taken
4 SSD drives in a single machine.
Fourth Run: 32.8 Sec (No-Cache), 30.3 Sec (Cache)
Setup
1250 Models / Machine with 2 instances/machine.
4 SSDs/Machine. Max 32 threads / instance
40ms Delay on parallel thread launch
Results
32.8 Sec with OS Cache Disabled. 30.3 Sec with OS Cache Enabled.
Observation (No Cache)
Average CPU Usage 75%, max 97%, Avg interrupt 48921 and
avg context switch 23376 and avg I/O wait 2.5.
Action Taken
More delay introduced to reduce contention.
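A delayed thread launch of the kind described in this setup might look like the sketch below. The slides do not say how Dataswft implements the delay, so this is only one plausible reading: each worker thread is started a fixed interval after the previous one. The function name `launch_staggered` is hypothetical; the 40 ms default comes from the setup above.

```python
import threading
import time


def launch_staggered(tasks, delay_s=0.040):
    """Start one thread per task, pausing delay_s between launches."""
    threads = []
    for index, task in enumerate(tasks):
        if index:
            # Spread out start-up so the threads do not all hit I/O at once.
            time.sleep(delay_s)
        thread = threading.Thread(target=task)
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()


# Demo: four trivial tasks that record the order in which they ran.
finished = []
launch_staggered([lambda i=i: finished.append(i) for i in range(4)])
print("tasks finished:", sorted(finished))
```

A longer delay trades a slower ramp-up for less contention at start-up.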
Fifth Run: 32.5 Sec (No-Cache), 32 Sec (Cache)
Setup
1250 Models / Machine with 2 instances/machine.
4 SSDs/Machine. Max 32 threads / instance
45ms Delay on parallel thread launch
Results
32.504 Sec with OS Cache Disabled. 32.060 Sec with OS
Cache Enabled.
Observation (No Cache)
Average CPU Usage 55%, max 82%, Avg interrupt 37564
and avg context switch 9419 and avg I/O wait 1.0.
Action Taken
None
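The overall improvement across the five runs can be summarized from the reported timings. The sketch below uses only the figures from the Results lines above; the dictionary layout and variable names are illustrative.

```python
# (no-cache seconds, cache seconds) as reported in the Results of each run
runs = {
    "First": (120.0, 98.0),
    "Second": (115.0, 90.0),
    "Third": (70.0, 33.8),
    "Fourth": (32.8, 30.3),
    "Fifth": (32.504, 32.060),
}

baseline_no_cache, baseline_cache = runs["First"]

# Speedup of each run relative to the first run, per cache setting.
speedups = {
    name: (baseline_no_cache / no_cache, baseline_cache / cache)
    for name, (no_cache, cache) in runs.items()
}

for name, (no_cache, cache) in runs.items():
    s_no_cache, s_cache = speedups[name]
    print(f"{name:<7} no-cache {no_cache:7.3f} s ({s_no_cache:.2f}x)   "
          f"cache {cache:7.3f} s ({s_cache:.2f}x)")
```

By the fifth run the no-cache time is roughly 3.7 times faster than the first run, and the gap between cached and uncached runs has nearly closed.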

Dataswft Intel benchmark 2013
