Top 10 Perl Performance Tips

         Perrin Harkins
       We Also Walk Dogs
Devel::NYTProf
Ground Rules

● Make a repeatable test to measure progress with
   ○ Sometimes turns up surprises
● Use a profiler (Devel::NYTProf) to find where the time is
  going
   ○ Don't flail and waste time optimizing the wrong things!
● Try to weigh the cost of developer time vs buying more
  hardware
   ○ Optimization is crack for developers, hard to know when
     to stop
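
A repeatable measurement can be as small as a timing helper you rerun after each change. The sketch below uses the core Time::HiRes module; `time_it` and the join workload are hypothetical stand-ins, not something from the talk. The trailing comment shows the usual Devel::NYTProf invocation.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run $code $runs times and report wall-clock time.
sub time_it {
    my ($label, $code, $runs) = @_;
    my $start = [gettimeofday];
    $code->() for 1 .. $runs;
    my $elapsed = tv_interval($start);    # seconds elapsed since $start
    printf "%-12s %d runs in %.4fs\n", $label, $runs, $elapsed;
    return $elapsed;
}

# Stand-in workload; replace it with the code path you want to track.
my $elapsed = time_it('join', sub { my $s = join ',', 1 .. 1000 }, 1_000);

# To see where the time goes, profile the same script with Devel::NYTProf:
#   perl -d:NYTProf your_script.pl
#   nytprofhtml
```

Timings vary between runs and machines, so compare numbers only from the same box under similar load.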
1. The Big Picture

● The biggest gains usually come from changing your high-level
  approach
    ○ Is there a more efficient algorithm?
    ○ Can you restructure to reduce duplicated effort?
● Sometimes you just need to tune your SQL
● A boatload of RAM hides a multitude of sins
● The bottleneck is usually I/O
    ○ Files
    ○ Database
    ○ Network
    ○ Batch I/O often makes a huge difference
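
As an illustration of restructuring to reduce duplicated effort, the sketch below replaces a repeated list scan with a hash built once. The data and variable names are made up for the example.

```perl
use strict;
use warnings;

# Made-up data: a list of known names and a few lookups against it.
my @known   = map { "user$_" } 1 .. 1000;
my @queries = ('user7', 'nobody', 'user999');

# Slow: rescans all of @known for every query.
my @slow = grep { my $q = $_; grep { $_ eq $q } @known } @queries;

# Faster: pay for one pass to build the hash, then each lookup is cheap.
my %is_known = map { $_ => 1 } @known;
my @fast = grep { $is_known{$_} } @queries;

print "slow: @slow\n";
print "fast: @fast\n";
```

Both lists contain the same names; only the amount of repeated work differs.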
2. Use DBI Efficiently

● Can make a huge difference in tight loops with many small
  queries
● connect_cached() avoids connection overhead
    ○ Or use your favorite connection cache, but beware
      overuse of ping()
● prepare_cached() avoids object creation and server-side
  prepare overhead
● Use bind parameters to reuse SQL statements instead of
  creating new ones
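
A minimal sketch of the calls above, assuming DBD::SQLite is installed so that an in-memory database can stand in for a real server. The `users` table and the `add_user` helper are hypothetical.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; :memory: avoids touching disk.
my $dsn  = 'dbi:SQLite:dbname=:memory:';
my %attr = (RaiseError => 1, PrintError => 0);

my $dbh = DBI->connect_cached($dsn, '', '', \%attr);
$dbh->do('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

# Hypothetical helper that might be called from a tight loop.
sub add_user {
    my ($name) = @_;
    # Same arguments, so connect_cached() returns the existing handle.
    my $dbh = DBI->connect_cached($dsn, '', '', \%attr);
    # prepare_cached() reuses the statement handle; the ? placeholder
    # keeps the SQL text identical from call to call.
    my $sth = $dbh->prepare_cached('INSERT INTO users (name) VALUES (?)');
    $sth->execute($name);
    return;
}

add_user($_) for qw(alice bob carol);

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM users');
print "$count users\n";
```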
2. Use DBI Efficiently

● Use bind_columns() in a fetch() loop for most efficient retrieval.
    ○ Less copying is faster.
    ○ Alternatively, fetchrow_arrayref()
● One prepare() followed by many execute() calls is faster
  than calling do() each time
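
A sketch of the fetch loop with bound columns, again assuming DBD::SQLite and a hypothetical `items` table. DBI's method for binding all columns at once is bind_columns(); fetch() is an alias for fetchrow_arrayref().

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the table is made up for the example.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, PrintError => 0 });
$dbh->do('CREATE TABLE items (id INTEGER, label TEXT)');

# Prepare once, execute many times.
my $ins = $dbh->prepare('INSERT INTO items (id, label) VALUES (?, ?)');
$ins->execute($_, "item $_") for 1 .. 5;

my $sth = $dbh->prepare('SELECT id, label FROM items ORDER BY id');
$sth->execute;

# Bind output columns to variables; each fetch() refreshes them in place
# instead of building a new list per row.
my ($id, $label);
$sth->bind_columns(\$id, \$label);

my @seen;
while ($sth->fetch) {
    push @seen, "$id=$label";
}
print join(', ', @seen), "\n";
```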
2. Use DBI Efficiently

● Turn off AutoCommit for batch changes
   ○ Committing every thousand rows or so saves work for your
     database
● Use your database's bulk loader when possible
   ○ Writing rows to CSV and using MySQL's LOAD DATA
     INFILE crushes the fastest DBI code
   ○ 10X speedup is not unusual
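
The batching idea can be sketched as follows, assuming DBD::SQLite and a hypothetical `log` table. The batch size of 1000 follows the slide; the right number for a given database is something to measure.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. AutoCommit is off, so nothing is
# committed until we say so.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, PrintError => 0, AutoCommit => 0 });
$dbh->do('CREATE TABLE log (n INTEGER)');

my $sth        = $dbh->prepare('INSERT INTO log (n) VALUES (?)');
my $batch_size = 1000;
my $commits    = 0;

for my $n (1 .. 2500) {
    $sth->execute($n);
    # Commit once per batch instead of once per row.
    if ($n % $batch_size == 0) {
        $dbh->commit;
        $commits++;
    }
}
$dbh->commit;    # flush the final partial batch
$commits++;

my ($rows) = $dbh->selectrow_array('SELECT COUNT(*) FROM log');
$dbh->commit;
$dbh->disconnect;

print "$rows rows in $commits commits\n";
```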
2. Use DBI Efficiently

● Use ORMs Wisely
   ○ Consider using straight DBI for the most performance-
     sensitive sections
      ■ Removing a layer means fewer method calls and
        faster code
   ○ Write report queries by hand if they seem slow
      ■ Optimizer hints and choices about SQL variations are
        beyond the scope of ORMs but make a huge
        difference for this kind of query
3. Choose the Fastest Hash Storage

● memcached is not the fastest option for a local cache
   ○ BerkeleyDB (not DB_File!) and Cache::FastMmap are
     about twice as fast
● CHI abstracts the storage layer
   ○ Useful if you think network strategy may change later
3. Choose the Fastest Hash Storage

Cache                     Get time   Set time   Run time
CHI::Driver::Memory       0.03ms     0.05ms     0.35s
BerkeleyDB                0.05ms     0.17ms     0.57s
Cache::FastMmap           0.06ms     0.09ms     0.62s
CHI::Driver::File         0.10ms     0.26ms     1.11s
Cache::Memcached::Fast    0.12ms     0.15ms     1.23s
Memcached::libmemcached   0.14ms     0.16ms     1.40s
CHI::Driver::DBI SQLite   0.11ms     1.94ms     2.05s
Cache::Memcached          0.29ms     0.21ms     2.88s
CHI::Driver::DBI MySQL    0.45ms     0.33ms     4.41s
4. Generate Code and Compile to a Subroutine
 ● This is how most templating tools work.
 ● Remove the cost of things that won't change for a while
    ○ Skip re-parsing templates
    ○ Skip large groups of conditionals
    ○ Choose architecture-specific code

my %subs;
my $thing = 'World';   # interpolated once, when the code string is built
my $code  = qq{print "Hello $thing\n";};
$subs{'hello'} = eval "sub { $code }" or die $@;
$subs{'hello'}->();
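
To show the "skip large groups of conditionals" case, here is a sketch that checks a set of options once, while building the source, so the compiled subroutine contains only the steps that apply. The option names and the `$format` sub are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical options, decided once at startup.
my %opt    = (trim => 1, uppercase => 1, prefix => '> ');
my $prefix = $opt{prefix};    # visible to the generated code as a closure variable

# Test each option now, not on every call.
my @steps;
push @steps, '$s =~ s/^\s+|\s+$//g;' if $opt{trim};
push @steps, '$s = uc $s;'           if $opt{uppercase};
push @steps, '$s = $prefix . $s;'    if length $prefix;

my $source = 'sub { my $s = shift; ' . join(' ', @steps) . ' return $s; }';
my $format = eval $source or die "compile failed: $@";

print $format->("  hello world  "), "\n";    # prints "> HELLO WORLD"
```

Only trusted strings should ever reach a string eval; here every fragment is a literal from the program itself.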
5. Sling Text Efficiently

 ● Slurp files when possible.

my $text = do { local $/; <$fh> };

 ● Seems obvious, but I still see people doing this:
my @lines = <$fh>;
my $text = join('', @lines);
 ● Consider memory with huge files.
5. Sling Text Efficiently

 ● Use a "sliding window" to search very large files.
    ○ Too big to slurp, but line-by-line is slow.
    ○ Chunks of 8K or 16K are much faster, but require
      bookkeeping code.
    ○ http://www.perlmonks.org/?node_id=128925
 ● Use the cheapest string tests you can get away with.
    ○ index() beats a regex when you just want to know if a
      string contains another string
 ● Use a fast CSV parser
    ○ Text::CSV_XS is much faster than the regexes you
      copied from that web page.
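
A minimal sketch of the sliding-window idea, not the code from the PerlMonks node: read fixed-size chunks, search each with index(), and carry over just enough of the tail to catch a match that straddles two chunks. `find_in_file` is a hypothetical name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the byte offset of the first occurrence of $needle, or -1.
sub find_in_file {
    my ($path, $needle, $chunk_size) = @_;
    $chunk_size ||= 16 * 1024;
    my $overlap = length($needle) - 1;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $buffer = '';
    my $base   = 0;    # file offset of the first byte held in $buffer
    while (1) {
        # Append the next chunk after whatever tail we kept.
        my $got = read($fh, $buffer, $chunk_size, length $buffer);
        die "Read error on $path: $!" unless defined $got;
        last if $got == 0;

        my $pos = index($buffer, $needle);
        return $base + $pos if $pos >= 0;

        # Keep only the tail that could begin a match spanning two chunks.
        my $drop = length($buffer) - $overlap;
        if ($drop > 0) {
            substr($buffer, 0, $drop, '');
            $base += $drop;
        }
    }
    return -1;
}

# Usage: place the needle so that it straddles the first 8K boundary.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} ('x' x 8190) . 'NEEDLE' . ('y' x 100);
close $tmp_fh or die "close failed: $!";

my $offset = find_in_file($tmp_path, 'NEEDLE', 8 * 1024);
print "found at byte $offset\n";    # prints "found at byte 8190"
```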
6. Replace LWP With Something Faster
● LWP is amazing, but modules built on C libraries tend to be
  faster.
    ○ LWP::Curl
    ○ HTTP::Lite
    ○ Maybe HTTP::Async for parallel

             LWP                32.8/s
             HTTP::Async        64.5/s
             HTTP::Lite         200/s
             LWP::Curl          1000/s
7. Use a Fast Serializer

 ● Data::Dumper is great for debugging, but slow for
   serialization.
 ● JSON::XS is the new speed king, and is human-readable
   and cross-language.
 ● Storable handles more Perl data types (blessed objects, for
   example) and is second-best in speed.
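
A round-trip sketch of both approaches using only core modules. JSON::PP is swapped in here as a pure-Perl stand-in because it ships with Perl; it exports the same encode_json/decode_json functions as JSON::XS, so switching the `use` line is all it should take to get the XS speed. The data structure is made up.

```perl
use strict;
use warnings;
use JSON::PP qw(encode_json decode_json);   # stand-in; use JSON::XS for speed
use Storable qw(freeze thaw);

# Made-up structure to serialize.
my $data = { name => 'widget', tags => ['a', 'b'], count => 3 };

# JSON: text output, readable from other languages.
my $json      = encode_json($data);
my $from_json = decode_json($json);

# Storable: binary output, Perl-only.
my $frozen        = freeze($data);
my $from_storable = thaw($frozen);

print "JSON round trip:     $from_json->{name}, tags @{ $from_json->{tags} }\n";
print "Storable round trip: $from_storable->{name}, count $from_storable->{count}\n";
```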
7. Use a Fast Serializer

   YAML                84.7/s
   XML::Simple         800/s
   Data::Dumper        2143/s
   FreezeThaw          2635/s
   YAML::Syck          4307/s
   JSON::Syck          4654/s
   Storable            9774/s
   JSON::XS            41473/s
8. Avoid Startup Costs

● Use a daemon to run code persistently
   ○ Skip the costs of compiling
   ○ Cache data
   ○ Open connections ahead of time
● mod_perl, FastCGI, Plack, etc. for web
● PPerl for command-line
   ○ Or hit your web server with lwp-request (GET)
9. Sometimes You Have to Get Crazy

 ● Use the @_ array directly to avoid copying

sub add_to_sql {
    my $sqlbase = shift; # hashref
    my ($name, $value) = @_;
    if ($value) {
        push(@{ $sqlbase->{'names'} }, $name);
        push(@{ $sqlbase->{'values'} }, $value);
    }
    return $sqlbase;
}
9. Sometimes You Have to Get Crazy

sub add_to_sql {
    # takes 3 params: hashref, name, and value
    return if not $_[2];

    push(@{ $_[0]->{'names'} }, $_[1]);
    push(@{ $_[0]->{'values'} }, $_[2]);
}

    ● 40% faster than original
    ● More than 40% harder to read
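
Before accepting a trade like this, it is worth measuring on your own perl. The sketch below compares the two versions with the core Benchmark module; the sub names `add_copying` and `add_direct` are hypothetical renames so both can live in one file, and the percentage you see will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Version that copies its arguments into lexicals.
sub add_copying {
    my $sqlbase = shift;    # hashref
    my ($name, $value) = @_;
    if ($value) {
        push(@{ $sqlbase->{'names'} },  $name);
        push(@{ $sqlbase->{'values'} }, $value);
    }
    return $sqlbase;
}

# Version that works on @_ directly.
sub add_direct {
    return if not $_[2];
    push(@{ $_[0]->{'names'} },  $_[1]);
    push(@{ $_[0]->{'values'} }, $_[2]);
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    copying => sub {
        my $sql = { names => [], values => [] };
        add_copying($sql, 'color', 'red');
    },
    direct => sub {
        my $sql = { names => [], values => [] };
        add_direct($sql, 'color', 'red');
    },
});
```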
10. Consider Compiling Your Own Perl

● Compiling without threads can be good for a free 15% or so.
● No code changes needed!
● Has maintenance costs.
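
To find out whether the perl you already run was built with threads, the core Config module can tell you. This is a small sketch; `perl -V:usethreads` on the command line reports the same setting.

```perl
use strict;
use warnings;
use Config;

# $Config{usethreads} is set (to 'define') only on threaded builds.
my $threaded = $Config{usethreads} ? 1 : 0;

print "perl $Config{version} was built ",
      ($threaded ? 'with' : 'without'),
      " thread support\n";
```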
Resources

Tim Bunce's Advanced DBI slides:
http://www.slideshare.net/Tim.Bunce/dbi-advanced-tutorial-2007

Also see Tim's NYTProf slides:
http://www.slideshare.net/Tim.Bunce/develnytprof-v4-at-oscon-201007

man perlperf

Programming Perl appendix on performance
Thank you!

Slides will be available on the conference website
Avoid tie()

 ● Slower than method calls!
 ● PITA to debug too.
Use a Fast Sort

● For sorting on derived keys, consider a GRT sort.
   ○ Faster than Schwartzian Transform
   ○ Use Sort::Maker to build it.
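
For reference, a hand-written sketch of a GRT (Guttman-Rosler Transform) sort, written without Sort::Maker so the mechanism is visible: pack the derived key into a fixed-width prefix, sort with the default string comparison (no sort block), then strip the prefix. The word list and the choice of length as the derived key are made up.

```perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi apple);

# Sort by length, then alphabetically within the same length.
my @sorted =
    map  { substr($_, 4) }                  # 3. strip the 4-byte key prefix
    sort                                    # 2. plain string sort, no block
    map  { pack('N', length $_) . $_ }      # 1. big-endian length + original
    @words;

print "@sorted\n";    # prints "fig kiwi pear apple banana"
```

The big-endian pack format matters: it makes byte-wise string order agree with numeric order of the key.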
