Working Effectively with Legacy Perl Code
Erik Rantapaa
Frozen Perl 2010
What is Legacy Code?

Uses old/obsolete perl and modules
Not well factored / organically grown
Original developers gone
No one fully understands it
Uses outdated practices
Hard to modify without breaking it
Long new developer ramp-up time
No documentation (or wrong documentation)
No tests

On the other hand...
  Doing useful work - generating revenue
What is Legacy Code?


Michael Feathers - "Code without tests"

Ward Cunningham - "Accumulated technical debt"

Stuart Halloway - "Legacy is the degree to which code:
                    - fails to capture intent
                    - fails to communicate intent
                    - captures irrelevant detail (ceremony)"
Example Legacy App

 e-commerce application
 over 10 years old
 "rewritten" at least a couple of times
 worked on by lots of different developers
 > 1000 .pm files
 > 150 database tables, >2000 columns
 3 templating systems, 100's of template files
 used perl 5.8.0 until recently
 lots of old versions of CPAN modules (some customized)
 didn't compile with -cw until recently
 rm -rf not an option
Overview

 Unit Testing from a Perl perspective
 Working with the Code Base
 Instrumentation
 Future Directions
The Case for Testing


"... With tests we can change our code quickly and verifiably.
Without them we really don't know if our code is getting better
or worse. ..."

  - Michael Feathers
Cost of Fixing Defects
The Feedback Cycle
Testing vs. Debugging

Debugging:
  manual set-up (set breakpoints, step code, etc.)
  temporarily modifies code (printf debugging)
  manual verification
  pay for it every time

Testing:
   no source code modification
   automated set-up and verification
   pay once when you create the test
   reap the rewards every time you run it
Your Application
A Unit Test
Unit Testing

   isolates a small piece of functionality
   eliminates dependencies on other components through
   mocking / substitution
   ideally runs quickly
   provides automated verification

Benefits:
   safety net during refactoring
   after refactoring becomes a regression test
   speeds up the Edit-Compile-Run-Debug cycle
Impediments to Unit Testing

Main impediment: the way the application is glued together.

use LWP;
...
sub notify_user {
  my ($user, $message) = @_;
  ...
  my $ua = LWP::UserAgent->new;
  ...
  $ua->request(...);
}
Strongly Coupled Concerns

sub emit_html {




}
Dependency Breaking Techniques
Adapt Parameter
Break Out Method Object
Definition Completion
Encapsulate Global References
Extract and Override Call
Extract and Override Factory Method
Extract and Override Getter
Extract Implementer
Extract Interface
Introduce Instance Delegator
Introduce Static Setter
Link Substitution
Parameterize Constructor
Parameterize Method
Primitivize Parameter
Pull Up Feature
Push Down Dependency
Replace Function with Function Pointer
Replace Global Reference with Getter
Subclass and Override Method
Supersede Instance Variable
Template Redefinition
Text Redefinition
...
Sprouting

   replace a block of code with a subroutine/method call (even
   for new code)

Benefits:

   simple and safe code transformation
   code block can be run independently
   permits the code to be redefined
Sprouting Example - DBI calls

Any DBI call is a good candidate for sprouting.

DBI call = an action on a business concept
Sprouting permits:
   testing of SQL syntax
   testing of query operation
   removal of interaction with the database
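
A minimal sketch of sprouting a DBI call, under stated assumptions: the class OrderReport, its methods, and the table and column names are all hypothetical. The query moves into its own method, and a test redefines that method so no database is needed.

```perl
use strict;
use warnings;

package OrderReport;

sub new { my ($class, %args) = @_; return bless { dbh => $args{dbh} }, $class }

# Sprouted method: the DBI call lives here and nowhere else.
# Table and column names are made up for this example.
sub count_open_orders {
    my ($self, $customer_id) = @_;
    my ($n) = $self->{dbh}->selectrow_array(
        'SELECT count(*) FROM orders WHERE customer_id = ? AND status = ?',
        undef, $customer_id, 'open',
    );
    return $n;
}

# Business logic now talks to a business concept, not to DBI.
sub summary {
    my ($self, $customer_id) = @_;
    my $n = $self->count_open_orders($customer_id);
    return $n ? "$n open order(s)" : 'no open orders';
}

package main;

# In a test, redefine the sprouted method; no database handle is required.
my $report = OrderReport->new(dbh => undef);
my ($with_orders, $without_orders);
{
    no warnings 'redefine';
    local *OrderReport::count_open_orders = sub { 3 };
    $with_orders = $report->summary(42);
    local *OrderReport::count_open_orders = sub { 0 };
    $without_orders = $report->summary(42);
}
print "$with_orders / $without_orders\n";
```

The sprouted method can also be run on its own against a test database to check the SQL itself.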
Dependency Injection

# use LWP;
...
sub notify_user {
  my ($user, $message, $ua ) = @_;
  ...
  # my $ua = LWP::UserAgent->new;
  ...
  $ua->request(...);
}

"Ask for - Don't create"
Preserving Signatures

use LWP;
...
sub notify_user {
  my ($user, $message, $ua ) = @_;
  ...
   $ua ||= LWP::UserAgent->new;
  ...
  $ua->request(...);
}
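
A sketch of what the injected $ua buys in a test. The body of notify_user here is an assumed stand-in for the code elided on the slides, and FakeUA is a hypothetical test double. LWP::UserAgent is only loaded when no user agent is passed in, so the example runs without touching the network.

```perl
use strict;
use warnings;

# Hypothetical test double: records requests instead of sending them.
package FakeUA;
sub new     { return bless { requests => [] }, shift }
sub request { my ($self, $req) = @_; push @{ $self->{requests} }, $req; return 1 }

package main;

# Assumed shape of notify_user; the real body is elided on the slides.
sub notify_user {
    my ($user, $message, $ua) = @_;
    # Existing two-argument callers still get a real user agent.
    $ua ||= do { require LWP::UserAgent; LWP::UserAgent->new };
    return $ua->request(join ': ', "notify $user", $message);
}

my $fake = FakeUA->new;
notify_user('alice', 'order shipped', $fake);
print "captured: $fake->{requests}[0]\n";
```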
Perl-Specific Mocking/Isolation Techniques
Replacing a Package

 alter @INC to load alternate code

 alter %INC to omit unneeded code
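
A sketch of the %INC approach, assuming a hypothetical module App::Mailer that the code under test pulls in with "use". Marking it as loaded keeps perl from searching @INC for the real file, and a stand-in package is defined in the test file itself.

```perl
use strict;
use warnings;

BEGIN {
    # Mark the module as already loaded, so a later use/require
    # will not search @INC for the real file.
    $INC{'App/Mailer.pm'} = __FILE__;
}

# Stand-in implementation living in the test file.
package App::Mailer;
our @SENT;
sub send_mail { my ($class, %args) = @_; push @SENT, \%args; return 1 }

package main;

use App::Mailer;    # satisfied by the %INC entry; nothing is read from disk

App::Mailer->send_mail(to => 'alice@example.com', subject => 'hi');
print scalar(@App::Mailer::SENT), " message(s) captured\n";

# The @INC variant puts a directory of replacement modules first, e.g.
#   use lib 't/mocklib';    # t/mocklib/App/Mailer.pm shadows the real one
```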
Replacing Subroutines

Monkey patch the symbol table:

  no warnings 'redefine';
  *Product::saveToDatabase = sub { $_[0]->{saved} = 1 };

  # install in a BEGIN block, before the code that calls time() is compiled
  BEGIN { *CORE::GLOBAL::time = sub { ... } }

Create accessors for package lexicals
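
A sketch that combines both patches, with assumptions labeled: the Product class body and the $FAKE_NOW variable are hypothetical. Using "local" on the glob restores the original sub when the block exits, and the time() override is installed in a BEGIN block so that it is in place before the calling code is compiled.

```perl
use strict;
use warnings;

our $FAKE_NOW;    # hypothetical switch: when defined, time() returns this value

BEGIN {
    # Must happen at compile time, before any code that calls time() is compiled.
    *CORE::GLOBAL::time = sub () { defined $FAKE_NOW ? $FAKE_NOW : CORE::time() };
}

package Product;
sub new            { return bless {}, shift }
sub saveToDatabase { die "would hit the real database\n" }
sub checkout {
    my ($self) = @_;
    $self->{ordered_at} = time;
    $self->saveToDatabase;
    return $self;
}

package main;

my $product;
{
    no warnings 'redefine';
    # local restores the original sub when this block exits.
    local *Product::saveToDatabase = sub { $_[0]->{saved} = 1 };
    local $FAKE_NOW = 1_000_000;
    $product = Product->new->checkout;
}
print "saved=$product->{saved} ordered_at=$product->{ordered_at}\n";
```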
Modifying Object Instances

Re-bless to a subclass:

package MockTemplateEngine;
our @ISA = qw(RealTemplateEngine);
sub process {
  ... new behavior ...
}
...
$template = RealTemplateEngine->new(...);
bless $template, 'MockTemplateEngine';
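
A runnable version of the same idea. The method bodies are assumptions made for illustration; the slide elides them. After re-blessing, the object keeps its data but method lookup starts at the mock subclass.

```perl
use strict;
use warnings;

package RealTemplateEngine;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub process { die "would read template files from disk\n" }
sub render  { my ($self, $name) = @_; return '<html>' . $self->process($name) . '</html>' }

package MockTemplateEngine;
our @ISA = ('RealTemplateEngine');
# Only the expensive method is replaced; render() is inherited unchanged.
sub process { my ($self, $name) = @_; return "[$name]" }

package main;

my $template = RealTemplateEngine->new(dir => '/templates');
bless $template, 'MockTemplateEngine';    # same data, new method lookup
my $html = $template->render('receipt');
print "$html\n";
```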
Use the Stack

Define behavior based on caller().

Use in conjunction with monkey patching subs.
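
A sketch of caller()-based behavior, using hypothetical packages Inventory, Cart and Report. The patched sub returns a canned answer only when called from the code under test and falls through to the original for every other caller.

```perl
use strict;
use warnings;

package Inventory;
sub stock_level { die "would query the database\n" }

package Cart;
sub can_add { my ($class, $sku) = @_; return Inventory::stock_level($sku) > 0 }

package Report;
sub low_stock { my ($class, $sku) = @_; return Inventory::stock_level($sku) < 5 }

package main;

my $real = \&Inventory::stock_level;
{
    no warnings 'redefine';
    *Inventory::stock_level = sub {
        # caller(1) is the frame one level up; element 3 names the sub that called us.
        my $calling_sub = (caller(1))[3] || '';
        return 10 if $calling_sub eq 'Cart::can_add';    # canned answer for the code under test
        goto &$real;                                     # every other caller gets the original
    };
}

my $can_add = Cart->can_add('sku-1');
print $can_add ? "can add\n" : "cannot add\n";
```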
Exploiting Perl's Dynamicism

  Use the dynamic capabilities of Perl to help you isolate code
  and create mock objects.
  Not the cleanest techniques, but they can greatly simplify
  getting dependency-laden code under test.
Manual Testing

 Not always bad
 Sometimes it's the best option:
    automated verification is difficult / impossible
    automated test is too costly to create
Manual Testing Example
Using a Wrapper Layer
Real-World Experience

 Writing the first test is very difficult
 It gets easier!
 Biggest win: automated verification
 Next biggest win: efficient tests
 Manual testing is ok (save what you did!)
More Real-World Experience

 Isolating code will help you understand the dependency
 structure of your application
 Let your tests guide refactoring
 Testable code ~ clean code ~ modular code
Working with the Code Base
Source Control

Get everything under source control:
   perl itself
   all CPAN modules
   shared libraries
   Apache, mod_perl, etc.
Reliable Builds

  Automate building and deployment
  Break dependencies on absolute paths, port numbers
  Enable multiple deployments to co-exist on the same
  machine
Logging

   make logs easy to access
   import logs into a database
   automate a daily synopsis

SELECT count(*) AS n, substr(message, 1, 50) AS "key"
FROM log_message
WHERE log_time BETWEEN ... AND ...
GROUP BY substr(message, 1, 50)
ORDER BY n DESC;
Wrappers

Create deployment-specific wrappers for perl and perldoc:

#!/bin/sh
PATH=...
PERL5LIB=...
LD_LIBRARY_PATH=...
ORACLE_HOME=...
...

/path/to/app/perl "$@"
Create Tools

$ compile-check Formatter.pm

Runs app-perl -cw -MApp::PaymentService::Email::Formatter

Full module name determined from current working directory.
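
One way such a tool could derive the module name, as a sketch only: the library root /srv/app/lib and the helper names are assumptions, and the "-e 1" argument is an addition here so that perl has a program to check. A real tool would first resolve the file argument against the current working directory, for example with File::Spec->rel2abs.

```perl
use strict;
use warnings;
use File::Spec;

# Turn a .pm path into a module name, relative to an assumed library root.
sub module_name {
    my ($lib_root, $pm_file) = @_;
    my $rel = File::Spec->abs2rel($pm_file, $lib_root);
    $rel =~ s/\.pm\z//;
    return join '::', File::Spec->splitdir($rel);
}

# Build the command line; "-e 1" is an assumption, not part of the original tool.
sub compile_check_command {
    my ($lib_root, $pm_file) = @_;
    return ('app-perl', '-cw', '-M' . module_name($lib_root, $pm_file), '-e', '1');
}

my @cmd = compile_check_command(
    '/srv/app/lib',
    '/srv/app/lib/App/PaymentService/Email/Formatter.pm',
);
print "@cmd\n";
```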
Automated Testing

 Continuous Integration
 Smoke Tests
 Regression Tests
Instrumentation
Devel::NYTProf

Pros:
   full instrumentation of your code
   timing statistics on a per-line basis
   call-graphs, heat map
   useful for

Cons:
  big performance hit (not suitable for production)
  you need to drive it

Also see Aspect::Library::NYTProf for targeted profiling
Devel::Leak::Object

   Original purpose: find leakage of objects due to cyclic
   references

Modifications:

   Record how (the stack) objects get created
   Count the number of each type of object created
   Determine where objects of a certain class are created
   Robust facility to instrument bless and DESTROY.
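
A sketch of instrumenting bless through CORE::GLOBAL::bless. This is not the modified module from the talk; the %CREATED and %WHERE hashes and the Widget class are hypothetical, and only creation counts and creation sites are recorded.

```perl
use strict;
use warnings;

our (%CREATED, %WHERE);    # per-class counts and creation sites

BEGIN {
    # Installed at compile time so that every bless compiled later goes through it.
    *CORE::GLOBAL::bless = sub {
        my ($ref, $class) = @_;
        my ($pkg, $file, $line) = caller;
        $class = $pkg unless defined $class;    # one-argument bless uses the caller's package
        $CREATED{$class}++;
        $WHERE{$class}{ join ':', $file, $line }++;
        return CORE::bless($ref, $class);
    };
}

package Widget;
sub new { return bless {}, shift }

package main;

Widget->new for 1 .. 3;
printf "%s: %d object(s)\n", $_, $CREATED{$_} for sort keys %CREATED;
```

Recording the full stack instead of a single frame would mean walking caller() in a loop inside the override.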
dtrace


Pros:
   instrument events across the entire system
   runs at nearly full speed - usable in production

Cons:
  only available for Mac OS and Solaris
  perl probes still under development
Future Directions

More static analysis
  manually added soft type assertions
  more help in editors/IDEs
  more refactoring tools
  better compilation diagnostics
  catch mistyped subroutine names
  catch errors in calling methods/subs
  catch other type errors
Future Directions

More dynamic instrumentation
  usable in production
  low performance hit
  mainly interested in:
      the call chain (stack)
      types of variables / subroutine parameters / return
      values
  dtrace? modified perl interpreter?
Future Directions

Automated Instrumentation
   build a database of the runtime behavior of your program
   collect design details not expressed in the source
   mine the database to infer internal rules:
       call graphs
       signature of subroutines
   use rules to aid static analysis
A Refactoring Problem

Want to change $self->{PRICE} to $self->getPrice:

package Product;

sub foo {
  my ($self, ...) = @_;
  ...
  ... $self->{PRICE} ...
  ...
}
A Refactoring Problem

Within package Product → easy!

package Product;

sub getPrice { ... }

sub foo {
  my ($self, ...) = @_;
  ...
  ... $self->getPrice ... # know type of $self
  ...
}
A Refactoring Problem

Can we change this?

package SomeOtherClass;

sub bar {
  ...
  ... $product->{PRICE} ...
}

Need to know if $product is from package Product (or a
subclass)
A Refactoring Problem

Look at context for help.

package SomeOtherClass;

sub bar {
  ...
  my $product = Product->new(...);
  ...
  ... $product->{PRICE} ...
}
A Refactoring Problem

Now what?

package SomeOtherClass;

sub bar {
  my ($self, $product, ...) = @_;
  ...
  ... $product->{PRICE} ...
}

Need to figure out where SomeOtherClass::bar() gets called.

Instrumentation DB → just look up the answer
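
A sketch of how such a lookup table might be collected at run time. The %CALL_LOG hash stands in for the instrumentation database, and instrument() and the Checkout package are hypothetical. Each call records the calling package and line together with the classes of the arguments.

```perl
use strict;
use warnings;

our %CALL_LOG;    # stand-in for the instrumentation database

# Wrap a named sub so every call records who called it and with what argument types.
sub instrument {
    my ($name) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $orig = \&{$name};
    *{$name} = sub {
        my ($pkg, undef, $line) = caller;
        my $types = join ',', map { ref($_) || 'plain scalar' } @_;
        $CALL_LOG{$name}{$types}{"$pkg line $line"}++;
        goto &$orig;    # hand over to the original with the same arguments
    };
    return;
}

package Product;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package SomeOtherClass;
sub new { return bless {}, shift }
sub bar { my ($self, $product) = @_; return $product->{PRICE} }

package Checkout;
sub total { return SomeOtherClass->new->bar(Product->new(PRICE => 42)) }

package main;

instrument('SomeOtherClass::bar');
my $price = Checkout::total();
print join("\n", sort keys %{ $CALL_LOG{'SomeOtherClass::bar'} }), "\n";
```

If every recorded second argument is a Product (or a subclass), the change from $product->{PRICE} to $product->getPrice is safe for the observed call sites.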
References

 "Working Effectively with Legacy Code" - Michael Feathers
 "Clean Code Talks" - googletechtalks channel on YouTube
 "Writing Testable Code" - Misko Hevery, http://misko.hevery.com/code_reviewers_guide
 "Clean Code: A Handbook of Agile Software
 Craftsmanship," Robert C. Martin, ed.
Thanks!

Feedback, suggestions, experiences?

                  erantapaa@gmail.com
