Perl Best
Practices
Randal L. Schwartz
merlyn@stonehenge.com
version 4.3 on 11 Jun 2017
This document is copyright 2006,2007,2008, 2017 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
Introduction
• This talk organized according to “Perl Best Practices”
• With my own twist along the way
• Perhaps I should call this “second Best Practices”?
Chapter 2. Code Layout
• There’s more than one way to indent it
• This isn’t Python
• Blessing and a curse
• You’re writing for two readers:
• The compiler (and it doesn’t care)
• Your maintenance programmer
• Best to ensure the maintenance programmer doesn’t
hunt you down with intent to do serious bodily harm
• You are the maintenance programmer three months later
Section 2.1. Bracketing
• Indent like K&R
• Paren pairs at end of opening line, and beginning of close:

my @items = (

12,

19,

);
• Similar for control flow:

if (condition) {

true branch;

} else {

false branch;

}
Section 2.2. Keywords
• Put a space between keyword and following punctuation:

if (...) { ... } else { ... }

while (...) { ... }
• Not “if(...)”
• It’s easier to pick out the keyword from the rest of the
code that way
• Keywords are important!
Section 2.3. Subroutines and Variables
• Keep subroutine names cuddled with args:

foo(3, 5) # not foo (3, 5)
• Keep variable indexing cuddled with variables:

$x[3] # not $x [3]
• If you didn’t know you could do that, forget that you can!
Section 2.4. Builtins
• Avoid unnecessary parens for built-ins
• Good:

print $x;
• Bad:

print($x);
• But don’t avoid the necessary parens:

warn("something is wrong"), next if $x > 3;
• Without the parens, that would have executed “next”
too early
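A minimal sketch of why those parens matter. The loop, the `@kept` array, and the warning handler are made up for illustration.

```perl
use strict;
use warnings;

my (@kept, @warnings);
local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # capture instead of printing

for my $x (1 .. 5) {
    # With parens, warn() takes only the string. "next" is a separate
    # expression after the comma, still guarded by the postfix "if".
    warn("something is wrong with $x\n"), next if $x > 3;
    push @kept, $x;
}

# Without the parens, "next" would be evaluated as part of warn's
# argument list, so the loop would move on before warn ever ran.
print "kept: @kept\n";
```

Here the items 4 and 5 are both warned about and skipped.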
Section 2.5. Keys and Indices
• Separate complex keys from enclosing brackets:

$x[ complex() * expression() ]
• But don’t use this for trivial keys:

$x[3]
• The extra whitespace helps the reader find the brackets
Section 2.6. Operators
• Add whitespace around binary operators if it helps to
see the operands:

print $one_value + $two_value[3];

print 3+4; # wouldn’t help much here
• Don’t add whitespace after a prefix unary operator:

$x = !$y; # no space after !
Section 2.7. Semicolons
• Put a semicolon after every statement
• Yeah, even the ones at the end of a block:
if ($condition) {
state1;
state2;
state3; # even here
}
• Very likely, this code will be edited
• Occasionally, a missing semicolon is not a syntax error
• Exception: blocks on a single line:
if ($foo) { this } else { that }
Section 2.8. Commas
• Include comma after every list element:

my @days = (

'Sunday',

'Monday',

);
• Easier to add new items
• Don’t need two different behaviors
• Easier to swap items
Section 2.9. Line Lengths
• Damian says “use 78 character lines”
• I’d say keep it even shorter
• Long lines are hard to read
• Long lines might wrap if sent in email
• Long lines are harder to copypasta for reuse
Section 2.10. Indentation
• Damian says “use 4-column indentation levels”
• I use Emacs cperl-mode, and use whatever it does
• Yeah, there’s probably some configuration to adjust it
• Better still, write your code where the indentation is
clear
Section 2.11. Tabs
• Indent with spaces, not tabs
• Tabs expand to different widths per user
• If you must use tabs, ensure all tabs are at the beginning
of line
• You can generally tell your text editor to insert spaces
• For indentation
• When you hit the tab key
• And you should do that!
Section 2.12. Blocks
• Damian says “Never place two statements on the same
line”
• Except when it’s handy
• Statements are a logical chunk
• Whitespace break helps people
• Also helps cut-n-paste
• Also keeps the lines from getting too long (see earlier)
Section 2.13. Chunking
• Code in commented paragraphs
• Whitespace between chunks is helpful
• Chunk according to logical steps
• Interpreting @_ separated from the next step in a
subroutine
• Add leading comments in front of the paragraph for a
logical-step description
• Don’t use comments for user docs, use POD
Section 2.14. Elses
• Damian says “Don’t cuddle an else”
• Cuddled:

} else {
• Uncuddled:

}

else {
• I disagree. I like cuddled elses
• They save a line
• I recognize “} else {“ as “switching from true to false”
• Think of it as the “batwing else”
Section 2.15. Vertical Alignment
• Align corresponding items vertically
• (Hard to show in a variable-pitch font)
• Something like:

$foo = $this + $that;

$bartholomew = $the + $other;
• Yeah, that’s pretty
• For me, it’d be more time fiddling in the editor rather
than coding
• Again, your choice is yours
Section 2.16. Breaking Long Lines
• Break long expressions before an operator:

print $offset

+ $rate * $time

+ $fudge_factor

;
• The leading operator is “out of place” enough
• Leads the reader to know this is a continuation
• Allows for a nice comment on each line too
• Lineup should be with the piece in the first line
• Again, this seems like a lot of work in an editor
• Emacs cperl-mode can line up on an open paren
Section 2.17. Non-Terminal Expressions
• If it simplifies things, name a subexpression
• Instead of:

my $result = $foo + (very complicated thing) + $bar;
• Use

my $meaningful_name = very complicated thing;

my $result = $foo + $meaningful_name + $bar;
• Don’t use a meaningless name
• This also helps in debugging
• Set a breakpoint after the intermediate computation
• Put a watchpoint on that value
• There’s no winner in the “eliminate variables” contest
Section 2.18. Breaking by Precedence
• Break the expression at lowest precedence
• To break up: $offset + $rate * $time
• Good:

$offset

+ $rate * $time
• Bad:

$offset + $rate

* $time
• Keep same level of precedence at same visual level
Section 2.19. Assignments
• Break long assignments before the operator
• To fix:

$new_offset = $offset + $rate * $time;
• Bad:

$new_offset = $offset

+ $rate * $time;
• Good:

$new_offset

= $offset

+ $rate * $time;
• Often I put the assignment on the previous line
• Damian would disagree with me on that
Section 2.20. Ternaries
• Use ?: in columns

my $result

= $test1 ? $test1_true

: $test2 ? $test2_true

: $test3 ? $test3_true

: $else_value;
• Of course, Damian also lines up the question marks
• Conclusion: Damian must have a lot of time on his hands
Section 2.21. Lists
• Parenthesize long lists
• Helps to show beginning and ending of related things
• Also tells Emacs how to group and indent it:

print (

$one,

$two,

$three,

);
• Nice setting: (cperl-indent-parens-as-block t)
Section 2.22. Automated Layout
• Use a consistent layout
• Pick a style, and stick with it
• Use perltidy to enforce it
• But don’t waste time reformatting everything
• Life’s too short to get into fights on this
Chapter 3. Naming Conventions
• Syntactic consistency
• Related things look similar
• Semantic consistency
• Names of things reflect their purpose
Section 3.1. Identiers
• Use grammar-based identifiers
• Packages: Noun::Adjective::Adjective
• Disk::DVD::Rewritable
• Variables: adjective_adjective_noun
• $estimated_net_worth
• Lookup vars: adjective_noun_preposition
• %blocks_of, @sales_for
• Subs: imperative_adjective_noun[_prep]
• get_next_cookie, eat_cake
• get_cookie_of_type, eat_cake_using
Section 3.2. Booleans
• Name booleans after their test
• Scalars:

$has_potential

$has_been_checked

$is_internal

$loading_is_nished
• Subroutines:

is_excessive

has_child_record
• Often starts with is or has
Section 3.3. Reference Variables
• Damian says “mark variables holding a ref to end in _ref”:

my $items_ref = \@items;
• I generally do this, but just use “ref”:

my $itemsref = \@items;
• Rationale: ordinary scalars and refs both go into scalar
vars
• Need some way of distinguishing typing information
• “use strict 'refs'” helps here too
Section 3.4. Arrays and Hashes
• Give arrays plural names:

@items

@records

@input_les
• Give hashes singular names:

%option

%highest_of

%total_for
• “Mapping” hashes should end in a connector
Section 3.5. Underscores
• Use underscores to separate names
• $total_paid instead of $totalpaid
• Helps to distinguish items
• What was “remembersthelens.com”?
• Not “remembers the lens” as I thought
• It’s “remember st helens”!
• Not CamelCase (except for some package names)
Section 3.6. Capitalization
• The more capitalized, the more global it is
• All lowercase for local variables:

$item @queries %result_of
• Mixed case for package/class names:

$File::Finder
• Uppercase for constants:

$MODE $HOURS_PER_DAY
• Exception: uppercase acronyms and proper names:

CGI::Prototype
Section 3.7. Abbreviations
• Abbreviate identifiers by prefix
• $desc rather than $dscn
• $len rather than $lngth
• But $ctrl rather than $con, since it’s familiar
• $msg instead of $mess
• $tty instead of $ter
Section 3.8. Ambiguous Abbreviations
• Don’t abbreviate beyond meaning
• Is $term_val
• TerminationValid
• TerminalValue
• Is $temp
• Temperature
• Temporary
• Context and comments are useful
• con & com r u
Section 3.9. Ambiguous Names
• Avoid ambiguous names
• last (final or previous)
• set (adjust or collection)
• left (direction, what remains)
• right (direction, correct, entitlement)
• no (negative, “number”)
• record (verb or noun)
• second (time or position)
• close (nearby or shut)
• use (active usage, or category of function)
Section 3.10. Utility Subroutines
• Use underscore prefix as “internal use only”
• Obviously, this won’t stop anyone determined to do it
• But it’s a clue that it’s maybe not a good idea
• External interface:

sub you_call_this {

... _internal_tweaking(@foo) ...;

}

sub _internal_tweaking { ... }
• Historical note: broken in Perl4
• Made the subroutine a member of ALL packages
Chapter 4. Values and Expressions
• How literals look
• How expressions operate
Section 4.1. String Delimiters
• Damian says “Use double quotes when you’re
interpolating”
• Example: “hello $person”
• But ‘hello world’ (note single quotes)
• In contrast, I always use double quotes
• I’m lazy
• I often edit code later to add something that needs it
• There’s no efficiency difference
• Perl compiles “foo” and ‘foo’ identically
Section 4.2. Empty Strings
• Use q{} for the empty string
• The rest of this page intentionally left blank
• And empty
Section 4.3. Single-Character Strings
• Likewise, use q{X} for the single character X.
• Again, I can disagree with this
• But it helps these stand out:

my $result = join(q{,},

‘this’,

‘that’,

);
• That first comma deserves a bit of special treatment.
Section 4.4. Escaped Characters
• Use named escapes instead of magical constants
• Bad: "\x7f\x06\x18Z"
• Good:

use charnames qw{:full};

"\N{DELETE}\N{ACKNOWLEDGE}\N{CANCEL}Z"
• For some meaning of “good”
• Damian claims that is “self documenting”
• Yeah, right
Section 4.5. Constants
• Create constants with the Readonly module
• Don’t use “constant”, even though it’s core
• use constant PI => 3;
• You can’t interpolate these easily:
• print "In Indiana, pi is @{[PI]}\n";
• Instead, get Readonly from the CPAN:
• Readonly my $PI => 3;
• Now you can interpolate:
• print "In Indiana, pi is $PI\n";
• Or consider Const::Fast or Attribute::Constant
• 20 to 30 times faster than Readonly
• neilb.org/reviews/constants.html for more
Section 4.6. Leading Zeros
• Don’t pad decimal numbers with leading 0’s
• They become octal
• 001, 002, 004, 008
• Only barfs on 008!
• Damian says “don’t use leading 0’s ever”
• Use oct() for octal values: oct(600) instead of 0600
• Yeah, right
• Could confuse some readers
• But only those who haven’t read the Llama
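A short sketch of the octal trap, with made-up variable names. A leading zero silently changes the base of a literal, while oct() states the intent and also handles strings read at runtime.

```perl
use strict;
use warnings;

my $padded   = 010;           # leading zero: octal, so this is decimal 8
my $mode     = 0600;          # octal literal, decimal 384
my $explicit = oct(600);      # same value, with the base spelled out
my $parsed   = oct('0755');   # oct() also converts strings, e.g. from a config file

printf "%d %d %d %d\n", $padded, $mode, $explicit, $parsed;
```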
Section 4.7. Long Numbers
• Use underscores in long numbers
• Instead of 123456, write 123_456
• What? You didn’t know?
• Yeah, underscores in a number are ignored
• 12_34_56 is also the same
• As is 1________23456
• Works in hex/oct too:

0xFF_FF_00_FF
Section 4.8. Multiline Strings
• Layout multiline strings over multiple lines
• Sure, you can do this:

my $message = "You idiot!

You should have enabled the frobulator!

";
• Use this instead:

my $message

= "You idiot!\n"

. "You should have enabled the frobulator!\n"

;
• This text can be autoindented more nicely
Section 4.9. Here Documents
• Use a here-doc for more than two lines
• Such as:

my $usage = <<"END_USAGE";

1) insert plug

2) turn on power

3) wait 60 seconds for warm up

4) pull plug if sparks appear during warm up

END_USAGE
Section 4.10. Heredoc Indentation
• Use a “theredoc” when a heredoc would have been
indented:

use Readonly;

Readonly my $USAGE = <<"END_USAGE";

1) insert plug

2) turn on power

END_USAGE
• Now you can use it in any indented code

if ($usage_error) {

print $USAGE;

}
Section 4.11. Heredoc Terminators
• Make heredocs end the same
• Suggestion: “END_” followed by the type of thing
• END_MESSAGE
• END_SQL
• END_OF_WORLD_AS_WE_KNOW_IT
• All caps to stand out
Section 4.12. Heredoc Quoters
• Always use single or double quotes around terminator
• Makes it clear what kind of interpolation is being used
• <<"END_FOO" gets double-quote interpolation
• <<'END_FOO' gets single-quote
• Default is actually double-quote
• Most people don’t know that
Section 4.13. Barewords
• Grrr
• Don’t use barewords
• You can’t use them under strict anyway
• Instead of: @months = (Jan, Feb, Mar);
• Use: @months = qw(Jan Feb Mar);
Section 4.14. Fat Commas
• Reserve => for pairs
• As in:

my %score = (

win => 30,

tie => 15,

loss => 10,

);
• Don’t use it to replace comma
• Bad: rename $this => $that
• What if $this is “FOO”: use constant FOO => 'File'
• rename FOO => $that would be a literal “FOO”
instead
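A small sketch of that quoting trap, using the constant FOO from above. The array names and the 'target' value are only for illustration.

```perl
use strict;
use warnings;
use constant FOO => 'File';

# The fat comma quotes a bareword on its left, so FOO is never called.
my @with_fat_comma = (FOO => 'target');   # ('FOO', 'target')

# A plain comma leaves FOO alone, so the constant's value is used.
my @with_comma     = (FOO,   'target');   # ('File', 'target')

print "$with_fat_comma[0] vs $with_comma[0]\n";
```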
Section 4.15. Thin Commas
• Don’t use commas in place of semicolons as sequence
• Bad:

($a = 3), print "$a\n";
• Good:

$a = 3; print "$a\n";
• If a sequence is desired where an expression must be...
• Use a do-block:

do { $a = 3; print "$a\n" };
Section 4.16. Low-Precedence Operators
• Don’t mix and/or/not with && || ! in the same expression
• Too much potential confusion:
not $nished || $result
• Not the same as !$finished || $result
• Reserve and/or/not for outer control flow:
print $foo if not $this;
unlink $foo or die;
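A quick sketch of how the two spellings parse differently. The values are chosen so the difference shows up.

```perl
use strict;
use warnings;

my ($finished, $result) = (0, 1);

# "not" binds looser than "||": this is not($finished || $result)
my $low  = (not $finished || $result);

# "!" binds tighter than "||": this is (!$finished) || $result
my $high = (!$finished || $result);

print "low is ", ($low ? "true" : "false"), ", high is ", ($high ? "true" : "false"), "\n";
```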
Section 4.17. Lists
• Parenthesize every raw list
• Bad: @x = 3, 5, 7; (broken!)
• Good: @x = (3, 5, 7);
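A sketch of what “broken” means here. Assignment binds tighter than the comma, so only the first value lands in the array. The array names are illustrative.

```perl
use strict;
use warnings;

my (@broken, @good);
{
    no warnings 'void';   # silence "Useless use of a constant in void context"
    @broken = 3, 5, 7;    # parsed as (@broken = 3), 5, 7
}
@good = (3, 5, 7);

print scalar(@broken), " element vs ", scalar(@good), " elements\n";
```

With warnings left on, Perl does flag the stray 5 and 7, which is one more reason to keep warnings enabled.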
Section 4.18. List Membership
• Consider any() from List::MoreUtils:

if (any { $requested_slot == $_ } @allocated_slots) { ... }
• Stops checking when a true value has been seen
• However, for “eq” test, use a predefined hash:

my @ACTIONS = qw(open close read write);

my %ACTIONS = map { $_ => 1 } @ACTIONS;

....

if ($ACTIONS{$rst_word}) { ... } # seen an action
Chapter 5. Variables
• A program without variables would be rather useless
• Names are important
Section 5.1. Lexical Variables
• Use lexicals, not package variables
• Lexical access is guaranteed to be in the same file
• Unless some code passes around a reference to it
• Package variables can be accessed anywhere
• Including well-intentioned but broken code
Section 5.2. Package Variables
• Don’t use package variables
• Except when required by law
• Example: @ISA
• Example: @EXPORT, @EXPORT_OK, etc
• Otherwise, use lexicals
Section 5.3. Localization
• Always localize a temporarily modified package var
• Example:

{

local $BYPASS_AUDIT = 1;

delete_any_traces(@items);

}
• Any exit from the block restores the global
• You can’t always predict all exits from the block
Section 5.4. Initialization
• Initialize any localized var
• Without that, you get undef!
• Example:

local $YAML::Indent = $YAML::Indent;

if ($local_indent) {

$YAML::Indent = $local_indent;

}
• Weird first assignment creates local-but-same value
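A self-contained sketch of the local-but-same idiom. `$Indent` is a hypothetical package variable standing in for `$YAML::Indent`, and both subs are invented for the example.

```perl
use strict;
use warnings;

our $Indent = 2;   # hypothetical stand-in for $YAML::Indent

sub current_indent { return $Indent }   # sees whatever value is in effect

sub dump_with {
    my ($local_indent) = @_;
    local $Indent = $Indent;            # localized, but starts with the old value
    $Indent = $local_indent if $local_indent;
    return current_indent();            # called code sees the temporary value
}

my @seen = (dump_with(8), dump_with(), current_indent());
print "@seen\n";
```

The override is visible only during the call that asked for it. A plain `local $Indent;` would have left undef in effect whenever no override was passed.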
Section 5.5. Punctuation Variables
• Damian says “use English”
• I say “no - use comments”
• How is $OUTPUT_FIELD_SEPARATOR better than $,
• given that you probably won’t use it anyway
• Use it, but comment it:

{

local $, = ","; # set output separator

...

}
Section 5.6. Localizing Punctuation
Variables
• Always localize the punctuation variables
• Example: slurping:

my $le = do { local $/; <HANDLE> };
• Important to restore value at end of block
• Items called within the block still see new value
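A runnable sketch of the slurp idiom, assuming an in-memory filehandle in place of a real file so the example stands alone.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";
open my $handle, '<', \$text or die "Cannot open in-memory file: $!";

# $/ is undef only inside the do-block, so the read returns everything.
my $file = do { local $/; <$handle> };
close $handle;

# Outside the block the record separator is a newline again.
my $separator_restored = defined $/ && $/ eq "\n";
print length($file), " characters slurped\n";
```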
Section 5.7. Match Variables
• Don’t “use English” on $& or $` or $'
• Always “use English qw(-no_match_vars)”
• Otherwise, your program slows down
• Consider Regexp::MatchContext
• L-value PREMATCH(), MATCH(), POSTMATCH()
Section 5.8. Dollar-Underscore
• Localize $_ if you’re using it in a subroutine
• Example:

sub foo {

...

local $_ = "something";

s/foo/bar/;

}
• Now the replacement doesn’t mess up the outer $_
Section 5.9. Array Indices
• Use negative indices where possible
• Negative values count from the end
• $some_array[-1] same as
$some_array[$#some_array]
• $foo[-2] same as $foo[$#foo-1]
• $foo[-3] same as $foo[$#foo-2]
• Sadly, can’t use in range
• @foo[1..$#foo] is all-but-first
• Cannot replace with @foo[1..-1]
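A brief sketch of negative indexing and the range limitation, with a made-up array.

```perl
use strict;
use warnings;

my @foo = qw(a b c d e);

my $last           = $foo[-1];            # same as $foo[$#foo]
my $second_to_last = $foo[-2];            # same as $foo[$#foo - 1]
my @all_but_first  = @foo[1 .. $#foo];    # the range needs the real last index
my @empty          = @foo[1 .. -1];       # 1 .. -1 is an empty range

print "$last $second_to_last (@all_but_first) ", scalar(@empty), "\n";
```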
Section 5.10. Slicing
• Use slices when retrieving and storing related hash/array
items
• Example:

@frames[-1, -2, -3]

= @active{'top', 'prev', 'backup'};
• Note that both array slice and hash slice use @ sigil
Section 5.11. Slice Layout
• Line up your slicing
• Example from previous slide:

@frames[ -1, -2, -3]

= @active{'top', 'prev', 'backup'};
• Yeah, far more typing than I have time for
• Damian must have lots of spare time
Section 5.12. Slice Factoring
• Match up keys and values for those last examples
• For example:

Readonly my %MAPPING = (

top => -1,

prev => -2,

backup => -3,

);
• Now use keys/values with that for the assignment:

@frames[values %MAPPING]

= @active{keys %MAPPING};
• Doesn’t matter that the result is unsorted. It Just Works.
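A sketch of why the unsorted order is harmless: keys and values walk an unmodified hash in the same order, so the pairs stay matched. A plain lexical hash is used here instead of Readonly to avoid the CPAN dependency, and the `%active` contents are invented.

```perl
use strict;
use warnings;

my %MAPPING = (top => -1, prev => -2, backup => -3);
my %active  = (top => 'main', prev => 'helper', backup => 'init');

my @frames = (undef) x 3;

# keys and values come back in matching order, whatever that order is
@frames[ values %MAPPING ] = @active{ keys %MAPPING };

print "@frames\n";
```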
Chapter 6. Control Structures
• More than one way to control it
• Some ways better than others
Section 6.1. If Blocks
• Use block if, not postfix if.
• Bad:

$sum += $measurement if defined $measurement;
• Good:

if (defined $measurement) {

$sum += $measurement;

}
Section 6.2. Postx Selectors
• So when can we use postfix if?
• For flow of control!
• last FOO if $some_condition;
• die "horribly" unless $things_worked_out;
• next, last, redo, return, goto, die, croak, throw
Section 6.3. Other Postfix Modifiers
• Don’t use postfix unless/for/while/until
• Says Damian
• If the controlled expression is simple enough, go ahead
• Good:

print "ho ho $_" for @some_list;
• Bad:

defined $_ and print "ho ho $_"

for @some_list;
• Replace that with:

for (@some_list) {

print "ho ho $_" if defined $_;

}
Section 6.4. Negative Control Statements
• Don’t use unless or until at all
• Replace unless with if-not
• Replace until with while-not
• Says Damian
• My exclusion: Don’t use “unless .. else”
• unless(not $foo and $bar) { ... } else { ... }
• Arrrgh!
• And remember your DeMorgan’s laws
• Replace not(this or that) with not this and not that
• Replace not(this and that) with not this or not that
• Often, that will simplify things
Section 6.5. C-Style Loops
• Avoid C-style for loops
• Says Damian
• I say they’re ok for this:
for (my $start = 0; $start <= 100; $start += 3) { ... }
Section 6.6. Unnecessary Subscripting
• Avoid subscripting arrays and hashes in a loop
• Compare:

for my $n (0..$#items) { print $items[$n] }
• With:

for my $item (@items) { print $item }
• And:

for my $key (keys %mapper) { print $mapper{$key} }
• With:

for my $value (values %mapper) { print $value }
• Can’t replace like this if you need the index or key
Section 6.7. Necessary Subscripting
• Use Data::Alias or Lexical::Alias
• Don’t subscript more than once in a loop:

for my $k (sort keys %names) {

if (is_bad($names{$k})) {

print "$names{$k} is bad"; ...
• Instead, alias the value:

for my $k (sort keys %names) {

alias my $name = $names{$k};
• The $name is read/write!
• Requires Data::Alias or Lexical::Alias
Section 6.8. Iterator Variables
• Always name your foreach loop variable
• Unless $_ is more natural
• for my $item (@items) { ... }
• Example of $_:
s/foo/bar/ for @items;
Section 6.9. Non-Lexical Loop Iterators
• Always use “my” with foreach
• Bad:

for $foo (@bar) { ... }
• Good:

for my $foo (@bar) { ... }
Section 6.10. List Generation
• Use map instead of foreach to transform lists
• Bad:

my @output;

for (@input) {

push @output, some_func($_);

}
• Good:

my @output = map some_func($_), @input;
• Concise, faster, easier to read at a glance
• Can also stack them easier
Section 6.11. List Selections
• Use grep and first to search a list
• Bad:

my @output;

for (@input) {

push @output, $_ if some_func($_);

}
• Good:

my @output = grep some_func($_), @input;
• But first:

use List::Util qw(first);

my $item = rst { $_ > 30 } @input;
Section 6.12. List Transformation
• Transform list in place with foreach, not map
• Bad:

@items = map { make_bigger($_) } @items;
• Good:

$_ = make_bigger($_) for @items;
• Not completely equivalent
• If make_bigger in a list context returns varying items
• But generally what you’re intending
Section 6.13. Complex Mappings
• If the map block is complex, make it a subroutine
• Keeps the code easier to read
• Names the code for easier recognition
• Permits testing the subroutine separately
• Allows reusing the same code for more than one map
Section 6.14. List Processing Side Effects
• Modifying $_ in map or grep alters the input!
• Don’t do that
• Instead, make a copy:

my @result = map {

my $copy = $_;

$copy =~ s/foo/bar/g;

$copy;

} @input;
• Note that the last expression evaluated is the result
• Don’t use return!
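A sketch showing the aliasing problem next to the copy-first fix. The sample data is invented; the last line uses the non-destructive /r flag, which needs Perl 5.14 or later.

```perl
use strict;
use warnings;

my @input = qw(foo food);

# Safe: work on a copy, and let the last expression be the result.
my @result = map {
    my $copy = $_;
    $copy =~ s/foo/bar/g;
    $copy;
} @input;

# Unsafe: $_ is an alias, so the substitution rewrites @clobbered itself.
my @clobbered = @input;
my @changed   = map { s/foo/bar/g; $_ } @clobbered;

# Perl 5.14+: /r returns the changed copy and leaves $_ alone.
my @via_r = map { s/foo/bar/gr } @input;

print "input: @input, clobbered: @clobbered\n";
```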
Section 6.15. Multipart Selections
• Avoid if-elsif-elsif-else if possible
• But what to replace it with?
• Many choices follow
Section 6.16. Value Switches
• Use table lookup instead of cascaded equality tests
• Rather than:

if ($n eq 'open') { $op = 'o' }

elsif ($n eq 'close') { $op = 'c' }

elsif ($n eq 'erase') { $op = 'e' }
• Use:

my %OPS = (open => 'o', close => 'c', erase => 'e');

...

if (my $op = $OPS{$n}) { ... } else { ... }
• Table should be initialized once
Section 6.17. Tabular Ternaries
• Sometimes, cascaded ?: are the best
• Test each condition, return appropriate value
• Example:
my $result
= condition1 ? result1
: condition2 ? result2
: condition3 ? result3
: fallbackvalue
;
Section 6.18. do-while Loops
• Don’t.
• Do-while loops do not respect last/next/redo
• If you need test-last, use a naked block:

{

...

...

redo if SOME CONDITION;

}
• last/next/redo works with this block nicely
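A minimal sketch of a test-last loop built from a naked block. The queue and the threshold of 8 are made up.

```perl
use strict;
use warnings;

my @queue = (3, 1, 4, 1, 5);
my $sum   = 0;

{
    my $n = shift @queue;   # the body always runs at least once
    $sum += $n;
    redo if @queue && $sum < 8;   # the test comes last, as in do-while
}

print "sum $sum, left: @queue\n";
```

Because this is an ordinary block, a `last` inside it would also exit cleanly, which is what do-while cannot offer.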
Section 6.19. Linear Coding
• Reject as many conditions as early as possible
• Isolate disqualifying conditions early in the loop
• Use “next” with separate conditions
• Example:
while (<....>) {
next unless /\S/; # skip blank lines
next if /^#/; # skip comments
chomp;
next if length > 20; # skip long lines
...
}
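The same loop as a runnable sketch, assuming an in-memory filehandle with invented input so it needs no external file.

```perl
use strict;
use warnings;

my $data = "\n# a comment\nfirst\n   \n"
         . "this line is much too long to keep around\nsecond\n";
open my $in, '<', \$data or die "Cannot open in-memory file: $!";

my @kept;
while (<$in>) {
    next unless /\S/;            # skip blank lines
    next if /^#/;                # skip comments
    chomp;
    next if length($_) > 20;     # skip long lines
    push @kept, $_;              # only the survivors reach the real work
}
close $in;

print "@kept\n";
```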
Section 6.20. Distributed Control
• Don’t throw everything into the loop control
• “Every loop should have a single exit” is bogus
• Exit early, exit often, as shown by previous item
Section 6.21. Redoing
• Use redo with foreach to process the same item
• Better than a while loop where the item count may or
may not be incremented
• No, I didn’t really understand the point of this, either
Section 6.22. Loop Labels
• Label loops used with last/next/redo
• last/next/redo provide a limited “goto”
• But it’s still sometimes confusing
• Always label your loops for these:
LINE: while (<>) { ... next LINE if $cond; ... }
• The labels should be the noun of the thing being
processed
• Makes “last LINE” read like English!
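A sketch of labels earning their keep in nested loops, where an unlabeled next would only affect the inner loop. The data and the STOP marker are invented.

```perl
use strict;
use warnings;

my @lines = ("alpha beta", "gamma STOP delta", "epsilon");
my @words;

LINE: for my $line (@lines) {
    WORD: for my $word (split ' ', $line) {
        next LINE if $word eq 'STOP';   # abandon the rest of this line
        push @words, $word;
    }
}

print "@words\n";
```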
Chapter 7. Documentation
• We don’t need any!
• Perl is self-documenting!
• Wrong
Section 7.1. Types of Documentation
• Be clear about the audience
• Is this for users? or maintainers?
• User docs should go into POD
• Maintainer docs can be in comments
• Might also be good to have those in separate PODs
Section 7.2. Boilerplates
• Create POD templates
• Understand the standard POD
• Enhance it with local standards
• Don’t keep retyping your name and email
• Consider Module::Starter
Section 7.3. Extended Boilerplates
• Add customized POD headers:
• Examples
• Frequently Asked Questions
• Common Usage Mistakes
• See Also
• Disclaimer of Warranty
• Acknowledgements
Section 7.4. Location
• Use POD for user docs
• Keep the user docs near the code
• Helps ensure the user docs are in sync
Section 7.5. Contiguity
• Keep the POD together in the source file
• Says Damian
• I like the style of “little code, little pod”
• Bit tricky though, since you have to get the right =cut:
=item FOO
text about FOO
=cut
sub FOO { ... }
Section 7.6. Position
• Place POD at the end of the file
• Says Damian
• Nice though, because you can add __END__ ahead
• This is true of the standard h2xs templates
• Hard to do if it’s intermingled though
• My preference:
• mainline code
• top of POD page
• subroutines intermingled with POD
• bottom of POD page
• “1;” __END__ and <DATA> (if needed)
Section 7.7. Technical Documentation
• Organize the tech docs
• Make each section clear as to audience and purpose
• Don’t make casual user accidentally read tech docs
• Scares them off from using your code, perhaps
Section 7.8. Comments
• Use block comments for major comments:

#########################

# Called with: $foo (int), $bar (arrayref)

# Returns: $item (in scalar context), @items (in list)
• Teach your editor a template for these
Section 7.9. Algorithmic Documentation
• Use full-line comments to explain the algorithm
• Prefix any “chunk” of code
• Don’t exceed a single line
• If longer, refactor the code to a subroutine
• Put the long explanation as a block comment
Section 7.10. Elucidating Documentation
• Use end-of-line comments to point out weird stuff
• Obviously, definition of “weird” might vary
• Example:
local $/ = "\0"; # NUL, not newline
chomp; # using modified $/ here
Section 7.11. Defensive Documentation
• Comment anything that has puzzled or tricked you
• Example:

@options = map +{ $_ => 1 }, @input; # hash not block!
• Think of the reader
• Think of yourself three months from now
• Or at 3am tomorrow
Section 7.12. Indicative Documentation
• If you keep adding comments for “tricky” stuff...
• ... maybe you should just change your style
• That last example can be rewritten:

@options = map { { $_ => 1 } } @input;
• Of course, some might consider nested curlies odd
Section 7.13. Discursive Documentation
• Use “invisible POD” for longer technical notes
• Start with =for KEYWORD
• Include paragraph with no embedded blank lines
• End with blank line and =cut
• As in:
=for Clarity:
Every time we reboot, this code gets called.
Even if it wasn’t our fault. I mean really!
Yeah, that’s our story and we’re sticking to it.
=cut
Section 7.14. Proofreading
• Check spelling, syntax, sanity
• Include both user and internal docs
• Better yet, get someone else to help you
• Hard to tell your own mistakes with docs
• Could make this part of code/peer review
Chapter 8. Built-in Functions
• Use them!
• But use them wisely
Section 8.1. Sorting
• Don’t recompute sort keys inside sort
• Sorting often compares the same item against many
others
• Recomputing means wasted work
• Consider a caching solution
• Simple cache
• Schwartzian Transform
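A minimal sketch of the Schwartzian Transform. `expensive_key` is a hypothetical stand-in for a costly key computation, and the call counter exists only to show that each key is computed once.

```perl
use strict;
use warnings;

my $key_calls = 0;
sub expensive_key {              # hypothetical: imagine a stat() or a parse here
    my ($string) = @_;
    $key_calls++;
    return length $string;
}

my @words = qw(pear fig banana kiwi apple);

my @sorted =
    map  { $_->[1] }                                       # 3. unwrap the item
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }    # 2. compare cached keys
    map  { [ expensive_key($_), $_ ] }                     # 1. pair key with item
    @words;

print "@sorted ($key_calls key computations)\n";
```

Calling expensive_key inside the sort block instead would recompute it on every comparison.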
Section 8.2. Reversing Lists
• Use reverse to reverse a list
• Don’t sort { $b cmp $a } - reverse sort!
• Use reverse to make descending ranges:
my @countdown = reverse 0..10;
Section 8.3. Reversing Scalars
• Use scalar reverse to reverse a string
• Add the “scalar” keyword to make it clear
• Beware of:

foo(reverse $bar)
• This is likely list context
• Reversing a single item list is a no-op!
• Suggest: foo(scalar reverse $bar)
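A sketch of the context trap. `show` is a made-up sub that simply joins its arguments.

```perl
use strict;
use warnings;

sub show { return join '|', @_ }

my $bar = 'stressed';

my $list_context   = show(reverse $bar);          # reverses a one-item list: no change
my $scalar_context = show(scalar reverse $bar);   # reverses the characters

print "$list_context / $scalar_context\n";
```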
Section 8.4. Fixed-Width Data
• Use unpack to extract fixed-width fields
• Example:

my ($x, $y, $z)

= unpack '@0 A6 @8 A10 @20 A8', $input;
• The @ ensures absolute position
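A runnable sketch of that template. The record is invented and built with sprintf so the three fields start at columns 0, 8, and 20.

```perl
use strict;
use warnings;

# Build a fixed-width record: 6 chars, 2 filler, 10 chars, 2 filler, 8 chars.
my $input = sprintf '%-6s  %-10s  %-8s', 'ID1234', 'Flintstone', 'Bedrock';

# @N jumps to absolute offset N; A<width> reads a space-padded field
# and strips the trailing padding.
my ($x, $y, $z) = unpack '@0 A6 @8 A10 @20 A8', $input;

print "[$x] [$y] [$z]\n";
```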
Section 8.5. Separated Data
• Use split() for simple variable-width fields
• First arg of string-space for awk-like behavior:

my @columns = split ' ', $input;
• Leading whitespace is ignored
• Multiple whitespace is a single delimiter
• Trailing whitespace is ignored
• Otherwise, leading delimiters are significant:

my @cols = split /\s+/, " foo bar";
• $cols[0] = empty string
• $cols[1] = “foo”
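A side-by-side sketch of the two split forms on the same invented input.

```perl
use strict;
use warnings;

my $input = "  foo  bar ";

my @awk_style   = split ' ', $input;     # leading whitespace skipped
my @regex_style = split /\s+/, $input;   # leading delimiter yields an empty first field

print scalar(@awk_style), " fields vs ", scalar(@regex_style), " fields\n";
```

Both forms drop trailing empty fields; only the string-space form also drops the leading one.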
Section 8.6. Variable-Width Data
• Text::CSV_XS is your friend
• Handles quoting of delimiters and whitespace
Section 8.7. String Evaluations
• Avoid string eval
• Harder to debug
• Slower (firing up compiler at runtime)
• Might create security holes
• Generally unnecessary
• Runtime data should not affect program space
Section 8.8. Automating Sorts
• Sort::Maker (CPAN) is your friend
• Can build
• Schwartzian Transform
• Or-cache (Orcish)
• Guttman-Rosler Transform
Section 8.9. Substrings
• Use 4-arg substr() instead of lvalue substr()
• Old way:

substr($x, 10, 3) = "Hello"; # replace 10..12 with "Hello"
• New way (for some ancient meaning of “new”):

substr($x, 10, 3, "Hello"); # replace, and return old
• Slightly faster
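A sketch of the four-argument form on a made-up string, showing both the in-place replacement and the returned old text.

```perl
use strict;
use warnings;

my $x = "0123456789ABCDEF";

# Replace the 3 characters at offset 10 and get back what was there.
my $old = substr($x, 10, 3, "Hello");

print "$old -> $x\n";
```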
Section 8.10. Hash Values
• Remember that values is an lvalue
• Double all values:

$_ *= 2 for values %somehash;
• Same as:

for (keys %somehash) {

$somehash{$_} *= 2;

}
• But with less typing!
Section 8.11. Globbing
• Use glob(), not <...>
• Save <...> for filehandle reading
• Works the same way:

<* .*> same as glob '* .*'
• Don’t use multiple arguments
Section 8.12. Sleeping
• Avoid raw select for sub-second sleeps
• use Time::HiRes qw(sleep);

sleep 1.5;
• Says Damian
• I think “select undef, undef, undef, 1.5” is just fine
• Far more portable
• If you’re scared of select, hide it behind a subroutine
Section 8.13. Mapping and Grepping
• Always use block-form of map/grep
• Never use the expression form
• Too easy to get the “expression” messed up
• Works:

map { join ":", @$_ } @some_values
• Doesn’t work:

map join ":", @$_, @some_values;
• Block form has no trailing comma after the block
• Might need disambiguating semicolon:

map {; ... other stuff here } @input
Section 8.14. Utilities
• Use the semi-builtins in Scalar::Util and List::Util
• These are core with modern Perl
• Scalar::Util has: blessed, refaddr, reftype, readonly, tainted,
openhandle, weaken, unweaken, is_weak, dualvar,
looks_like_number, isdual, isvstring, set_prototype
• List::Util has: reduce, any, all, none, notall, first, max,
maxstr, min, minstr, product, sum, sum0, pairs, unpairs,
pairkeys, pairvalues, pairgrep, pairfirst, pairmap, shuffle,
uniq, uniqnum, uniqstr
• reduce is cool:
• my $factorial = reduce { $a * $b } 1..20;
• my $commy = reduce { "$a, $b" } @strings;
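A runnable sketch of reduce, using smaller invented inputs than the slide so the results are easy to check by hand.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# reduce folds the list pairwise: $a is the running result, $b the next item.
my $factorial = reduce { $a * $b } 1 .. 10;
my $commy     = reduce { "$a, $b" } qw(red green blue);

print "$factorial\n$commy\n";
```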
Chapter 9. Subroutines
• Making your program modular
• Nice unit of reuse
• Giving name to a set of steps
• Isolating local variables
Section 9.1. Call Syntax
• Use parens
• Don’t use &
• Bad: &foo
• Good: foo()
• Parens help them stand out from builtins
• Lack of ampersand means (rare) prototypes are honored
• Still need the ampersand for a reference though: \&foo
Section 9.2. Homonyms
• Don’t overload the built-in functions
• Perl has weird rules about which have precedence
• Sometimes your code is picked (lock)
• Sometimes, the original is picked (link)
Section 9.3. Argument Lists
• Unpack @_ into named variables
• $_[3] is just ugly
• Even compared to the rest of Perl
• Unpack your args:
• my ($name, $rank, $serial) = @_;
• Use “shift” form for more breathing room:
• my $name = shift; # “Flintstone, Fred”
• my $rank = shift; # 1..10
• my $serial = shift; # 7-digit integer
• Modern Perl can also have signatures
• Built-in starting in 5.20 (but primitive)
• Damian now recommends (his) Method::Signatures
Section 9.4. Named Arguments
• Use named parameters if more than three
• Three positional parameters is enough
• For more than three, use a hash:
• my %options = %{+shift};
• my $name = $options{name} || die "?";
• Pass them as a hashref:
• my_routine({name => "Randal"})
• This catches the odd-number of parameters at the caller,
not the callee.
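A small sketch of a sub taking named parameters through a hashref. The `greeting` option and its default are invented for illustration.

```perl
use strict;
use warnings;

sub my_routine {
    my %options  = %{ +shift };                         # first arg is the hashref
    my $name     = $options{name} || die "name is required";
    my $greeting = $options{greeting} || 'Hello';       # hypothetical optional parameter
    return "$greeting, $name";
}

my $default_greeting = my_routine({ name => 'Randal' });
my $custom_greeting  = my_routine({ name => 'Randal', greeting => 'Howdy' });

print "$default_greeting\n$custom_greeting\n";
```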
Section 9.5. Missing Arguments
• Use definedness rather than boolean to test for
existence
• Bad:
if (my $directory = shift) { ... }
• What if $directory is “0”? Legal!
• Better:
if (dened(my $directory = shift)) { ... }
• Or:
if (@_) { my $directory = shift; ... }
• Automatically processed for item-at-a-time built-ins:

while (my $f = readdir $d) { … }
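• A small sketch contrasting the two tests; dir_boolean() and dir_defined() are hypothetical helpers that only report what they received:

```perl
use strict;
use warnings;

# Boolean test: a legal directory named "0" is treated as missing
sub dir_boolean {
    if (my $directory = shift) { return "got $directory" }
    return "missing";
}

# Definedness test: only a truly absent argument counts as missing
sub dir_defined {
    if (defined(my $directory = shift)) { return "got $directory" }
    return "missing";
}

print dir_boolean("0"), "\n";
print dir_defined("0"), "\n";
```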
Section 9.6. Default Argument Values
• Set up default arguments early
• Avoids temptation to use a quick boolean test later:

my ($text, $arg_ref) = @_;

my $thing = exists $arg_ref->{thing}

? $arg_ref->{thing} : “default thing”;

my $other = exists $arg_ref->{other}

? $arg_ref->{other} : “default other”;
• Or maybe:

my %args = (thing => ‘default thing’,

other => ‘default other’, %$arg_ref);
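• The hash-merge form as a runnable sketch; describe() is a hypothetical routine, and the "|| {}" guard for a missing hashref is an addition not shown on the slide:

```perl
use strict;
use warnings;

sub describe {
    my ($text, $arg_ref) = @_;

    my %args = (
        thing => 'default thing',
        other => 'default other',
        %{ $arg_ref || {} },      # caller's pairs override the defaults
    );

    return "$text: $args{thing}, $args{other}";
}

print describe("demo", { thing => 0 }), "\n";
```

• Because later pairs win in a list assignment to a hash, even a false value such as 0 overrides its default.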
Section 9.7. Scalar Return Values
• Indicate scalar return with “return scalar”
• Ensures that list context doesn’t ruin your day
• Example: returning grep for count (true/false)

sub big10 { return grep { $_ > 10 } @_ }
• Works ok when used as boolean:

if (big10(@input)) { ... }
• Or even to get the count:

my $big_uns = big10(@input);
• But breaks weird when used in a list:

some_other_sub($foo, big10(@input), $bar);
• Fixed: sub big10 { return scalar grep { $_ > 10 } @_ }
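• Both versions side by side, as a sketch; the names big10_list and big10_scalar are only for comparison and the input is sample data:

```perl
use strict;
use warnings;

sub big10_list   { return grep { $_ > 10 } @_ }          # depends on context
sub big10_scalar { return scalar grep { $_ > 10 } @_ }   # always a count

my @input = (5, 15, 25);

# Inside an argument list, the call is in list context
my @args_list   = ('foo', big10_list(@input),   'bar');
my @args_scalar = ('foo', big10_scalar(@input), 'bar');

print scalar(@args_list), " vs ", scalar(@args_scalar), "\n";
```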
Section 9.8. Contextual Return Values
• Make list-returning subs intuitive in scalar context
• Consider the various choices:
• Count of items (like grep/map/arrayname)
• Serialized string representation (localtime)
• First item (caller, each, select, unpack)
• “next” item (glob, readline)
• last item (splice, various slices)
• undef (sort)
• third(!) item (getpwnam)
• See http://www.perlmonks.org/index.pl?node_id=347416
Section 9.9. Multi-Contextual Return Values
• Consider Contextual::Return
• Appropriately Damian-ized (regretfully discouraged now)

use List::Util qw(first);

use Contextual::Return;

sub defined_samples_in {

return (

LIST { grep { defined $_} @_ }

SCALAR { first { defined $_} @_ }

);

}
• It does more, much more. So much more:
• VOID, BOOL, NUM, STR, REF, FAIL, LAZY, LVALUE…
Section 9.10. Prototypes
• Don’t
• Spooky action at a distance
• Not for mortals
• Usable only by wizards
• sub LIST (;&$) { ... }
Section 9.11. Implicit Returns
• Always use explicit return
• Perl returns the last expression evaluated
• But make it explicit:
• return $this;
• Easier to see that the return value is expected
• Protects against someone adding an extra line to the
subroutine, breaking the return
Section 9.12. Returning Failure
• Use “return;” for undef/empty list return
• Not “return undef;”
• Thus it falls out of lists correctly:

my @result = map { yoursub($_) } @input;
• If the caller wants the undef, they can say so:

my @result = map { scalar yoursub($_) } @input;
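• A sketch of the difference, using two hypothetical subs that "fail" on odd numbers:

```perl
use strict;
use warnings;

sub half_bare  { my $n = shift; return       if $n % 2; return $n / 2 }
sub half_undef { my $n = shift; return undef if $n % 2; return $n / 2 }

my @input = (1, 2, 3, 4);

my @bare   = map { half_bare($_) }        @input;   # failures vanish
my @undefs = map { half_undef($_) }       @input;   # failures leave undef slots
my @forced = map { scalar half_bare($_) } @input;   # caller asks for the undef

print scalar(@bare), " ", scalar(@undefs), " ", scalar(@forced), "\n";
```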
Chapter 10. I/O
• It’s off to work we go
• If a program didn’t have I/O
• Could you still tell it ran?
Section 10.1. Filehandles
• Don’t use bareword filehandles
• Can’t easily pass them around
• Can’t easily localize them
• Can’t make them “go out of scope”
Section 10.2. Indirect Filehandles
• Use indirect filehandles:
• open my $foo, “<”, $someinput or die;
• They close automatically
• Can be passed to/from subroutines
• Can be stored in aggregates
• Might need readline() or print {...} though
• my $line = readline $inputs[3];
• print { $output_for{$day} } @list;
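• A sketch of passing and storing indirect handles. In-memory files stand in for real ones so nothing touches the disk; first_line() and the %output_for key are hypothetical:

```perl
use strict;
use warnings;

# Any readable handle can be passed in like an ordinary scalar
sub first_line {
    my ($fh) = @_;
    my $line = readline $fh;
    chomp $line if defined $line;
    return $line;
}

my $data = "alpha\nbeta\n";
open my $in, '<', \$data or die "Cannot open in-memory input: $!";

# A handle stored in an aggregate needs the braces when printing
my %output_for;
my $buffer = '';
open $output_for{monday}, '>', \$buffer
    or die "Cannot open in-memory output: $!";

my $first = first_line($in);
print { $output_for{monday} } "saw $first\n";
close $output_for{monday} or die "Cannot close: $!";
```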
Section 10.3. Localizing Filehandles
• If you must use a package filehandle, localize it
• local *HANDLE
• Beware this also stomps on other package items:
• $HANDLE
• @HANDLE
• %HANDLE
• &HANDLE
• HANDLE dirhandle
• HANDLE format
• So, use it sparingly
• That’s why indirect handles are better
Section 10.4. Opening Cleanly
• Use IO::File or 3-arg open
• IO::File:

use IO::File;

my $in_handle = IO::File->new($name, “<”) or die;
• 3-arg open (with indirect handle):

open my $in_handle, “<”, $name or die;
• 3-arg open ensures safety
• Even if $name starts with < or > or | or ends with |.
• Or whitespace (normally trimmed)
Section 10.5. Error Checking
• Add error checking or die!
• Include the $! as well
• open my $handle, “<”, $that_file
or die “Cannot open $that_file: $!”;
• Also check close and print
• Close and print can fail
• If the file isn’t opened (oops!)
• If the final flush causes an I/O error (oops!)
Section 10.6. Cleanup
• Close filehandles explicitly as soon as possible
• Says Damian
• Or just let them fall out of scope and use a tight scope
• my $line = do { open my $x, “<”, “file”; scalar <$x> };
Section 10.7. Input Loops
• Use while(<>), not foreach(<>)
• While - reads one line at a time
• Exits at undef (no more lines)
• Foreach - reads everything at once
• No reason to bring everything into memory
• Especially when you can’t reference it!
Section 10.8. Line-Based Input
• Similar to previous hint
• If you can grab it a line at a time, do it
• Bad:

print for grep /ITEM/, <INPUT>;
• Good:

while (<INPUT>) { print if /ITEM/ }
Section 10.9. Simple Slurping
• Slurp in a do-block
• Get the entire file:
my $entire_le = do { local $/; <$in> };
• local here automatically provides undef
• The localization protects nearby items
• Dangerous if you had $/ = undef in normal code
• Better/faster/stronger than: join “”, <$in>
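• A runnable sketch of the do-block slurp, reading from an in-memory file (an assumption made so the example needs no real file):

```perl
use strict;
use warnings;

my $content = "line one\nline two\n";
open my $in, '<', \$content or die "Cannot open in-memory file: $!";

# $/ is undef only for the duration of the do-block
my $entire_file = do { local $/; <$in> };

# Afterwards $/ has its previous value again (newline by default)
my $separator_restored = defined($/) && $/ eq "\n";

print length($entire_file), " characters read\n";
```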
Section 10.10. Power Slurping
• Slurp a stream with File::Slurp (from the CPAN)
• Read an entire file:

use File::Slurp qw(read_file);

my $entire_le = read_le $in;
• And a few other interesting options
Section 10.11. Standard Input
• Avoid using STDIN unless you really mean it
• It’s not necessarily the terminal (could be redirected)
• It’s almost never the files on the command line
• It means only “standard input”
• Read from ARGV instead:
my $line = <ARGV>; # explicit form
my $line = <>; # implicit (preferred) form
Section 10.12. Printing to Filehandles
• Always put filehandles in braces in any print statement
• Says Damian
• The rest of us do that when it’s complex:
print $foo @data; # simple
print { $hashref->{foo} } @data; # complex
• If it’s a bareword or a simple scalar, no need for braces
Section 10.13. Simple Prompting
• Always prompt for interactive input
• Keeps the user from wondering:
• Is it running?
• Is it waiting for me?
• Has it crashed?
• Prompt only when interactive though
• Don’t prompt if part of a pipeline
Section 10.14. Interactivity
• Test for interactivity with -t
• Simple test:

if (-t) { ... STDIN is a terminal }
• Damian makes it more complex
• Suggests IO::Interactive from CPAN
Section 10.15. Power Prompting
• Use IO::Prompt for prompting (in CPAN)
• my $line = prompt “line, please? ”;
• Automatically chomps (yeay!)
• my $password = prompt “password:”,

-echo => ‘*’;
• my $choice = prompt ‘letter’, -onechar,

-require => { ‘must be [a-e]’ => qr/[a-e]/ };
• Many more options
• Damian followed up with IO::Prompter (even more
options)
Section 10.16. Progress Indicators
• Interactive apps need progress indicators for long stuff
• People need to know “working? or hung?”
• And “Coffee? Or take the afternoon off?”
• Good estimates of completion time are handy
• Spooky estimates can backfire
• 90% of ticks come within 2 seconds
• last 10% takes 5 minutes... oops
Section 10.17. Automatic Progress Indicators
• Damianized Smart::Comments in CPAN
• Example:
use Smart::Comments;
for my $path (split /:/, $ENV{PATH}) {
### Checking path... done
for my $file (glob “$path/*”) {
### Checking files... done
if (-x $file) { ... }
}
}
Section 10.18. Autoflushing
• Avoid using select() to set autoflushing:
select((select($fh), $| = 1)[0]);
• Yeah, so I came up with that
• I must have been, uh... confused
• Use IO::Handle (not needed in Perl 5.14 and later)
• Then you can call $fh->autoflush(1)
• And even *STDOUT->autoflush(1)
Chapter 11. References
• Can’t get jobs without them
• Can’t get to indirect data without them either
• Symbolic (soft) references are evil
Section 11.1. Dereferencing
• Prefer arrows over circumfix dereferencing
• Bad: ${$foo{bar}}[3]
• Good: $foo{bar}->[3]
• Can’t reduce: @{$foo{bar}}[3, 4]
• Until Perl 5.20
• $foo{bar}->@[3, 4]
• Might require enabling the feature
Section 11.2. Braced References
• Always use braces for circumfix dereferencing
• Bad: @$foo[3, 4]
• Good: @{$foo}[3, 4]
• Makes the deref item clearer
• Needed anyway when deref item is complex
• Or flip it around to postfix form
Section 11.3. Symbolic References
• Don’t
• use strict ‘refs’ prevents you anyway
• If you think of the symbol table as a hash...
• ... you can see how to move it down a level anyway
• Bad: $x = “K”; ${$x} = 17;
• Good: $x = “K”; $my_hash{$x} = 17;
Section 11.4. Cyclic References
• Cycles cannot be reference-counted nicely
• Break the cycle with weaken
• Use weaken for every “up” link
• When kids need to know about their parents
• use Scalar::Util qw(weaken);

my $root = {};

my $leaf1 = { parent => $root };

weaken $leaf1->{parent};

my $leaf2 = { parent => $root };

weaken $leaf2->{parent};

push @{$root->{kids}}, $leaf1, $leaf2;
• Now we can toss $root nicely
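• A sketch that checks the tree really is freed. The Node class and its DESTROY counter are hypothetical, added only to observe destruction:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

{
    package Node;    # hypothetical class that counts its own destruction
    sub new     { my ($class, %args) = @_; return bless +{%args}, $class }
    sub DESTROY { $destroyed++ }
}

{
    my $root = Node->new;
    my $leaf = Node->new(parent => $root);
    weaken $leaf->{parent};              # "up" link holds no reference count
    push @{ $root->{kids} }, $leaf;      # "down" link stays strong
}
# Leaving the block frees both nodes, since only one direction is strong

print "destroyed: $destroyed\n";
```

• Without the weaken call, root and leaf would keep each other alive after the block ends.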
Chapter 12. Regular Expressions
• Some people, when confronted with a problem, think:

"I know, I'll use regular expressions".

Now they have two problems.

—Jamie Zawinski
Section 12.1. Extended Formatting
• Always use /x
• Internal whitespace is ignored
• Comments are simple #-to-end-of-line
• Hard: /^(.*?)(\w+).*?\2\1$/
• Easy(er):

/^

(.*?) (\w+) # some text and an alpha-word

.*? # anything

\2 \1 # the word followed by text again

$/x
• Beware: this will break old regex if you’re not careful
Section 12.2. Line Boundaries
• Always use /m
• ^ means
• Start of string
• Just after any embedded newline
• $ means
• End of string
• Just before any embedded newline
• Damian argues that this is closer to what people expect
• I say use it when it’s what you want
Section 12.3. String Boundaries
• Use \A and \z
• \A is the old ^
• \z is the old $
Section 12.4. End of String
• Use \z not \Z or $
• It’s really “end of string”
• Not “end of string but maybe before \n”
• Security hole: if ($value =~ /^\d+$/) { ... }
• Value could be “35\n”
• That might ruin a good day
• Fix: if ($value =~ /\A\d+\z/) { ... }
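• The trailing-newline hole as a two-line check; the variable names are only for illustration:

```perl
use strict;
use warnings;

my $value = "35\n";     # a trailing newline sneaks in

my $loose  = $value =~ /^\d+$/   ? 1 : 0;   # $ tolerates a final newline
my $strict = $value =~ /\A\d+\z/ ? 1 : 0;   # \z means the very end

print "loose=$loose strict=$strict\n";
```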
Section 12.5. Matching Anything
• Always use /s
• Turns . into “match anything”
• Not “match anything (except the newline)”
• This is more often what people want
• Says Damian
• Use it when you need it
Section 12.6. Lazy Flags
• If you’re as crazy as Damian
• And as lazy as Damian
• Consider Regexp::Autoflags
• Automatically adds /xms to all regexp
• But then notice that the book lies
• Because it’s actually Regexp::DefaultFlags in the CPAN
Section 12.7. Brace Delimiters
• Prefer m{...} to /.../ for multiline regex
• Open-close stands out easier
• Especially when alone on a line
• Heck, use ‘em even for single line regex
• Particularly if the regex contains a slash
• Balancing is easy:
m{ abc{3,5} } # “ab” then 3-5 c’s
m{ abc\{3,5\} } # “abc{3,5}” all literal
Section 12.8. Other Delimiters
• Don’t use any other delimiters
• Unless you’re sure that Damian’s not looking
• But seriously...
• m#foo# is cute
• But annoying
• And m,foo, is downright weird
Section 12.9. Metacharacters
• Prefer singular character classes to escaped metachars
• Instead of “\.”, use [.]
• Instead of “\ ” (escaped space), use [ ] (space in [])
• Advantage: brackets are balanced
Section 12.10. Named Characters
• Prefer named characters to escaped metacharacters
• Example:

use charnames qw(:full);

if ($escape_seq =~ m{

\N{DELETE}

\N{ACKNOWLEDGE}

\N{CANCEL} \Z

}xms) { ... }
• Downside: now you have to know these names
• Is anyone really gonna type \N{SPACE} ?
Section 12.11. Properties
• Prefer properties to character classes
• Works better with Unicode
• Such as: qr{ \p{Uppercase} \p{Alphabetic}* }xms;
• “perldoc perlunicode” lists the properties
• You can even create your own
Section 12.12. Whitespace
• Arbitrary whitespace better than constrained whitespace
• So use \s* rather than \N{SPACE}*: it matches even “weird” whitespace
Section 12.13. Unconstrained Repetitions
• Be careful when matching “as much as possible”
• For example:

“Activation=This=that” =~ /(.*)=(.*)/
• $1 is “Activation=This”!
• Consider .*? instead:

“Activation=This=that” =~ /(.*?)=(.*)/
• Now we have $1 = “Activation”, $2 = “This=that”
• Don’t do this blindly:

“Activation=This=that” =~ /(.*?)=(.*?)/
• Now $2 has nothing!
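• The three patterns above, run against the same string in list context so the captures can be compared directly:

```perl
use strict;
use warnings;

my $line = "Activation=This=that";

my @greedy = $line =~ /(.*)=(.*)/;      # first group grabs as much as it can
my @lazy   = $line =~ /(.*?)=(.*)/;     # first group stops at the first "="
my @both   = $line =~ /(.*?)=(.*?)/;    # trailing lazy group matches nothing

print join(" | ", @greedy), "\n";
print join(" | ", @lazy),   "\n";
print join(" | ", @both),   "\n";
```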
Section 12.14. Capturing Parentheses
• Use capturing parens only when capturing
• Every ( ... ) in a regex is a capture group
• This affects $1, etc
• But it also affects speed
• When you don’t need to capture, don’t!
• Use (?: ... )
• Yes, a little more typing
• But it’s clear that you don’t need that value
Section 12.15. Captured Values
• Use $1 only when you’re sure you had a match
• Common error:

$foo =~ /(.*?)=(.*)/;

push @{$hash{$1}}, $2;
• What if the match fails?
• Pushes the previous $2 onto the previous $1
• Always use $1 with a conditional
• Even if it can never fail:
• Add “or die”:

$foo =~ /(.*?)=(.*)/ or die “Should not happen”;
• Then you know something broke somewhere
Section 12.16. Capture Variables
• Name your captures
• There can be quite a distance between
• / (\w+) \s+ (\w+) /xms
• $1, $2
• Easy to make mistake, or modify and break things
• Instead, name your captures when you can:
• my ($given, $surname) = /(\w+)\s+(\w+)/;
• If this match fails in a scalar context, it returns false
Section 12.17. Piecewise Matching
• Tokenize input using /gc
• Also called “inchworming”
• Match string against a series of /\G.../gc constructs
• If one succeeds, pos() is updated
• Example:
{ last if /\G\z/gc; # exit at end of string
if (/\G(\w+)/gc) { push @tokens, “keyword $1” }
elsif (/\G(“.*?”)/gcs) { push @tokens, “quoted $1” }
elsif (/\G\s+/gc) { ... ignore whitespace ... }
else { die “Cannot parse:”, /\G(.*)/s } # grab rest
redo;
}
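• The same inchworm wrapped in a sub so it can be run as-is. This is a sketch: tokenize() and its token labels are hypothetical, and the for loop is there only to alias $_ to the text:

```perl
use strict;
use warnings;

sub tokenize {
    my ($text) = @_;
    my @tokens;

    for ($text) {                       # alias $_ to the text being scanned
        {
            last if /\G\z/gc;                               # end of string
            if    (/\G(\w+)/gc)    { push @tokens, "keyword $1" }
            elsif (/\G(".*?")/gcs) { push @tokens, "quoted $1" }
            elsif (/\G\s+/gc)      { }                      # skip whitespace
            else  { /\G(.*)/s; die "Cannot parse: $1" }     # grab the rest
            redo;                                           # try next token
        }
    }
    return @tokens;
}

print "$_\n" for tokenize('say "hi there" now');
```

• The /c flag matters: without it, the first failed match would reset pos() and the scan would start over from the beginning.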
Section 12.18. Tabular Regexes
• Build regex from tables
• To replace keys of %FOO with values of %FOO
• Construct the matching regex:

my $regex = join “|”, map quotemeta,

reverse sort keys %FOO;

$regex = qr/$regex/; # if used more than once
• Do the replacement:

$string =~ s/($regex)/$FOO{$1}/g;
• It works
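• A worked sketch with a made-up three-entry table. The reverse sort puts “catalog” ahead of its prefix “cat”, so the longer key wins:

```perl
use strict;
use warnings;

# Sample replacement table (hypothetical data)
my %FOO = (cat => 'feline', catalog => 'index', dog => 'canine');

# Escape each key and try longer keys before their prefixes
my $regex = join "|", map { quotemeta($_) } reverse sort keys %FOO;
$regex = qr/$regex/;

my $string = "dog eats catalog, cat watches";
$string =~ s/($regex)/$FOO{$1}/g;

print "$string\n";
```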
Section 12.19. Constructing Regexes
• Build complex regex from simpler pieces
• Example:

my $DIGITS = qr{ \d+ (?: [.] \d*)? | [.] \d+ }xms;

my $SIGN = qr{ [+-] }xms;

my $EXPONENT = qr{ [Ee] $SIGN? \d+ }xms;

my $NUMBER = qr{ ( ($SIGN?) ($DIGITS)
($EXPONENT?) ) }xms;
• Now you can use /$NUMBER/
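• The pieces assembled and applied to a sample string (the input text is made up). Each qr// piece keeps its own flags and grouping when interpolated:

```perl
use strict;
use warnings;

my $DIGITS   = qr{ \d+ (?: [.] \d*)? | [.] \d+ }xms;
my $SIGN     = qr{ [+-] }xms;
my $EXPONENT = qr{ [Ee] $SIGN? \d+ }xms;
my $NUMBER   = qr{ ( ($SIGN?) ($DIGITS) ($EXPONENT?) ) }xms;

# In list context the match returns the four captures in order
my ($whole, $sign, $digits, $exponent) = "x = -12.5e+3;" =~ $NUMBER;

print "$whole => sign '$sign', digits '$digits', exponent '$exponent'\n";
```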
Section 12.20. Canned Regexes
• Use Regexp::Common
• Don’t reinvent the common regex
• Use the expert coding in Regexp::Common
• $number =~ /$RE{num}{real}/
• $balanced_parens =~ /$RE{balanced}/
• $bad_words =~ /$RE{profanity}/
• Module comes with 2,315,992 tests (over 4 minutes to
run the tests!)
Section 12.21. Alternations
• Prefer character classes to single-char alternations
• [abc] is far faster than (a|b|c)
• As in, 10 times faster
• Consider refactoring too
• Replace: (a|b|ca|cb|cc)
• With: (a|b|c[abc])
• Same job, and again faster
Section 12.22. Factoring Alternations
• An advancement of previous hint
• Slow: /with \s+ foo|with \s+ bar/x
• Faster: /with \s+ (foo|bar)/x
• Keep refactoring until it hurts
• Beware changes to $1, etc
• Also beware changes to order of attempted matches
Section 12.23. Backtracking
• Prevent useless backtracking
• Consider (?> ... )
• Once it has matched, it’s backtracked as a unit
Section 12.24. String Comparisons
• Prefer eq to fixed-pattern regex matches
• Don’t say $foo =~ /\ABAR\z/
• Use $foo eq “BAR”
• Don’t say $foo =~ /\ABAR\z/i
• Use (uc $foo) eq “BAR”
Chapter 13. Error Handling
• Hopefully, everything works
• But when things go bad, you need to know
• Sometimes, it’s minor, and just needs repair
• Sometimes, it’s major, and needs help
• Sometimes, it’s catastrophic
Section 13.1. Exceptions
• Throw exceptions instead of funny returns
• People might forget to test for “undef”
• But they have to work pretty hard to ignore an
exception
• Exceptions can be easily caught with eval {}
• But better to use Try::Tiny
• Or for more flexibility, TryCatch
Section 13.2. Builtin Failures
• Use Fatal to turn failures into exceptions
• Tired of forgetting “or die” on “open”?
• use Fatal qw(:void open);

...

open my $f, “BOGUS FILE”; # dies!
• Yes, finally open does the right thing
• Damian also suggests Lexical::Failure
Section 13.3. Contextual Failure
• You can tell Fatal to be context-sensitive
• So this still works:
• unless (open my $f, “BOGUS”) { ... }
• That’s dangerous though
• People might use it in a non-void context without
checking
• Says Damian
• I think you can use it with discipline
Section 13.4. Systemic Failure
• Be careful with system()
• Returns 0 (false) if everything is OK
• Returns non-zero if something broke
• Technically, the return value of wait():
• Exit status * 256
• Plus 128 if core was dumped
• Plus signal number that killed it, if any
• 0 = “good”, non-zero = “bad”
• Damian suggests WIFEXITED:

use POSIX qw(WIFEXITED); WIFEXITED(system ...)
• Or just add “!” in front:

!system “foo” or die “...”; # no access to $! here!
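• The arithmetic above as a sketch. decode_status() is a hypothetical helper that takes a wait() status, such as the value left in $? after system(), and splits it into its parts:

```perl
use strict;
use warnings;

# Split a wait() status into exit code, core-dump flag, and signal number
sub decode_status {
    my ($status) = @_;
    return +{
        exit   => $status >> 8,               # exit status * 256
        core   => ($status & 128) ? 1 : 0,    # plus 128 if core was dumped
        signal => $status & 127,              # plus the signal number, if any
    };
}

my $info = decode_status(139);    # 128 + 11: killed by signal 11 with a core
print "exit=$info->{exit} core=$info->{core} signal=$info->{signal}\n";
```

• After a real system() call, passing $? to a helper like this gives a readable failure report.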
Section 13.5. Recoverable Failure
• Throw exceptions on all failures
• Including recoverable ones
• Easy enough to put an eval {} around it
• Don’t use undef
• Developers don’t always check
• Says Damian
• I say, trust your developers a bit
Section 13.6. Reporting Failure
• Carp blames someone else
• use Carp qw(croak);

... open my $f, $file or croak “$file?”;
• Now the caller is blamed
• Makes sense if it’s the caller’s fault
Section 13.7. Error Messages
• Error messages should speak user-based terminology
• Nobody knows ‘@foo’
• Better chance of knowing ‘input files’
• Be as detailed as you can
• Remember that this is all you’re likely to get in the bug
report
• And they will file bug reports
Section 13.8. Documenting Errors
• Document every error message
• Explain it in even more detail
• Again in the user’s terminology
• Nothing more frustrating than a confusing error message
• ... that isn’t explained!
• Good example: “perldoc perldiag”
Section 13.9. OO Exceptions
• Throw an object instead of text
• The object can contain relevant information
• Exception catchers can pull info apart
• Exception objects can also have nice stringifications
• This helps naive catchers that print “Error: $@”
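• A hand-rolled sketch of an exception object with a friendly stringification. My::Error and its fields are hypothetical, and plain eval is used to keep the example core-only:

```perl
use strict;
use warnings;

{
    package My::Error;     # hypothetical exception class
    use overload '""' => sub { $_[0]->message }, fallback => 1;

    sub new     { my ($class, %args) = @_; return bless +{%args}, $class }
    sub throw   { my $class = shift; die $class->new(@_) }
    sub file    { return $_[0]{file} }
    sub message { my $self = shift; return "$self->{text} ($self->{file})" }
}

my ($file, $naive);
if (!eval { My::Error->throw(text => "Cannot read", file => "input.txt"); 1 }) {
    my $error = $@;                  # grab $@ early
    $file  = $error->file;           # informed catcher pulls the info apart
    $naive = "Error: $error";        # naive catcher still gets readable text
}

print "$naive\n";
```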
Section 13.10. Volatile Error Messages
• Relying on catchers to parse text is dangerous
• Throwing an object isolates text changes
• Additional components can be added later
• Likely won’t break existing code
Section 13.11. Exception Hierarchies
• Use exception objects when exceptions are related
• Give them different classes, but common inheritance
• Catchers can easily test for categories of exceptions
• Or distinguish specific items when it matters
Section 13.12. Processing Exceptions
• Test the most specific exception first
• For example, log file error, before generic file error
Section 13.13. Exception Classes
• Use Exception::Class for nice exceptions
• Exception::Class-based exceptions capture:
• caller
• current filename, linenumber, package
• user data
• Stringify nicely for naive displays (customized if needed)
• Create hierarchies for classified handling
• Consider “Throwable” role for Moo or Moose
Section 13.14. Unpacking Exceptions
• Grab $@ early in your catcher
• Then parse and act on it
• One of your parsing steps may alter $@ accidentally
• Best to capture it first!
• Or consider Try::Tiny, which puts $@ safely into block-
scoped $_
Chapter 14. Command-Line Processing
• The great homecoming
• In the beginning, everything was command-line
• Perl still works well here
• But you should follow some standards and conventions
Section 14.1. Command-Line Structure
• Enforce a single consistent command-line structure
• GNU tools
• Need I say more?
• Every tool, slightly different
• Pick a standard, stick with it
• Especially amongst a tool suite
• If you use -i for input here, use it there too
Section 14.2. Command-Line Conventions
• Use consistent APIs for arguments
• Require a flag in front of every arg, except filenames
• Use - prefix for single-letter flags
• Use -- prefix for longer flags
• Provide long flag aliases for every single-letter flag
• Always allow “-” as a filename
• Always use “--” to indicate “end of args”
Section 14.3. Meta-options
• Standardize your meta options
• --usage
• --help
• --version
• --man
Section 14.4. In-situ Arguments
• If possible, allow same file for both input and output
• foo_command -i thisfile -o thisfile
• Obviously might require some jiggery
• Or IO::InSitu from the CPAN:

my $in_name = ....;

my $out_name = ....;

my ($in, $out) = open_rw($in_name, $out_name);

while (<$in>) { print $out “$.: $_” }
Section 14.5. Command-Line Processing
• Standardize on a single approach to command-line
processing
• Look at Getopt::Long (core)
• For a more Damianized approach, Getopt::Declare
• Spell out your usage, and the args follow!
• Manpage is longer than most program manpages
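• A minimal Getopt::Long sketch following the conventions from Section 14.2. The flag names and file names are hypothetical, and GetOptionsFromArray is used so the example parses a sample list instead of the real @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample command line: short flag, long flag, then "--" to end the options
my @args = ('-i', 'in.txt', '--output', 'out.txt', '--', 'extra.txt');

my ($input, $output, $help);
GetOptionsFromArray(
    \@args,
    'input|i=s'  => \$input,     # every single-letter flag has a long alias
    'output|o=s' => \$output,
    'help|h'     => \$help,
) or die "Usage: tool [-i FILE] [-o FILE] [--help] [FILE...]\n";

# What remains in @args are the non-option arguments
print "input=$input output=$output rest=@args\n";
```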
Section 14.6. Interface Consistency
• Ensure interface, run-time, and docs are consistent
• Usage messages should match what the code does
• Simple with Getopt::Declare!
• Help docs should match what the manpage says
• If Getopt::Declare wasn’t enough...
• ... consider Getopt::Euclid
• Constructs your parser from POD
• Then the manpage definitely matches the code!
• Again, suitably Damianized
• Requires many “aha!” moments to fully understand
Section 14.7. Interapplication Consistency
• Factor out common CLI items to shared modules
• If every tool has -i and -o, don’t keep rewriting that
• Getopt::Euclid “pod modules” can provide common code
• Also permits readers to learn it once
• Instead of having to recognize it each time
• “This program implements L<Our::Standard::CLI>.”
Chapter 15. Objects
• Used by practically every large practical program
• Even used by many little programs
• “There’s more than one way to implement it”
• Hash-based? Array-based? Inside-out?
• Using some framework, or hand-coded?
• So what’s best?
• Here we go...
Section 15.1. Using OO
• Make OO a choice, not a default
• Don’t introduce needless objects
• Especially over-complex frameworks
• You probably don’t need a different class for every token
Section 15.2. Criteria
• Use objects if you have the right conditions
• Large systems
• Obvious large structures
• Natural hierarchy (polymorphism and inheritance)
• Many different operations on the same data
• Same or similar operations on related data
• Likely to add new types later
• Operator overloading can be used logically
• Implementations need to be hidden
• System design is already OO
• Large numbers of other programmers use your code
Section 15.3. Pseudohashes
• Don’t
• “We’re sorry”
• Removed in 5.10 anyway (long time ago!)
Section 15.4. Restricted Hashes
• Don’t
• Not really restricted
• For every lock, there’s a means to unlock
• Inside-out objects handle most of the needs here
Section 15.5. Encapsulation
• Don’t bleed your guts
• Provide methods for all accessors
• If you want to provide hashref access
• Provide a hashref overload
• Prepare to pay the price for such access
• Consider inside-out objects to ensure guts are private
Section 15.6. Constructors
• Give every constructor the same standard name
• And that’s “new”, duh
• Says Damian
• Or, just use what’s natural
• After all, DBI->connect does it.
• Or rewrite that as DBI->new({ connect => [...] })
• Not likely any time soon
Section 15.7. Cloning
• Don’t clone in your constructor
• $instance->new should be an error
• If you want an “object of the same type as...”, use ref:

my $similar = (ref $instance)->new(...);
• If you want a proper clone, write a clone/copy routine:

my $clone = $instance->clone;
• Note: please stop cargo-culting:

sub new {

my $self = shift;

my $class = ref $self || $self; # bad

...

}
Section 15.8. Destructors
• Inside out objects need destructors
• Or they leak
• And be sure to destroy all the parent classes too:

sub DESTROY {

my $deadbody = shift;

... destroy my attributes here ...

for (our @ISA) {

my $can = $_->can(“DESTROY”);

$deadbody->$can if $can;

}

}
• Or use “NEXT” in the CPAN
Section 15.9. Methods
• Methods are subroutines
• Except methods can be same-named as built-ins
• They’ll never be confused with subroutines
Section 15.10. Accessors
• Provide separate read and write accessors
• Same-named accessors might have spooky non-behavior:

$person->name(“new name”);

$person->rank(“new rank”);

$person->serial_number(“new number”); # read only!
• Of course, these double-duty routines should abort
• But they often don’t
• Also, differently named accessors are easier to grep for
• Help understand places a value could change
Section 15.11. Lvalue Accessors
• Don’t use lvalue accessors
• Require returning direct access to a variable
• No place to do error checking
• Or require returning tied var
• Slow
• Really slow
• Like, slower than just a method call
Section 15.12. Indirect Objects
• Indirect object syntax is broken
• Bad: my $item = new Thing;
• Good: my $item = Thing->new;
• No chance of mis-parse
• Mis-parse has bitten some Very Smart People
• If you think you’re smarter than some Very Smart People
• ... let us know how that’s working out for you
Section 15.13. Class Interfaces
• Provide optimal interface, not minimal one
• Consider common operations and implement them
• As class author, you can optimize internally
• Class users have to use your published interface
• Worse, they might repeat the code, or build layers
• Resulting in more wrappers or repeated code!
Section 15.14. Operator Overloading
• Overload only for natural-mapping operator
• Don’t just pick operators and map them
• You’ll get bit by precedence, most likely
• Or leave one out, and really confuse users
• Don’t make the C++ mistake: “<<” meaning write-to
Section 15.15. Coercions
• If possible, provide sane coercions:
• Boolean (for tests)
• Numeric
• String
• If your object is “logically empty”, then:
if ($your_object) { ... } should test false, otherwise true
• And “$your_object” shouldn’t provide a hex address
• Overloaded numerics and strings are also used in sorting
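• A sketch of the three coercions using the core overload pragma. Basket is a hypothetical class whose “logically empty” state is an empty item list:

```perl
use strict;
use warnings;

{
    package Basket;    # hypothetical container class
    use overload
        'bool'   => sub { @{ $_[0]{items} } ? 1 : 0 },      # empty is false
        '""'     => sub { join ", ", @{ $_[0]{items} } },   # readable string
        '0+'     => sub { scalar @{ $_[0]{items} } },       # number of items
        fallback => 1;

    sub new { my ($class, @items) = @_; return bless +{ items => [@items] }, $class }
}

my $empty = Basket->new;
my $full  = Basket->new(qw(apple pear));

print "full basket: $full\n" if $full;
print "empty basket is false\n" unless $empty;
```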
Chapter 16. Class Hierarchies
• Objects get their power from inheritance
• Designing inheritance well takes practice
Section 16.1. Inheritance
• Don’t manipulate @ISA directly
• Prefer “use parent”:
use parent qw(This That Other);
• Nice side-effect of loading those packages automatically
• Or use a framework that establishes this directly
• Replaces the heavier-weight “use base”
Section 16.2. Objects
• Use inside-out objects
• Said Damian back in the day
• Avoids attribute collisions
• Encapsulates attributes securely
• These days, somewhat discouraged
• Makes objects a bit too opaque
• Still useful for hostile subclassing though
Section 16.3. Blessing Objects
• Never use one-argument bless
• Blessing into “the current package” breaks subclassing:

package Parent;

sub new { bless {} }

package Child;

use base qw(Parent);

package main;

my $kid = Child->new;

print ref $kid; # Parent??
• That should have been:

package Parent; sub new { bless {}, shift }
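• Both constructors in one runnable sketch. The numbered package names are hypothetical, and “use parent -norequire” is used because the parent classes live in the same file:

```perl
use strict;
use warnings;

{ package Parent1; sub new { return bless {} } }            # one-arg: broken
{ package Parent2; sub new { return bless {}, shift } }     # two-arg: correct
{ package Child1;  use parent -norequire, 'Parent1'; }
{ package Child2;  use parent -norequire, 'Parent2'; }

my $kid1 = Child1->new;
my $kid2 = Child2->new;

print ref($kid1), " / ", ref($kid2), "\n";
```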
Section 16.4. Constructor Arguments
• Pass constructor args as a hashref
• Gives you key/value pairs easily
• Broken hashref caught at the caller, not “new”
• “Odd number of elements...”
• Derived class constructor can pick out what it wants
• Pass the remaining items up to base-class constructor
Section 16.5. Base Class Initialization
• Distinguish initializers by class name
• To avoid collisions:

my $thing = Child->new(

child => { this => 3, that => 5 },

parent => { other => 7 });
• Says Damian
• I say it reveals too much of the hierarchy
• Gets complex if Parent/Child are refactored
• Use logical names for attributes, not class-based names
Section 16.6. Construction and Destruction
• Separate construction, initialization, and destruction
• Multiple inheritance breaks if every class does its own
bless
• Instead, all classes should separate bless from initialize
• Suggested name “BUILD”
• Constructor can call base constructor, then all BUILD
routines in hierarchy
• Similar strategy for DESTROY
• Or just use Moose or Moo
Section 16.7. Automating Class Hierarchies
• Build standard class infrastructure automatically
• Getting accessors and privacy correct can take special
care
• Use inside-out objects for safety and best development
speed
• Class::Std or Object::InsideOut
Section 16.8. Attribute Demolition
• Use Class::Std or Object::InsideOut to ensure
deallocation of attributes
• Calls DESTROY in all the right places
Section 16.9. Attribute Building
• Initialize and verify attributes automatically
• Again, Class::Std or Object::InsideOut to the rescue
• Calls BUILD in all the right places
• Or attribute hashes can be annotated for the common
case
Section 16.10. Coercions
• Specify coercions as attributed methods
• Again, provided with Class::Std/Object::InsideOut
• For numeric context:

sub count : NUMERIFY { return ... }
• For string context:

sub as_string : STRINGIFY { return ... }
• For boolean context:

sub is_true : BOOLIFY { return ... }
Section 16.11. Cumulative Methods
• Use attributed cumulative methods instead of SUPER
• Class::Std/Object::InsideOut provide CUMULATIVE:

sub items : CUMULATIVE { ... }
• When called in a list context, all calls are made
• Result is the concatenated separate lists
• Scalar context similar
• Result is the string concatenation of all calls
• No need for “SUPER::”
• Works even with multiple inheritance
Section 16.12. Autoloading
• Don’t use AUTOLOAD
• Effectively creates a runtime interface
• Not a compile time interface
• Must be prepared for
• Methods you don’t want to handle
• DESTROY
• Calling “NEXT” for either of those is tricky
Chapter 17. Modules
• Reusable collections of subroutines and their data
• Unit of exchange for the CPAN
• Unit of testability internally
Section 17.1. Interfaces
• Design the module’s interface first
• A bad interface makes a module unusable
• Think about how your module will be used
• Name things from the user’s perspective
• Not from how it’s implemented
• Create examples, or even “use cases” for your interface
• Keep the examples around for your documentation!
Section 17.2. Refactoring
• Place original code inline
• Place duplicated code in a subroutine
• Place duplicated subroutines in a module
Section 17.3. Version Numbers
• Use 3-part version numbers
• But vstrings are broken
• Didn’t get them right until 5.8.1
• Deprecating them for 5.10!
• Use the “version” module from the CPAN:
use version; our $VERSION = qv(‘1.0.3’);
• Also supports “development versions”:
use version; our $VERSION = qv(‘1.5_12’);
Section 17.4. Version Requirements
• Enforce version requirements programmatically
• Enforce Perl version:

use 5.008; # 5.8 or greater, only
• Enforce package versions:

use List::Util 1.13 qw(max);
• Use “use only” for more precise control
• Get “only” from the CPAN
Section 17.5. Exporting
• Export carefully
• Your default export list can’t change in the future
• Might break something if new items added
• Where possible, only on request
• Flooding the namespace is counterproductive
• Especially with common names, like “fail” or “display”
Section 17.6. Declarative Exporting
• Export declaratively
• Perl6::Export::Attrs module:

package Test::Utils; use Perl6::Export::Attrs;

sub ok :Export(:DEFAULT, :TEST, :PASS) { ... }

sub pass :Export(:TEST, :PASS) { ... }

sub fail :Export(:TEST) { ... }

sub skip :Export () { ... } # can leave () off
• Now @EXPORT will contain only “ok” (:DEFAULT)
• And @EXPORT_OK will contain all 4
• And the :TEST and :PASS groups will be set up:

use Test::Utils qw(:PASS); # ok, pass
Section 17.7. Interface Variables
• Export subroutines, not data
• Yes, let’s create a module, but then reintroduce global
variables
• OK, that’s a bad idea
• Variables have only get/set interface
• And very little checking (unless you “tie”)
• Export subroutines to give more control
• And more flexibility for the future
Section 17.8. Creating Modules
• Build new module frameworks automatically
• The classic: h2xs
• Modern:
• ExtUtils::ModuleMaker
• Module::Starter
Section 17.9. The Standard Library
• Use core modules when possible
• See “perldoc perlmodlib”
• Lots and lots of things
• The core gets bigger over time
• See Module::CoreList to know when
• Command line “corelist”
Section 17.10. CPAN
• Use CPAN modules where feasible
• Smarter people than you have hacked the code
• Or at least had more time than you have
• Reuse generally means higher quality code
• Also means communities can form
• Questions answered
• Pool of talent available
• Someone else might fix bugs
• metacpan.org has the best interface
Chapter 18. Testing and Debugging
• We all write perfect software!
• In our dreams...
• Every program has at least one bug
• The trick is... where?
• Testing and debugging for the win
Section 18.1. Test Cases
• Write tests first
• Turn your examples into tests
• Create the tests before coding
• Run the tests, which should fail
• Code (correctly!)
• Now the tests should succeed
• Tests are also documentation
• So, comment your tests
Section 18.2. Modular Testing
• Use Test::More
• Testing testing testing!
• Don’t code your tests ad-hoc
• Use the proven Perl framework
• “perldoc Test::Tutorial”
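• A minimal Test::More sketch: the documented examples for a function become its tests
• celsius_to_fahrenheit is a hypothetical function, included only so the tests have something to run against

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test
sub celsius_to_fahrenheit {
    my ($celsius) = @_;
    return $celsius * 9 / 5 + 32;
}

# The examples from the documentation, written as tests
is( celsius_to_fahrenheit(0),   32,  'freezing point' );
is( celsius_to_fahrenheit(100), 212, 'boiling point' );
is( celsius_to_fahrenheit(-40), -40, 'the scales cross at -40' );

done_testing();
```

• Written first, these fail until the sub exists and is correct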
Section 18.3. Test Suites
• For local things, create additional Test::* mixins
• Use Test::Harness as a basis for additional tests
• New harness is being developed
• Separates reporting from detecting
Section 18.4. Failure
• Write test cases to make sure that failures actually fail
• This is psychologically hard sometimes
• Have to out-trick yourself
• Good task for pairing with someone else
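• One way to sketch "failures actually fail": feed the code bad input and insist that it dies with the right message
• parse_port and its error text are hypothetical

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function that must reject bad input
sub parse_port {
    my ($text) = @_;
    die "port must be a number between 1 and 65535\n"
        unless defined $text
        && $text =~ /\A\d+\z/
        && $text >= 1
        && $text <= 65535;
    return $text + 0;
}

is( parse_port('8080'), 8080, 'valid port is accepted' );

for my $bad ( undef, '', 'http', '0', '65536', '80 ' ) {
    my $lived = eval { parse_port($bad); 1 };
    my $error = $@;    # save before anything else can touch $@
    my $label = defined $bad ? "'$bad'" : 'undef';

    ok( !$lived, "$label is rejected" );
    like( $error, qr/port must be a number/, "$label gives the expected error" );
}

done_testing();
```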
Section 18.5. What to Test
• Test the likely and the unlikely
• Some ideas:
• Minimum and maximum, and near those
• Empty strings, multiline strings, control characters
• undef and lists of undef, “0 but true”
• Inputs that might never be entered
• Non-numeric data for numeric parameters
• Missing arguments, extra arguments
• Using the wrong version of a module
• And every bug you’ve ever encountered
• The best testers have the most scars
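• Some of those ideas as a table-driven sketch
• largest is a hypothetical function under test; each row is a name, an input list, and the expected result

```perl
use strict;
use warnings;
use Test::More;
use List::Util qw(max);

# Hypothetical function under test: largest defined value, or undef if none
sub largest {
    my @defined = grep { defined } @_;
    return @defined ? max(@defined) : undef;
}

my @cases = (
    [ 'typical list',   [ 3, 9, 4 ],         9     ],
    [ 'single item',    [7],                 7     ],
    [ 'all negative',   [ -5, -2, -9 ],      -2    ],
    [ 'duplicates',     [ 4, 4, 4 ],         4     ],
    [ 'empty list',     [],                  undef ],
    [ 'only undef',     [ undef, undef ],    undef ],
    [ 'undef mixed in', [ undef, 1, undef ], 1     ],
);

for my $case (@cases) {
    my ( $name, $input, $expected ) = @$case;
    my $got = largest(@$input);
    is( $got, $expected, $name );
}

done_testing();
```

• Every new bug report becomes one more row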
Section 18.6. Debugging and Testing
• Add new test cases before debugging
• A bug report is a test case!
• Add it to your test suite
• (You do have a test suite, right?)
• Then debug.
• You’ll know the bug is gone when the test passes
• And you’ll never accidentally reintroduce that bug!
Section 18.7. Strictures
• Use strict
• Barewords in the wrong place:

my @months = (jan, feb, mar, apr, may, jun,

jul, aug, sep, oct, nov, dec);
• Variables that are probably typos:

my $bamm_bamm = 3; .... $bammbamm;
• Evil symbolic references:

$data{fred} = 'flintstone';

...

$data{fred}{age} = 30;
• You just set $flintstone{age} to 30!
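• With strict in force, the same mistake dies at runtime instead of quietly creating %flintstone
• A small sketch that traps the error so it can be printed:

```perl
use strict;
use warnings;

my %data = ( fred => 'flintstone' );

# Under "strict refs", using a plain string as a hash reference is fatal
my $ok    = eval { $data{fred}{age} = 30; 1 };
my $error = $ok ? '' : $@;

print "strict caught it: $error" if !$ok;
```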
Section 18.8. Warnings
• Use warnings
• “use warnings” catches most beginner mistakes
• Beware “-w” on the command line
• Unless you want to debug someone else’s mistakes when
you use a module
Section 18.9. Correctness
• Never assume that a warning-free compile implies
correctness
• Oh, if life were only that simple!
• You have to be more clever at debugging than you were
at coding
Section 18.10. Overriding Strictures
• Reduce warnings for troublesome spots
• Add “no warnings” in a block:

{

no warnings 'redefine';

...;

}
• The block will narrow down the influence
• Note that this has lexical influence
• Not dynamic influence
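• A sketch of the lexical scoping: the same redefinition is silent inside the block and noisy outside it
• greet is a hypothetical sub; the __WARN__ handler only collects the warnings so they can be counted

```perl
use strict;
use warnings;

my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };    # collect instead of printing

sub greet { return 'hello' }

{
    no warnings 'redefine';
    *greet = sub { return 'howdy' };    # silent: inside the block
}

*greet = sub { return 'hi' };           # warns: outside the block

print "warnings seen: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```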
Section 18.11. The Debugger
• Learn a subset of the debugger:
• Starting and exiting
• Setting breakpoints
• Setting watchpoints
• Step into, out of, around
• Examining variables
• Getting the methods of a class/instance
• Getting help
Section 18.12. Manual Debugging
• Use serialized warnings when debugging “manually”
• Reserve “warn” for “if debugging”
• You shouldn’t be using warn for anything else
• Automatically goes to STDERR
• Or use Log::Log4perl with debug levels
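• A sketch of a serialized warning, assuming Data::Dumper as the serializer and a DEBUG environment variable as the switch
• Both choices, and the dump_for_warn helper, are assumptions for illustration

```perl
use strict;
use warnings;
use Data::Dumper;

# Assumption: a DEBUG environment variable switches the diagnostics on
my $DEBUG = $ENV{DEBUG} ? 1 : 0;

# Serialize a data structure onto one line, suitable for a warn message
sub dump_for_warn {
    my ( $label, $data ) = @_;
    local $Data::Dumper::Indent   = 0;    # keep it on a single line
    local $Data::Dumper::Sortkeys = 1;    # stable key order between runs
    local $Data::Dumper::Terse    = 1;    # no leading "$VAR1 = "
    return "$label: " . Dumper($data) . "\n";
}

my %order = ( id => 42, items => [ 'tea', 'milk' ] );

my $message = dump_for_warn( 'order', \%order );
warn $message if $DEBUG;    # goes to STDERR, only when debugging
```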
Section 18.13. Semi-Automatic Debugging
• Consider “smart comments” rather than warn
• As in, “Smart::Comments”:

use Smart::Comments;

...

for my $result (@some_messy_array) {

### $result

}
• Supports assertions too:

### check: ref $result

### require: $result->foo > 3
• No delivery penalty
• Just don’t include Smart::Comments!
Chapter 19. Miscellanea
• Sure, it’s gotta fit somewhere
• Might as well be at the end
Section 19.1. Revision Control
• Use revision control
• Essential for team programming
• But even good for one person
• Great for sharing updates
• Helpful for “when did it break”
• And more importantly “who broke it”
• Also prevents potential loss from hardware failures and
human failures
Section 19.2. Other Languages
• Integrate non-Perl code with Inline:: modules
• Relatively simple
• Compared to writing raw XS code
• Caches nicely
• Can be shipped pre-expanded with a distro
• Can be installed per-user as needed
Section 19.3. Conguration Files
• Consider Config::Std
• Simple to use:

use Config::Std;

read_config '~/.demorc' => my %config;

$config{Interface}{Disclaimer} = 'Whatever, dude!';
• Allow user changes to be rewritten for the next run:

write_config %config;
• Whitespace and comments are preserved!
• Config::General: all that and XML blocks, heredocs,
includes
Section 19.4. Formats
• Don’t
• Consider Perl6::Form instead
Section 19.5. Ties
• Don’t
• Fairly inefficient
• Can be spooky to the naive
• Every tied operation can be replaced with an explicit
method call
Section 19.6. Cleverness
• Don’t
• Example:

my $max_result =

[$result1=>$result2]->[$result2<=$result1];
• Can you spot the bug?
• Yeah, it’s min, not max
• Better:

use List::Util qw(max);

$max_result = max($result1, $result2);
Section 19.7. Encapsulated Cleverness
• If you must be clever, at least hide it well!
• Worked for the Wizard of Oz for many years
• “Pay no attention to the man behind the curtain!”
• Easy-to-understand interfaces better than scary
comments
• Be clever, and hide it
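• A sketch of cleverness behind a curtain: the map-sort-map trick stays inside a sub with a boring name
• sort_by_length is a hypothetical helper; callers never see how it works

```perl
use strict;
use warnings;

# Hypothetical helper: shortest strings first, ties broken alphabetically.
# The clever part (decorate, sort, undecorate) is hidden in here.
sub sort_by_length {
    my @strings = @_;
    return
        map  { $_->[1] }
        sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
        map  { [ length($_), $_ ] }
        @strings;
}

my @sorted = sort_by_length(qw(banana fig apple kiwi));
print "@sorted\n";    # prints "fig kiwi apple banana"
```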
Section 19.8. Benchmarking
• Don’t optimize until you benchmark
• Get the code right first
• Then make it go faster, if necessary
• Use the tools
• Benchmark
• Devel::DProf
• Devel::SmallProf
• Devel::NYTProf
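• A minimal sketch with the core Benchmark module, comparing two ways to find the largest number
• The two candidate subs and the test data are made up for illustration; the numbers will differ per machine

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(max);

# Made-up test data: 1000 pseudo-scattered integers
my @numbers = map { ( $_ * 7919 ) % 1000 } 1 .. 1000;

# Two candidate implementations of "largest value"
sub max_by_sort { return ( sort { $b <=> $a } @numbers )[0] }
sub max_by_scan { return max(@numbers) }

# Negative count: run each candidate for at least that many CPU seconds
cmpthese(
    -1,
    {
        sort_first => \&max_by_sort,
        list_util  => \&max_by_scan,
    }
);
```

• Check first that both candidates give the same answer; a fast wrong answer is not an optimization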
Section 19.9. Memory
• Measure data before optimizing it
• Use Devel::Size for details:

size(\%hash) # measures overhead

total_size(\%hash) # measures overhead + data
• Overhead is often surprising
Section 19.10. Caching
• Look for opportunities to use caches
• Subroutines that return the same result
• Example: SHA digest of data:

BEGIN {

my %cache;

sub sha1 {

my $input = shift;

return $cache{$input} ||= do {

... expensive calculation here ...

};

}

}
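• A runnable variant of the same idea, assuming Digest::SHA as the expensive calculation
• cached_sha1_hex and the $computations counter are hypothetical; the counter exists only to show the cache working

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

my $computations = 0;    # only here to show the cache working

{
    my %cache;           # private to the sub below

    sub cached_sha1_hex {
        my $input = shift;
        return $cache{$input} ||= do {
            $computations++;
            sha1_hex($input);    # the "expensive" part
        };
    }
}

my $first  = cached_sha1_hex('hello world');
my $second = cached_sha1_hex('hello world');    # served from %cache

print "$first\n";
print "computed $computations time(s)\n";
```

• Note that ||= only caches true values; that is fine for a digest, but not for a sub that can legitimately return 0 or ''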
Section 19.11. Memoization
• Automate subroutine caching via Memoize
• No, that’s not a typo
• No need to rewrite the same cache logic
• Simpler example:

use Memoize;

memoize('sha1');

sub sha1 {

expensive calculation here

}
• Lots of options (defining key, time-based cache)
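• A runnable sketch of the same thing, with a hypothetical slow_square standing in for the expensive sub
• The $calls counter is only there to show that the body runs once

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

# Hypothetical stand-in for an expensive function
sub slow_square {
    my ($n) = @_;
    $calls++;
    return $n * $n;
}

memoize('slow_square');    # replaces slow_square with a caching wrapper

my $first  = slow_square(12);
my $second = slow_square(12);    # answered from the cache

print "$first, $second, body ran $calls time(s)\n";
```

• Memoize keeps separate caches for scalar and list context, so both calls here are made in scalar context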
Section 19.12. Caching for Optimization
• Benchmark any caching
• Cache is always a trade-off
• time vs space
• time to compute vs time to fetch/store cached value
• time to compute vs time to figure out cache is fresh
• Caching isn’t necessarily a win
Section 19.13. Proling
• Don’t optimize apps: profile them
• Find the hotspots that are slow, and fix them
• Use Devel::DProf for subroutine-level visibility
• Use Devel::SmallProf for statement-level
Section 19.14. Enbugging
• Carefully preserve semantics when refactoring
• The point of refactoring is not to break things
Bonus
• Damian isn’t the only one with good ideas
Use Perl::Critic
• Automatically test your code against various suggestions
made here
• Can be configured to require or ignore guidelines
• Consider it “lint for Perl”
In summary
• Learn from the ones
who have the most scars
• Buy the book...
• Damian could use the
money!

More Related Content

Perl best practices v4

  • 1. Perl Best Practices Randal L. Schwartz merlyn@stonehenge.com version 4.3 on 11 Jun 2017 This document is copyright 2006,2007,2008, 2017 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
  • 2. Introduction • This talk organized according to “Perl Best Practices” • With my own twist along the way • Perhaps I should call this “second Best Practices”?
  • 3. Chapter 2. Code Layout • There’s more than one way to indent it • This isn’t Python • Blessing and a curse • You’re writing for two readers: • The compiler (and it doesn’t care) • Your maintenance programmer • Best to ensure the maintenance programmer doesn’t hunt you down with intent to do serious bodily harm • You are the maintenance programmer three months later
  • 4. Section 2.1. Bracketing • Indent like K&R • Paren pairs at end of opening line, and beginning of close:
 my @items = (
 12,
 19,
 ); • Similar for control flow:
 if (condition) {
 true branch;
 } else {
 false branch;
 }
  • 5. Section 2.2. Keywords • Put a space between keyword and following punctuation:
 if (...) { ... } else { ... }
 while (...) { ... } • Not “if(...)” • It’s easier to pick out the keyword from the rest of the code that way • Keywords are important!
  • 6. Section 2.3. Subroutines andVariables • Keep subroutine names cuddled with args:
 foo(3, 5) # not foo (3, 5) • Keep variable indexing cuddled with variables:
 $x[3] # not $x [3] • If you didn’t know you could do that, forget that you can!
  • 7. Section 2.4. Builtins • Avoid unnecessary parens for built-ins • Good:
 print $x; • Bad:
 print($x); • But don’t avoid the necessary parens:
 warn(“something is wrong”), next if $x > 3; • Without the parens, that would have executed “next” too early
  • 8. Section 2.5. Keys and Indices • Separate complex keys from enclosing brackets:
 $x[ complex() * expression() ] • But don’t use this for trivial keys:
 $x[3] • The extra whitespace helps the reader nd the brackets
  • 9. Section 2.6. Operators • Add whitespace around binary operators if it helps to see the operands:
 print $one_value + $two_value[3];
 print 3+4; # wouldn’t help much here • Don’t add whitespace after a prex unary operator:
 $x = !$y; # no space after !
  • 10. Section 2.7. Semicolons • Put a semicolon after every statement • Yeah, even the ones at the end of a block: if ($condition) { state1; state2; state3; # even here } • Very likely, this code will be edited • Occasionally, a missing semicolon is not a syntax error • Exception: blocks on a single line: if ($foo) { this } else { that }
  • 11. Section 2.8. Commas • Include comma after every list element:
 my @days = (
 ‘Sunday’,
 ‘Monday’,
 ); • Easier to add new items • Don’t need two different behaviors • Easier to swap items
  • 12. Section 2.9. Line Lengths • Damian says “use 78 character lines” • I’d say keep it even shorter • Long lines are hard to read • Long lines might wrap if sent in email • Long lines are harder to copypasta for reuse
  • 13. Section 2.10. Indentation • Damian says “use 4-column indentation levels” • I use Emacs cperl-mode, and use whatever it does • Yeah, there’s probably some conguration to adjust it • Better still, write your code where the indentation is clear
  • 14. Section 2.11. Tabs • Indent with spaces, not tabs • Tabs expand to different widths per user • If you must use tabs, ensure all tabs are at the beginning of line • You can generally tell your text editor to insert spaces • For indentation • When you hit the tab key • And you should do that!
  • 15. Section 2.12. Blocks • Damian says “Never place two statements on the same line” • Except when it’s handy • Statements are a logical chunk • Whitespace break helps people • Also helps cut-n-paste • Also keeps the lines from getting too long (see earlier)
  • 16. Section 2.13. Chunking • Code in commented paragraphs • Whitespace between chunks is helpful • Chunk according to logical steps • Interpreting @_ separated from the next step in a subroutine • Add leading comments in front of the paragraph for a logical-step description • Don’t use comments for user docs, use POD
  • 17. Section 2.14. Elses • Damian says “Don’t cuddle an else” • Cuddled:
 } else { • Uncuddled:
 }
 else { • I disagree. I like cuddled elses • They save a line • I recognize “} else {“ as “switching from true to false” • Think of it as the “batwing else”
  • 18. Section 2.15. Vertical Alignment • Align corresponding items vertically • (Hard to show in a variable-pitch font) • Something like:
 $foo = $this + $that;
 $bartholomew = $the + $other; • Yeah, that’s pretty • For me, it’d be more time dding in the editor rather than coding • Again, your choice is yours
  • 19. Section 2.16. Breaking Long Lines • Break long expressions before an operator:
 print $offset
 + $rate * $time
 + $fudge_factor
 ; • The leading operator is “out of place” enough • Leads the reader to know this is a continuation • Allows for a nice comment on each line too • Lineup should be with the piece in the rst line • Again, this seems like a lot of work in an editor • Emacs cperl-mode can line up on an open paren
  • 20. Section 2.17. Non-Terminal Expressions • If it simplies things, name a subexpression • Instead of:
 my $result = $foo + (very complicated thing) + $bar; • Use
 my $meaningful_name = very complicated thing;
 my $result = $foo + $meaningful_name + $bar; • Don’t use an meaningless name • This also helps in debugging • Set a breakpoint after the intermediate computation • Put a watchpoint on that value • There’s no winner in the “eliminate variables” contest
  • 21. Section 2.18. Breaking by Precedence • Break the expression at lowest precedence • To break up: $offset + $rate * $time • Good:
 $offset
 + $rate * $time • Bad:
 $offset + $rate
 * $time • Keep same level of precedence at same visual level
  • 22. Section 2.19. Assignments • Break long assignments before the operator • To x:
 $new_offset = $offset + $rate * $time; • Bad:
 $new_offset = $offset
 + $rate * $time; • Good:
 $new_offset
 = $offset
 + $rate * $time; • Often I put the assignment on the previous line • Damian would disagree with me on that
  • 23. Section 2.20. Ternaries • Use ?: in columns
 my $result
 = $test1 ? $test1_true
 : $test2 ? $test2_true
 : $test3 ? $test3_true
 : $else_value; • Of course, Damian also lines up the question marks • Conclusion: Damian must have a lot of time on his hands
  • 24. Section 2.21. Lists • Parenthesize long lists • Helps to show beginning and ending of related things • Also tells Emacs how to group and indent it:
 print (
 $one,
 $two,
 $three,
 ); • Nice setting: (cperl-indent-parens-as-block t)
  • 25. Section 2.22. Automated Layout • Use a consistent layout • Pick a style, and stick with it • Use perltidy to enforce it • But don’t waste time reformatting everything • Life’s too short to get into ghts on this
  • 26. Chapter 3. Naming Conventions • Syntactic consistency • Related things look similar • Semantic consistency • Names of things reflect their purpose
  • 27. Section 3.1. Identiers • Use grammar-based identiers • Packages: Noun ::Adjective ::Adjective • Disk::DVD::Rewritable • Variables: adjective_adjective_noun • $estimated_net_worth • Lookup vars: adjective_noun_preposition • %blocks_of, @sales_for • Subs: imperative_adjective_noun[_prep] • get_next_cookie, eat_cake • get_cookie_of_type, eat_cake_using
  • 28. Section 3.2. Booleans • Name booleans after their test • Scalars:
 $has_potential
 $has_been_checked
 $is_internal
 $loading_is_nished • Subroutines:
 is_excessive
 has_child_record • Often starts with is or has
  • 29. Section 3.3. ReferenceVariables • Damian says “mark variables holding a ref to end in _ref”:
 my $items_ref = @items; • I generally do this, but just use “ref”:
 my $itemsref = @items; • Rationale: ordinary scalars and refs both go in to scalar vars • Need some way of distinguishing typing information • use-strict-refs helps here too
  • 30. Section 3.4. Arrays and Hashes • Give arrays plural names:
 @items
 @records
 @input_les • Give hashes singular names:
 %option
 %highest_of
 %total_for • “Mapping” hashes should end in a connector
  • 31. Section 3.5. Underscores • Use underscores to separate names • $total_paid instead of $totalpaid • Helps to distinguish items • What was “remembersthelens.com”? • Not “remembers the lens” as I thought • It’s “remember st helens”! • Not CamelCase (except for some package names)
  • 32. Section 3.6. Capitalization • The more capitalized, the more global it is • All lowercase for local variables:
 $item @queries %result_of • Mixed case for package/class names:
 $File::Finder • Uppercase for constants:
 $MODE $HOURS_PER_DAY • Exception: uppercase acronyms and proper names:
 CGI::Prototype
  • 33. Section 3.7. Abbreviations • Abbreviate identiers by prex • $desc rather than $dscn • $len rather than $lngth • But $ctrl rather than $con, since it’s familiar • $msg instead of $mess • $tty instead of $ter
  • 34. Section 3.8. Ambiguous Abbreviations • Don’t abbreviate beyond meaning • Is $term_val • TerminationValid • TerminalValue • Is $temp • Temperature • Temporary • Context and comments are useful • con & com r u
  • 35. Section 3.9. Ambiguous Names • Avoid ambiguous names • last (nal or previous) • set (adjust or collection) • left (direction, what remains) • right (direction, correct, entitlement) • no (negative,“number”) • record (verb or noun) • second (time or position) • close (nearby or shut) • use (active usage, or category of function)
  • 36. Section 3.10. Utility Subroutines • Use underscore prex as “internal use only” • Obviously, this won’t stop anyone determined to do it • But it’s a clue that it’s maybe not a good idea • External interface:
 sub you_call_this {
 ... _internal_tweaking(@foo) ...;
 }
 sub _internal_tweaking { ... } • Historical note: broken in Perl4 • Made the subroutine a member of ALL packages
  • 37. Chapter 4. Values and Expressions • How literals look • How expressions operate
  • 38. Section 4.1. String Delimiters • Damian says “Use double quotes when you’re interpolating” • Example:“hello $person” • But ‘hello world’ (note single quotes) • In contrast, I always use double quotes • I’m lazy • I often edit code later to add something that needs it • There’s no efciency difference • Perl compiles “foo” and ‘foo’ identically
  • 39. Section 4.2. Empty Strings • Use q{} for the empty string • The rest of this page intentionally left blank • And empty
  • 40. Section 4.3. Single-Character Strings • Likewise, use q{X} for the single character X. • Again, I can disagree with this • But it helps these stand out:
 my $result = join(q{,},
 ‘this’,
 ‘that’,
 ); • That rst comma deserves a bit of special treatment.
  • 41. Section 4.4. Escaped Characters • Use named escapes instead of magical constants • Bad:“x7fx06x18Z” • Good:
 use charnames qw{:full};
 “N{DELETE}N{ACKNOWLEDGE}N{CANCEL}Z” • For some meaning of “good” • Damian claims that is “self documenting” • Yeah, right
  • 42. Section 4.5. Constants • Create constants with the Readonly module • Don’t use “constant”, even though it’s core • use constant PI => 3; • You can’t interpolate these easily: • print “In indiana, pi is @{[PI]}n”; • Instead, get Readonly from the CPAN: • Readonly my $PI => 3; • Now you can interpolate: • print “In indiana, pi is $PIn”; • Or consider Const::Fast or Attribute::Constant • 20 to 30 times faster than Readonly • neilb.org/reviews/constants.html for more
  • 43. Section 4.6. Leading Zeros • Don’t pad decimal numbers with leading 0’s • They become octal • 001, 002, 004, 008 • Only barfs on 008! • Damian says “don’t use leading 0’s ever” • Use oct() for octal values: oct(600) instead of 0600 • Yeah, right • Could confuse some readers • But only those who haven’t read the Llama
  • 44. Section 4.7. Long Numbers • Use underscores in long numbers • Instead of 123456, write 123_456 • What? You didn’t know? • Yeah, underscores in a number are ignored • 12_34_56 is also the same • As is 1________23456 • Works in hex/oct too:
 0xFF_FF_00_FF
  • 45. Section 4.8. Multiline Strings • Layout multiline strings over multiple lines • Sure, you can do this:
 my $message = “You idiot!
 You should have enabled the frobulator!
 ”; • Use this instead:
 my $message
 = “You idiot!n”
 .“You should have enabled the formulator!n”
 ; • This text can be autoindented more nicely
  • 46. Section 4.9. Here Documents • Use a here-doc for more than two lines • Such as:
 my $usage = <<”END_USAGE”;
 1) insert plug
 2) turn on power
 3) wait 60 seconds for warm up
 4) pull plug if sparks appear during warm up
 END_USAGE
  • 47. Section 4.10. Heredoc Indentation • Use a “theredoc” when a heredoc would have been indented:
 use Readonly;
 Readonly my $USAGE = <<”END_USAGE”;
 1) insert plug
 2) turn on power
 END_USAGE • Now you can use it in any indented code
 if ($usage_error) {
 print $USAGE;
 }
  • 48. Section 4.11. Heredoc Terminators • Make heredocs end the same • Suggestion:“END_” followed by the type of thing • END_MESSAGE • END_SQL • END_OF_WORLD_AS_WE_KNOW_IT • All caps to stand out
  • 49. Section 4.12. Heredoc Quoters • Always use single or double quotes around terminator • Makes it clear what kind of interpolation is being used • <<”END_FOO” gets double-quote interpolation • <<’END_FOO’ gets single-quote • Default is actually double-quote • Most people don’t know that
  • 50. Section 4.13. Barewords • Grrr • Don’t use barewords • You can’t use them under strict anyway • Instead of: @months = (Jan, Feb, Mar); • Use: @months = qw(Jan Feb Mar);
  • 51. Section 4.14. Fat Commas • Reserve => for pairs • As in:
 my %score = (
 win => 30,
 tie => 15,
 loss => 10,
 ); • Don’t use it to replace comma • Bad: rename $this => $that • What if $this is “FOO”: use constant FOO => ‘File’ • rename FOO => $that would be a literal “FOO” instead
  • 52. Section 4.15. Thin Commas • Don’t use commas in place of semicolons as sequence • Bad:
 ($a = 3), print “$an”; • Good:
 $a = 3; print “$an”; • If a sequence is desired where an expression must be... • Use a do-block:
 do { $a = 3; print “$an” };
  • 53. Section 4.16. Low-Precedence Operators • Don’t mix and/or/not with && || ! in the same expression • Too much potential confusion: not $nished || $result • Not the same as !$nished || $result • Reserve and/or/not for outer control flow: print $foo if not $this; unlink $foo or die;
  • 54. Section 4.17. Lists • Parenthesize every raw list • Bad: @x = 3, 5, 7; (broken!) • Good: @x = (3, 5, 7);
  • 55. Section 4.18. List Membership • Consider any() from List::MoreUtils:
 if (any { $requested_slot == $_ } @allocated_slots) { ... } • Stops checking when a true value has been seen • However, for “eq” test, use a predened hash:
 my @ACTIONS = qw(open close read write);
 my %ACTIONS = map { $_ => 1 } @ACTIONS;
 ....
 if ($ACTIONS{$rst_word}) { ... } # seen an action
  • 56. Chapter 5. Variables • A program without variables would be rather useless • Names are important
  • 57. Section 5.1. LexicalVariables • Use lexicals, not package variables • Lexical access is guaranteed to be in the same le • Unless some code passes around a reference to it • Package variables can be accessed anywhere • Including well-intentioned but broken code
  • 58. Section 5.2. PackageVariables • Don’t use package variables • Except when required by law • Example: @ISA • Example: @EXPORT, @EXPORT_OK, etc • Otherwise, use lexicals
  • 59. Section 5.3. Localization • Always localize a temporarily modied package var • Example:
 {
 local $BYPASS_AUDIT = 1;
 delete_any_traces(@items);
 } • Any exit from the block restores the global • You can’t always predict all exits from the block
  • 60. Section 5.4. Initialization • Initialize any localized var • Without that, you get undef! • Example:
 local $YAML::Indent = $YAML::Indent;
 if ($local_indent) {
 $YAML::Indent = $local_indent;
 } • Weird rst assignment creates local-but-same value
  • 61. Section 5.5. PunctuationVariables • Damian says “use English” • I say “no - use comments” • How is $OUTPUT_FIELD_SEPARATOR better than $, • given that you probably won’t use it anyway • Use it, but comment it:
 {
 local $, = “,“; # set output separator
 ...
 }
  • 62. Section 5.6. Localizing Punctuation Variables • Always localize the punctuation variables • Example: slurping:
 my $le = do { local $/; <HANDLE> }; • Important to restore value at end of block • Items called within the block still see new value
  • 63. Section 5.7. MatchVariables • Don’t “use English” on $& or $` or $’ • Always “use English qw(-no_match_vars)” • Otherwise, your program slows down • Consider Regexp::MatchContext • L-value PREMATCH(), MATCH(), POSTMATCH()
  • 64. Section 5.8. Dollar-Underscore • Localize $_ if you’re using it in a subroutine • Example:
 sub foo {
 ...
 local $_ = “something”;
 s/foo/bar/;
 } • Now the replacement doesn’t mess up the outer $_
  • 65. Section 5.9. Array Indices • Use negative indices where possible • Negative values count from the end • $some_array[-1] same as $some_array[$#some_array] • $foo[-2] same as $foo[$#foo-1] • $foo[-3] same as $foo[$#foo-2] • Sadly, can’t use in range • $foo[1..$#foo] is all-but-rst • Cannot replace with $foo[1..-1]
  • 66. Section 5.10. Slicing • Use slices when retrieving and storing related hash/array items • Example:
 @frames[-1, -2, -3]
 = @active{‘top’,‘prev’,‘backup’}; • Note that both array slice and hash slice use @ sigil
  • 67. Section 5.11. Slice Layout • Line up your slicing • Example from previous slide:
 @frames[ -1, -2, -3]
 = @active{‘top’,‘prev’,‘backup’}; • Yeah, far more typing than I have time for • Damian must have lots of spare time
  • 68. Section 5.12. Slice Factoring • Match up keys and values for those last examples • For example:
 Readonly my %MAPPING = (
 top => -1,
 prev => -2,
 backup => -3,
 ); • Now use keys/values with that for the assignment:
 @frames[values %MAPPING]
 = @active{keys %MAPPING}; • Doesn’t matter that the result is unsorted. It Just Works.
  • 69. Chapter 6. Control Structures • More than one way to control it • Some ways better than others
  • 70. Section 6.1. If Blocks • Use block if, not postx if. • Bad:
 $sum += $measurement if dened $measurement; • Good:
 if (dened $measurement) {
 $sum += $measurement;
 }
  • 71. Section 6.2. Postx Selectors • So when can we use postx if? • For flow of control! • last FOO if $some_condition; • die “horribly” unless $things_worked_out; • next, last, redo, return, goto, die, croak, throw
  • 72. Section 6.3. Other Postx Modiers • Don’t use postx unless/for/while/until • Says Damian • If the controlled expression is simple enough, go ahead • Good:
 print “ho ho $_” for @some_list; • Bad:
 dened $_ and print “ho ho $_”
 for @some_list; • Replace that with:
 for (@some_list) {
 print “ho ho $_” if dened $_;
 }
  • 73. Section 6.4. Negative Control Statements • Don’t use unless or until at all • Replace unless with if-not • Replace until with while-not • Says Damian • My exclusion: Don’t use “unless .. else” • unless(not $foo and $bar) { ... } else { ... } • Arrrgh! • And remember your DeMorgan’s laws • Replace not(this or that) with not this and not that • Replace not(this and that) with not this or not that • Often, that will simplify things
  • 74. Section 6.5. C-Style Loops • Avoid C-style for loops • Says Damian • I say they’re ok for this: for (my $start = 0; $start <= 100; $start += 3) { ... }
  • 75. Section 6.6. Unnecessary Subscripting • Avoid subscripting arrays and hashes in a loop • Compare:
 for my $n (0..$#items) { print $items[$n] } • With:
 for my $item (@items) { print $item } • And:
 for my $key (keys %mapper) { print $mapper{$key} } • With:
 for my $value (values %mapper) { print $value } • Can’t replace like this if you need the index or key
  • 76. Section 6.7. Necessary Subscripting • Use Data::Alias or Lexical::Alias • Don’t subscript more than once in a loop:
 for my $k (sort keys %names) {
 if (is_bad($names{$k})) {
 print “$names{$k} is bad”; ... • Instead, alias the value:
 for my $k (sort keys %names) {
 alias my $name = $names{$k}; • The $name is read/write! • Requires Data::Alias or Lexical::Alias
  • 77. Section 6.8. IteratorVariables • Always name your foreach loop variable • Unless $_ is more natural • for my $item (@items) { ... } • Example of $_: s/foo/bar/ for @items;
  • 78. Section 6.9. Non-Lexical Loop Iterators • Always use “my” with foreach • Bad:
 for $foo (@bar) { ... } • Good:
 for my $foo (@bar) { ... }
  • 79. Section 6.10. List Generation • Use map instead of foreach to transform lists • Bad:
 my @output;
 for (@input) {
 push @output, some_func($_);
 } • Good:
 my @output = map some_func($_), @input; • Concise, faster, easier to read at a glance • Can also stack them easier
  • 80. Section 6.11. List Selections • Use grep and rst to search a list • Bad:
 my @output;
 for (@input) {
 push @output, $_ if some_func($_);
 } • Good:
 my @output = grep some_func($_), @input; • But rst:
 use List::Util qw(rst);
 my $item = rst { $_ > 30 } @input;
  • 81. Section 6.12. List Transformation • Transform list in place with foreach, not map • Bad:
 @items = map { make_bigger($_) } @items; • Good:
 $_ = make_bigger($_) for @items; • Not completely equivalent • If make_bigger in a list context returns varying items • But generally what you’re intending
  • 82. Section 6.13. Complex Mappings • If the map block is complex, make it a subroutine • Keeps the code easier to read • Names the code for easier recognition • Permits testing the subroutine separately • Allows reusing the same code for more than one map
  • 83. Section 6.14. List Processing Side Effects • Modifying $_ in map or grep alters the input! • Don’t do that • Instead, make a copy:
 my @result = map {
 my $copy = $_;
 $copy =~ s/foo/bar/g;
 $copy;
 } @input; • Note that the last expression evaluated is the result • Don’t use return!
  • 84. Section 6.15. Multipart Selections • Avoid if-elsif-elsif-else if possible • But what to replace it with? • Many choices follow
  • 85. Section 6.16. Value Switches • Use table lookup instead of cascaded equality tests • Rather than:
 if ($n eq ‘open’) { $op = ‘o’ }
 elsif ($n eq ‘close’ } { $op = ‘c’ }
 elsif ($n eq ‘erase’ } { $op = ‘e’ } • Use:
 my %OPS = (open => ‘o’, close => ‘c’, erase => ‘e’);
 ...
 if (my $op = $OPS{$n}) { ... } else { ... } • Table should be initialized once
  • 86. Section 6.17. Tabular Ternaries • Sometimes, cascaded ?: are the best • Test each condition, return appropriate value • Example: my $result = condition1 ? result1 : condition2 ? result2 : condition3 ? result3 : fallbackvalue ;
  • 87. Section 6.18. do-while Loops • Don’t. • Do-while loops do not respect last/next/redo • If you need test-last, use a naked block:
 {
 ...
 ...
 redo if SOME CONDITION;
 } • last/next/redo works with this block nicely
  • 88. Section 6.19. Linear Coding • Reject as many conditions as early as possible • Isolate disqualifying conditions early in the loop • Use “next” with separate conditions • Example: while (<....>) { next unless /S/; # skip blank lines next if /^#/; # skip comments chomp; next if length > 20; # skip long lines ... }
  • 89. Section 6.20. Distributed Control • Don’t throw everything into the loop control • “Every loop should have a single exit” is bogus • Exit early, exit often, as shown by previous item
  • 90. Section 6.21. Redoing • Use redo with foreach to process the same item • Better than a while loop where the item count may or may not be incremented • No, i didn’t really understand the point of this, either
  • 91. Section 6.22. Loop Labels • Label loops used with last/next/redo • last/next/redo provide a limited “goto” • But it’s still sometimes confusing • Always label your loops for these: LINE: while (<>) { ... next LINE if $cond; ... } • The labels should be the noun of the thing being processed • Makes “last LINE” read like English!
  • 92. Chapter 7. Documentation • We don’t need any! • Perl is self-documenting! • Wrong
  • 93. Section 7.1. Types of Documentation • Be clear about the audience • Is this for users? or maintainers? • User docs should go into POD • Maintainer docs can be in comments • Might also be good to have those in separate PODs
  • 94. Section 7.2. Boilerplates • Create POD templates • Understand the standard POD • Enhance it with local standards • Don’t keep retyping your name and email • Consider Module::Starter
  • 95. Section 7.3. Extended Boilerplates • Add customized POD headers: • Examples • Frequently Asked Questions • Common Usage Mistakes • See Also • Disclaimer of Warranty • Acknowledgements
  • 96. Section 7.4. Location • Use POD for user docs • Keep the user docs near the code • Helps ensure the user docs are in sync
  • 97. Section 7.5. Contiguity • Keep the POD together in the source file • Says Damian • I like the style of “little code, little pod” • Bit tricky though, since you have to get the right =cut:
 =item FOO

 text about FOO

 =cut

 sub FOO { ... }
  • 98. Section 7.6. Position • Place POD at the end of the file • Says Damian • Nice though, because you can add __END__ ahead • This is true of the standard h2xs templates • Hard to do if it’s intermingled though • My preference: • mainline code • top of POD page • subroutines intermingled with POD • bottom of POD page • “1;” __END__ and <DATA> (if needed)
  • 99. Section 7.7. Technical Documentation • Organize the tech docs • Make each section clear as to audience and purpose • Don’t make casual user accidentally read tech docs • Scares them off from using your code, perhaps
  • 100. Section 7.8. Comments • Use block comments for major comments:
 #########################
 # Called with: $foo (int), $bar (arrayref)
 # Returns: $item (in scalar context), @items (in list) • Teach your editor a template for these
  • 101. Section 7.9. Algorithmic Documentation • Use full-line comments to explain the algorithm • Prefix any “chunk” of code • Don’t exceed a single line • If longer, refactor the code to a subroutine • Put the long explanation as a block comment
  • 102. Section 7.10. Elucidating Documentation • Use end-of-line comments to point out weird stuff • Obviously, definition of “weird” might vary • Example:
 local $/ = "\0"; # NUL, not newline
 chomp; # using modified $/ here
  • 103. Section 7.11. Defensive Documentation • Comment anything that has puzzled or tricked you • Example:
 @options = map +{ $_ => 1 }, @input; # hash not block! • Think of the reader • Think of yourself three months from now • Or at 3am tomorrow
  • 104. Section 7.12. Indicative Documentation • If you keep adding comments for “tricky” stuff... • ... maybe you should just change your style • That last example can be rewritten:
 @options = map { { $_ => 1 } } @input; • Of course, some might consider nested curlies odd
  • 105. Section 7.13. Discursive Documentation • Use “invisible POD” for longer technical notes • Start with =for KEYWORD • Include paragraph with no embedded blank lines • End with blank line and =cut • As in: =for Clarity: Every time we reboot, this code gets called. Even if it wasn’t our fault. I mean really! Yeah, that’s our story and we’re sticking to it. =cut
  • 106. Section 7.14. Proofreading • Check spelling, syntax, sanity • Include both user and internal docs • Better yet, get someone else to help you • Hard to tell your own mistakes with docs • Could make this part of code/peer review
  • 107. Chapter 8. Built-in Functions • Use them! • But use them wisely
  • 108. Section 8.1. Sorting • Don’t recompute sort keys inside sort • Sorting often compares the same item against many others • Recomputing means wasted work • Consider a caching solution • Simple cache • Schwartzian Transform
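The caching idea can be sketched with a Schwartzian Transform. Here `sort_key` is a hypothetical stand-in for an expensive key computation (it only returns the string length), and `$calls` exists just to show how often it runs:

```perl
use strict;
use warnings;

# Hypothetical "expensive" key function; counts its own invocations.
my $calls = 0;
sub sort_key { my ($item) = @_; $calls++; return length $item }

my @words = qw(pear fig banana kiwi apple);

# Schwartzian Transform: decorate, sort on the cached key, undecorate.
my @sorted =
    map  { $_->[1] }                                     # undecorate
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }  # compare cached keys
    map  { [ sort_key($_), $_ ] }                        # compute each key once
    @words;

print "@sorted\n";
print "key computed $calls times for ", scalar(@words), " items\n";
```

The key is computed once per item, no matter how many comparisons the sort performs.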
  • 109. Section 8.2. Reversing Lists • Use reverse to reverse a list • Don’t sort { $b cmp $a } - reverse sort! • Use reverse to make descending ranges: my @countdown = reverse 0..10;
  • 110. Section 8.3. Reversing Scalars • Use scalar reverse to reverse a string • Add the “scalar” keyword to make it clear • Beware of:
 foo(reverse $bar) • This is likely list context • Reversing a single item list is a no-op! • Suggest: foo(scalar reverse $bar)
  • 111. Section 8.4. Fixed-Width Data • Use unpack to extract fixed-width fields • Example:
 my ($x, $y, $z)
 = unpack '@0 A6 @8 A10 @20 A8', $input; • The @ ensures absolute position
  • 112. Section 8.5. Separated Data • Use split() for simple variable-width fields • First arg of string-space for awk-like behavior:
 my @columns = split ' ', $input; • Leading whitespace is ignored • Multiple whitespace is a single delimiter • Trailing whitespace is ignored • Otherwise, leading delimiters are significant:
 my @cols = split /\s+/, " foo bar"; • $cols[0] = empty string • $cols[1] = “foo”
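A small sketch comparing the two split forms on the same input; the sample string is made up:

```perl
use strict;
use warnings;

my $input = "  foo bar  baz ";

# A single-space string as the pattern gives awk-like splitting:
# leading whitespace is skipped, runs of whitespace act as one delimiter.
my @awk = split ' ', $input;

# A regex pattern keeps the empty leading field.
my @re = split /\s+/, $input;

print scalar(@awk), " fields: @awk\n";
print scalar(@re), " fields, first is '", $re[0], "'\n";
```

In both forms trailing empty fields are dropped, so only the leading field differs.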
  • 113. Section 8.6. Variable-Width Data • Text::CSV_XS is your friend • Handles quoting of delimiters and whitespace
  • 114. Section 8.7. String Evaluations • Avoid string eval • Harder to debug • Slower (firing up compiler at runtime) • Might create security holes • Generally unnecessary • Runtime data should not affect program space
  • 115. Section 8.8. Automating Sorts • Sort::Maker (CPAN) is your friend • Can build • Schwartzian Transform • Or-cache (Orcish) • Guttman-Rosler Transform
  • 116. Section 8.9. Substrings • Use 4-arg substr() instead of lvalue substr() • Old way:
 substr($x, 10, 3) = “Hello”; # replace 10..12 with “Hello” • New way (for some ancient meaning of “new”):
 substr($x, 10, 3,“Hello”); # replace, and return old • Slightly faster
  • 117. Section 8.10. Hash Values • Remember that values is an lvalue • Double all values:
 $_ *= 2 for values %somehash; • Same as:
 for (keys %somehash) {
 $somehash{$_} *= 2;
 } • But with less typing!
  • 118. Section 8.11. Globbing • Use glob(), not <...> • Save <...> for filehandle reading • Works the same way:
 <* .*> same as glob '* .*' • Don’t use multiple arguments
  • 119. Section 8.12. Sleeping • Avoid raw select for sub-second sleeps • use Time::HiRes qw(sleep);
 sleep 1.5; • Says Damian • I think “select undef, undef, undef, 1.5” is just fine • Far more portable • If you’re scared of select, hide it behind a subroutine
  • 120. Section 8.13. Mapping and Grepping • Always use block-form of map/grep • Never use the expression form • Too easy to get the “expression” messed up • Works:
 map { join “:”, @$_ } @some_values • Doesn’t work:
 map join “:”, @$_, @some_values; • Block form has no trailing comma after the block • Might need disambiguating semicolon:
 map {; ... other stuff here } @input
  • 121. Section 8.14. Utilities • Use the semi-builtins in Scalar::Util and List::Util • These are core with modern Perl • Scalar::Util has: blessed, refaddr, reftype, readonly, tainted, openhandle, weaken, unweaken, isweak, dualvar, looks_like_number, isdual, isvstring, set_prototype • List::Util has: reduce, any, all, none, notall, first, max, maxstr, min, minstr, product, sum, sum0, pairs, unpairs, pairkeys, pairvalues, pairgrep, pairfirst, pairmap, shuffle, uniq, uniqnum, uniqstr • reduce is cool: • my $factorial = reduce { $a * $b } 1..20; • my $commy = reduce { "$a, $b" } @strings;
  • 122. Chapter 9. Subroutines • Making your program modular • Nice unit of reuse • Giving name to a set of steps • Isolating local variables
  • 123. Section 9.1. Call Syntax • Use parens • Don’t use & • Bad: &foo • Good: foo() • Parens help them stand out from builtins • Lack of ampersand means (rare) prototypes are honored • Still need the ampersand for a reference though: \&foo
  • 124. Section 9.2. Homonyms • Don’t overload the built-in functions • Perl has weird rules about which have precedence • Sometimes your code is picked (lock) • Sometimes, the original is picked (link)
  • 125. Section 9.3. Argument Lists • Unpack @_ into named variables • $_[3] is just ugly • Even compared to the rest of Perl • Unpack your args: • my ($name, $rank, $serial) = @_; • Use “shift” form for more breathing room: • my $name = shift; # “Flintstone, Fred” • my $rank = shift; # 1..10 • my $serial = shift; # 7-digit integer • Modern Perl can also have signatures • Built-in starting in 5.20 (but primitive) • Damian now recommends (his) Method::Signatures
  • 126. Section 9.4. Named Arguments • Use named parameters if more than three • Three positional parameters is enough • For more than three, use a hash: • my %options = %{+shift}; • my $name = $options{name} || die “?”; • Pass them as a hashref: • my_routine({name => “Randal”}) • This catches the odd-number of parameters at the caller, not the callee.
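A minimal sketch of the hashref convention. The routine `make_label` and its `name`, `rank`, and `sep` options are hypothetical, and the `//` operator assumes Perl 5.10 or later:

```perl
use strict;
use warnings;

# Hypothetical routine taking its named arguments as one hashref.
sub make_label {
    my ($arg_ref) = @_;
    my %options = (sep => ": ", %{$arg_ref});   # defaults first, caller overrides
    defined $options{name} or die "name is required";
    return join $options{sep}, $options{name}, $options{rank} // "n/a";
}

print make_label({ name => "Randal", rank => 3 }), "\n";
print make_label({ name => "Fred", sep => " / " }), "\n";
```

Building the hashref at the call site means an odd number of elements is reported where the mistake was made, not inside the routine.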
  • 127. Section 9.5. Missing Arguments • Use definedness rather than boolean to test for existence • Bad: if (my $directory = shift) { ... } • What if $directory is “0”? Legal! • Better: if (defined(my $directory = shift)) { ... } • Or: if (@_) { my $directory = shift; ... } • Automatically processed for item-at-a-time built-ins:
 while (my $f = readdir $d) { … }
  • 128. Section 9.6. Default Argument Values • Set up default arguments early • Avoids temptation to use a quick boolean test later:
 my ($text, $arg_ref) = @_;
 my $thing = exists $arg_ref->{thing}
 ? $arg_ref->{thing} :“default thing”;
 my $other = exists $arg_ref->{other}
 ? $arg_ref->{other} :“default other”; • Or maybe:
 my %args = (thing => ‘default thing’,
 other => ‘default other’, %$arg_ref);
  • 129. Section 9.7. Scalar Return Values • Indicate scalar return with “return scalar” • Ensures that list context doesn’t ruin your day • Example: returning grep for count (true/false)
 sub big10 { return grep { $_ > 10 } @_ } • Works ok when used as boolean:
 if (big10(@input)) { ... } • Or even to get the count:
 my $big_uns = big10(@input); • But breaks weird when used in a list:
 some_other_sub($foo, big10(@input), $bar); • Fixed: sub big10 { return scalar grep { $_ > 10 } @_ }
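The difference shows up as soon as the call sits in an argument list. A small sketch, with the two variants given distinct (hypothetical) names so they can be compared side by side:

```perl
use strict;
use warnings;

sub big10_list   { return grep { $_ > 10 } @_ }          # result depends on context
sub big10_scalar { return scalar grep { $_ > 10 } @_ }   # always a count

my @input = (5, 12, 40, 7);

# In a list, the first version expands to the matching items...
my @args_list   = ("foo", big10_list(@input),   "bar");
# ...while the second always contributes exactly one value, the count.
my @args_scalar = ("foo", big10_scalar(@input), "bar");

print "@args_list\n";
print "@args_scalar\n";
```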
  • 130. Section 9.8. Contextual Return Values • Make list-returning subs intuitive in scalar context • Consider the various choices: • Count of items (like grep/map/arrayname) • Serialized string representation (localtime) • First item (caller, each, select, unpack) • “next” item (glob, readline) • last item (splice, various slices) • undef (sort) • third(!) item (getpwnam) • See http://www.perlmonks.org/index.pl?node_id=347416
  • 131. Section 9.9. Multi-Contextual Return Values • Consider Contextual::Return • Appropriately Damian-ized (regretfully discouraged now)
 use List::Util qw(rst);
 use Contextual::Return;
 sub dened_samples_in {
 return (
 LIST { grep { dened $_} @_ }
 SCALAR { rst { dened $_} @_ }
 );
 } • It does more, much more. So much more: • VOID, BOOL, NUM, STR, REF, FAIL, LAZY, LVALUE…
  • 132. Section 9.10. Prototypes • Don’t • Spooky action at a distance • Not for mortals • Useful only by wizards • sub LIST (;&$) { ... }
  • 133. Section 9.11. Implicit Returns • Always use explicit return • Perl returns the last expression evaluated • But make it explicit: • return $this; • Easier to see that the return value is expected • Protects against someone adding an extra line to the subroutine, breaking the return
  • 134. Section 9.12. Returning Failure • use “return;” for undef/empty list return • Not “return undef;” • Thus it falls out of lists correctly:
 my @result = map { yoursub($_) } @input; • If the caller wants the undef, they can say so:
 my @result = map { scalar yoursub($_) } @input;
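A short sketch of why the bare return matters. The two `half_*` routines are hypothetical and differ only in how they report failure (odd input):

```perl
use strict;
use warnings;

sub half_bare  { my ($n) = @_; return       if $n % 2; return $n / 2 }
sub half_undef { my ($n) = @_; return undef if $n % 2; return $n / 2 }

my @input = (2, 3, 4);

my @bare  = map { half_bare($_) }  @input;   # failures vanish from the list
my @undef = map { half_undef($_) } @input;   # failures leave an undef behind

print scalar(@bare), " vs ", scalar(@undef), "\n";

# A bare return still yields undef in scalar context.
my $result = half_bare(3);
print defined($result) ? "defined" : "undef", "\n";
```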
  • 135. Chapter 10. I/O • It’s off to work we go • If a program didn’t have I/O • Could you still tell it ran?
  • 136. Section 10.1. Filehandles • Don’t use bareword filehandles • Can’t easily pass them around • Can’t easily localize them • Can’t make them “go out of scope”
  • 137. Section 10.2. Indirect Filehandles • Use indirect filehandles: • open my $foo, "<", $someinput or die; • They close automatically • Can be passed to/from subroutines • Can be stored in aggregates • Might need readline() or print {...} though • my $line = readline $inputs[3]; • print { $output_for{$day} } @list;
  • 138. Section 10.3. Localizing Filehandles • If you must use a package filehandle, localize it • local *HANDLE • Beware this also stomps on other package items: • $HANDLE • @HANDLE • %HANDLE • &HANDLE • HANDLE dirhandle • HANDLE format • So, use it sparingly • That’s why indirect handles are better
  • 139. Section 10.4. Opening Cleanly • Use IO::File or 3-arg open • IO::File:
 use IO::File;
 my $in_handle = IO::File->new($name,“<”) or die; • 3-arg open (with indirect handle):
 open my $in_handle,“<”, $name or die; • 3-arg open ensures safety • Even if $name starts with < or > or | or ends with |. • Or whitespace (normally trimmed)
  • 140. Section 10.5. Error Checking • Add error checking or die! • Include the $! as well • open my $handle, "<", $that_file or die "Cannot open $that_file: $!"; • Also check close and print • Close and print can fail • If the file isn’t opened (oops!) • If the final flush causes an I/O error (oops!)
  • 141. Section 10.6. Cleanup • Close filehandles explicitly as soon as possible • Says Damian • Or just let them fall out of scope and use a tight scope • my $line = do { open my $x, "<", "file"; scalar <$x> };
  • 142. Section 10.7. Input Loops • Use while(<>), not foreach(<>) • While - reads one line at a time • Exits at undef (no more lines) • Foreach - reads everything at once • No reason to bring everything into memory • Especially when you can’t reference it!
  • 143. Section 10.8. Line-Based Input • Similar to previous hint • If you can grab it a line at a time, do it • Bad:
 print for grep /ITEM/, <INPUT>; • Good:
 while (<INPUT>) { print if /ITEM/ }
  • 144. Section 10.9. Simple Slurping • Slurp in a do-block • Get the entire file: my $entire_file = do { local $/; <$in> }; • local here automatically provides undef • The localization protects nearby items • Dangerous if you had $/ = undef in normal code • Better/faster/stronger than: join “”, <$in>
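A runnable sketch of the do-block slurp. An in-memory filehandle stands in for a real file here, and the sample text is made up:

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";
open my $in, "<", \$text or die "Cannot open in-memory file: $!";

my $entire_file = do { local $/; <$in> };   # $/ is undef only inside the block
close $in or die "Cannot close: $!";

print length($entire_file), " characters\n";
print "separator restored\n" if defined $/ && $/ eq "\n";
```

Once the do-block ends, `$/` is back to its previous value, so later line-at-a-time reads are unaffected.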
  • 145. Section 10.10. Power Slurping • Slurp a stream with File::Slurp (from the CPAN) • Read an entire file:
 use File::Slurp qw(read_file);
 my $entire_file = read_file $in; • And a few other interesting options
  • 146. Section 10.11. Standard Input • Avoid using STDIN unless you really mean it • It’s not necessarily the terminal (could be redirected) • It’s almost never the files on the command line • It means only “standard input” • Read from ARGV instead:
 my $line = <ARGV>; # explicit form
 my $line = <>; # implicit (preferred) form
  • 147. Section 10.12. Printing to Filehandles • Always put filehandles in braces in any print statement • Says Damian • The rest of us do that when it’s complex:
 print $foo @data; # simple
 print { $hashref->{foo} } @data; # complex • If it’s a bareword or a simple scalar, no need for braces
  • 148. Section 10.13. Simple Prompting • Always prompt for interactive input • Keeps the user from wondering: • Is it running? • Is it waiting for me? • Has it crashed? • Prompt only when interactive though • Don’t prompt if part of a pipeline
  • 149. Section 10.14. Interactivity • Test for interactivity with -t • Simple test:
 if (-t) { ... STDIN is a terminal } • Damian makes it more complex • Suggests IO::Interactive from CPAN
  • 150. Section 10.15. Power Prompting • Use IO::Prompt for prompting (in CPAN) • my $line = prompt “line, please? ”; • Automatically chomps (yeay!) • my $password = prompt “password:”,
 -echo => ‘*’; • my $choice = prompt ‘letter’, -onechar,
 -require => { ‘must be [a-e]’ => qr/[a-e]/ }; • Many more options • Damian followed up with IO::Prompter (even more options)
  • 151. Section 10.16. Progress Indicators • Interactive apps need progress indicators for long stuff • People need to know “working? or hung?” • And “Coffee? Or take the afternoon off?” • Good estimates of completion time are handy • Spooky estimates can backfire • 90% of ticks come within 2 seconds • last 10% takes 5 minutes... oops
  • 152. Section 10.17. Automatic Progress Indicators • Damianized Smart::Comments in CPAN • Example:
 use Smart::Comments;
 for my $path (split /:/, $ENV{PATH}) { ### Checking path... done
 for my $file (glob "$path/*") { ### Checking files... done
 if (-x $file) { ... }
 }
 }
  • 153. Section 10.18. Autoflushing • Avoid using select() to set autoflushing: select((select($fh), $| = 1)[0]); • Yeah, so I came up with that • I must have been, uh... confused • Use IO::Handle (not needed in Perl 5.14 and later) • Then you can call $fh->autoflush(1) • And even *STDOUT->autoflush(1)
  • 154. Chapter 11. References • Can’t get jobs without them • Can’t get to indirect data without them either • Symbolic (soft) references are evil
  • 155. Section 11.1. Dereferencing • Prefer arrows over circumfix dereferencing • Bad: ${$foo{bar}}[3] • Good: $foo{bar}->[3] • Can’t reduce: @{$foo{bar}}[3, 4] • Until Perl 5.20 • $foo{bar}->@[3, 4] • Might require enabling the feature
  • 156. Section 11.2. Braced References • Always use braces for circumfix dereferencing • Bad: @$foo[3, 4] • Good: @{$foo}[3, 4] • Makes the deref item clearer • Needed anyway when deref item is complex • Or flip it around to postfix form
  • 157. Section 11.3. Symbolic References • Don’t • use strict ‘refs’ prevents you anyway • If you think of the symbol table as a hash... • ... you can see how to move it down a level anyway • Bad: $x = “K”; ${$x} = 17; • Good: $x = “K”; $my_hash{$x} = 17;
  • 158. Section 11.4. Cyclic References • Cycles cannot be reference-counted nicely • Break the cycle with weaken • Use weaken for every “up” link • When kids need to know about their parents • use Scalar::Util qw(weaken);
 my $root = {};
 my $leaf1 = { parent => $root };
 weaken $leaf1->{parent};
 my $leaf2 = { parent => $root };
 weaken $leaf2->{parent};
 push @{ $root->{kids} }, $leaf1, $leaf2; • Now we can toss $root nicely
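A small sketch showing what “toss $root nicely” looks like in practice. The tree is the same shape as above, cut down to a single leaf; `isweak` is Scalar::Util's test for a weakened reference:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $leaf;
{
    my $root = { kids => [] };
    $leaf = { parent => $root };
    weaken $leaf->{parent};              # the "up" link does not keep $root alive
    push @{ $root->{kids} }, $leaf;
    print "weak? ", (isweak($leaf->{parent}) ? "yes" : "no"), "\n";
}
# $root has gone out of scope; the weak reference is now undef.
print "parent is ", (defined $leaf->{parent} ? "alive" : "gone"), "\n";
```

Without the `weaken`, root and leaf would hold each other alive and neither would be freed until the program exits.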
  • 159. Chapter 12. Regular Expressions • Some people, when confronted with a problem, think:
 "I know, I'll use regular expressions".
 Now they have two problems.
 —Jamie Zawinski
  • 160. Section 12.1. Extended Formatting • Always use /x • Internal whitespace is ignored • Comments are simple #-to-end-of-line • Hard: /^(.*?)(\w+).*?\2\1$/ • Easy(er):
 /^
 (.*?) (\w+) # some text and an alpha-word
 .*? # anything
 \2 \1 # the word followed by text again
 $/x • Beware: this will break old regex if you’re not careful
  • 161. Section 12.2. Line Boundaries • Always use /m • ^ means • Start of string • Just after any embedded newline • $ means • End of string • Just before any embedded newline • Damian argues that this is closer to what people expect • I say use it when it’s what you want
  • 162. Section 12.3. String Boundaries • Use \A and \z • \A is the old ^ • \z is the old $
  • 163. Section 12.4. End of String • Use \z not \Z or $ • It’s really “end of string” • Not “end of string but maybe before \n” • Security hole: if ($value =~ /^\d+$/) { ... } • Value could be “35\n” • That might ruin a good day • Fix: if ($value =~ /\A\d+\z/) { ... }
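The hole is easy to demonstrate. A minimal sketch, with a made-up value carrying a trailing newline:

```perl
use strict;
use warnings;

my $value = "35\n";   # trailing newline sneaks in, e.g. from unchomped input

my $loose  = $value =~ /^\d+$/   ? 1 : 0;   # $ tolerates a final newline
my $strict = $value =~ /\A\d+\z/ ? 1 : 0;   # \z means the very end

print "loose=$loose strict=$strict\n";
```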
  • 164. Section 12.5. Matching Anything • Always use /s • Turns . into “match anything” • Not “match anything (except the newline)” • This is more often what people want • Says Damian • Use it when you need it
  • 165. Section 12.6. Lazy Flags • If you’re as crazy as Damian • And as lazy as Damian • Consider Regexp::Autoflags • Automatically adds /xms to all regexp • But then notice that the book lies • Because it’s actually Regexp::DefaultFlags in the CPAN
  • 166. Section 12.7. Brace Delimiters • Prefer m{...} to /.../ for multiline regex • Open-close stands out easier • Especially when alone on a line • Heck, use ‘em even for single line regex • Particularly if the regex contains a slash • Balancing is easy:
 m{ abc{3,5} } # “ab” then 3-5 c’s
 m{ abc\{3,5\} } # “abc{3,5}” all literal
  • 167. Section 12.8. Other Delimiters • Don’t use any other delimiters • Unless you’re sure that Damian’s not looking • But seriously... • m#foo# is cute • But annoying • And m,foo, is downright weird
  • 168. Section 12.9. Metacharacters • Prefer singular character classes to escaped metachars • Instead of “\.”, use [.] • Instead of “\ ” (escaped space), use [ ] (space in []) • Advantage: brackets are balanced
  • 169. Section 12.10. Named Characters • Prefer named characters to escaped metacharacters • Example:
 use charnames qw(:full);
 if ($escape_seq =~ m{
 \N{DELETE}
 \N{ACKNOWLEDGE}
 \N{CANCEL} \Z
 }xms) { ... } • Downside: now you have to know these names • Is anyone really gonna type \N{SPACE} ?
  • 170. Section 12.11. Properties • Prefer properties to character classes • Works better with Unicode • Such as: qr{ \p{Uppercase} \p{Alphabetic}* }xms; • “perldoc perlunicode” lists the properties • You can even create your own
  • 171. Section 12.12. Whitespace • Arbitrary whitespace better than constrained whitespace • So \s* matches even “weird” whitespace, where \N{SPACE}* matches only literal spaces
  • 172. Section 12.13. Unconstrained Repetitions • Be careful when matching “as much as possible” • For example:
 “Activation=This=that” =~ /(.*)=(.*)/ • $1 is “Activation=This”! • Consider .*? instead:
 “Activation=This=that” =~ /(.*?)=(.*)/ • Now we have $1 = “Activation”, $2 = “This=that” • Don’t do this blindly:
 “Activation=This=that” =~ /(.*?)=(.*?)/ • Now $2 has nothing!
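The three variants can be run side by side. A small sketch using the same sample string:

```perl
use strict;
use warnings;

my $line = "Activation=This=that";

my @greedy = $line =~ /(.*)=(.*)/;     # first group grabs as much as it can
my @lazy   = $line =~ /(.*?)=(.*)/;    # first group stops at the first "="
my @both   = $line =~ /(.*?)=(.*?)/;   # trailing lazy group matches nothing

print join("|", @greedy), "\n";
print join("|", @lazy), "\n";
print join("|", @both), "\n";
```

A lazy quantifier at the end of a pattern has nothing forcing it to extend, so it settles for the empty string.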
  • 173. Section 12.14. Capturing Parentheses • Use capturing parens only when capturing • Every ( ... ) in a regex is a capture group • This affects $1, etc • But it also affects speed • When you don’t need to capture, don’t! • Use (?: ... ) • Yes, a little more typing • But it’s clear that you don’t need that value
  • 174. Section 12.15. Captured Values • Use $1 only when you’re sure you had a match • Common error:
 $foo =~ /(.*?)=(.*)/;
 push @{$hash{$1}}, $2; • What if the match fails? • Pushes the previous $2 onto the previous $1 • Always use $1 with a conditional • Even if it can never fail: • Add “or die”:
 $foo =~ /(.*?)=(.*)/ or die “Should not happen”; • Then you know something broke somewhere
  • 175. Section 12.16. Capture Variables • Name your captures • There can be quite a distance between • / (\w+) \s+ (\w+) /xms • $1, $2 • Easy to make mistake, or modify and break things • Instead, name your captures when you can: • my ($given, $surname) = /(\w+)\s+(\w+)/; • If this match fails in a scalar context, it returns false
  • 176. Section 12.17. Piecewise Matching • Tokenize input using /gc • Also called “inchworming” • Match string against a series of /\G.../gc constructs • If one succeeds, pos() is updated • Example:
 {
 last if /\G\z/gc; # exit at end of string
 if (/\G(\w+)/gc) { push @tokens, "keyword $1" }
 elsif (/\G(".*?")/gcs) { push @tokens, "quoted $1" }
 elsif (/\G\s+/gc) { ... ignore whitespace ... }
 else { die "Cannot parse: ", /\G(.*)/s } # grab rest
 redo;
 }
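A self-contained variant of the inchworm loop, matching against a named variable instead of `$_`. The input string and token labels are made up:

```perl
use strict;
use warnings;

my $input = q{say "hello world" twice};
my @tokens;
{
    last if $input =~ /\G\z/gc;                               # end of string
    if    ($input =~ /\G(\w+)/gc)     { push @tokens, "word $1" }
    elsif ($input =~ /\G("[^"]*")/gc) { push @tokens, "quoted $1" }
    elsif ($input =~ /\G\s+/gc)       { }                     # skip whitespace
    else  { die "Cannot parse at offset ", pos($input) // 0 }
    redo;                                                     # try the next token
}
print "$_\n" for @tokens;
```

The /c flag is what keeps pos() in place when one alternative fails, so the next `\G` match starts from the same spot.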
  • 177. Section 12.18. Tabular Regexes • Build regex from tables • To replace keys of %FOO with values of %FOO • Construct the matching regex:
 my $regex = join “|”, map quotemeta,
 reverse sort keys %FOO;
 $regex = qr/$regex/; # if used more than once • Do the replacement:
 $string =~ s/($regex)/$FOO{$1}/g; • It works
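A runnable sketch of the table-driven replacement. The contents of `%FOO` and the sample string are made up; the reverse sort puts “catalog” ahead of “cat” so the longer key wins:

```perl
use strict;
use warnings;

# Hypothetical replacement table.
my %FOO = (cat => "dog", catalog => "index", "a.b" => "dot");

# quotemeta keeps the "." in "a.b" literal.
my $alternation = join "|", map { quotemeta($_) } reverse sort keys %FOO;
my $regex = qr/$alternation/;

(my $string = "cat catalog a.b axb") =~ s/($regex)/$FOO{$1}/g;
print "$string\n";
```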
  • 178. Section 12.19. Constructing Regexes • Build complex regex from simpler pieces • Example:
 my $DIGITS = qr{ \d+ (?: [.] \d*)? | [.] \d+ }xms;
 my $SIGN = qr{ [+-] }xms;
 my $EXPONENT = qr{ [Ee] $SIGN? \d+ }xms;
 my $NUMBER = qr{ ( ($SIGN?) ($DIGITS) ($EXPONENT?) ) }xms; • Now you can use /$NUMBER/
  • 179. Section 12.20. Canned Regexes • Use Regexp::Common • Don’t reinvent the common regex • Use the expert coding in Regexp::Common • $number =~ /$RE{num}{real}/ • $balanced_parens =~ /$RE{balanced}/ • $bad_words =~ /$RE{profanity}/ • Module comes with 2,315,992 tests (over 4 minutes to run the tests!)
  • 180. Section 12.21. Alternations • Prefer character classes to single-char alternations • [abc] is far faster than (a|b|c) • As in, 10 times faster • Consider refactoring too • Replace: (a|b|ca|cb|cc) • With: (a|b|c[abc]) • Same job, and again faster
  • 181. Section 12.22. Factoring Alternations • An advancement of previous hint • Slow: /with \s+ foo|with \s+ bar/x • Faster: /with \s+ (foo|bar)/x • Keep refactoring until it hurts • Beware changes to $1, etc • Also beware changes to order of attempted matches
  • 182. Section 12.23. Backtracking • Prevent useless backtracking • Consider (?> ... ) • Once it has matched, it’s backtracked as a unit
  • 183. Section 12.24. String Comparisons • Prefer eq to fixed-pattern regex matches • Don’t say $foo =~ /\ABAR\z/ • Use $foo eq “BAR” • Don’t say $foo =~ /\ABAR\z/i • Use (uc $foo) eq “BAR”
  • 184. Chapter 13. Error Handling • Hopefully, everything works • But when things go bad, you need to know • Sometimes, it’s minor, and just needs repair • Sometimes, it’s major, and needs help • Sometimes, it’s catastrophic
  • 185. Section 13.1. Exceptions • Throw exceptions instead of funny returns • People might forget to test for “undef” • But they have to work pretty hard to ignore an exception • Exceptions can be easily caught with eval {} • But better to use Try::Tiny • Or for more flexibility, TryCatch
  • 186. Section 13.2. Builtin Failures • Use Fatal to turn failures into exceptions • Tired of forgetting “or die” on “open”? • use Fatal qw(:void open);
 ...
 open my $f, "BOGUS FILE"; # dies! • Yes, finally open does the right thing • Damian also suggests Lexical::Failure
  • 187. Section 13.3. Contextual Failure • You can tell Fatal to be context-sensitive • So this still works: • unless (open my $f,“BOGUS”) { ... } • That’s dangerous though • People might use it in a non-void context without checking • Says Damian • I think you can use it with discipline
  • 188. Section 13.4. Systemic Failure • Be careful with system() • Returns 0 (false) if everything is OK • Returns non-zero if something broke • Technically, the return value of wait(): • Exit status * 256 • Plus 128 if core was dumped • Plus signal number that killed it, if any • 0 = “good”, non-zero = “bad” • Damian suggests WIFEXITED:
 use POSIX qw(WIFEXITED);WIFEXITED(system ...) • Or just add “!” in front:
 !system “foo” or die “...”; # no access to $! here!
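A sketch of decoding the value that system() returns. The child here is the running perl itself (`$^X`) asked to exit with a made-up status of 3:

```perl
use strict;
use warnings;

my $status = system($^X, "-e", "exit 3");

if ($status == -1) {
    print "failed to start: $!\n";
}
elsif ($status & 127) {
    # low 7 bits hold the signal number; bit 128 flags a core dump
    printf "killed by signal %d%s\n",
        $status & 127, ($status & 128 ? " (core dumped)" : "");
}
else {
    printf "exited with status %d\n", $status >> 8;   # high byte is the exit status
}
```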
  • 189. Section 13.5. Recoverable Failure • Throw exceptions on all failures • Including recoverable ones • Easy enough to put an eval {} around it • Don’t use undef • Developers don’t always check • Says Damian • I say, trust your developers a bit
  • 190. Section 13.6. Reporting Failure • Carp blames someone else • use Carp qw(croak);
 ... open my $f, $file or croak "$file?"; • Now the caller is blamed • Makes sense if it’s the caller’s fault
  • 191. Section 13.7. Error Messages • Error messages should speak user-based terminology • Nobody knows ‘@foo’ • Better chance of knowing ‘input files’ • Be as detailed as you can • Remember that this is all you’re likely to get in the bug report • And they will file bug reports
  • 192. Section 13.8. Documenting Errors • Document every error message • Explain it in even more detail • Again in the user’s terminology • Nothing more frustrating than a confusing error message • ... that isn’t explained! • Good example:“perldoc perldiag”
  • 193. Section 13.9. OO Exceptions • Throw an object instead of text • The object can contain relevant information • Exception catchers can pull info apart • Exception objects can also have nice stringifications • This helps naive catchers that print “Error: $@”
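A hand-rolled sketch of an exception object with a stringification. The class `My::Error::File` and its fields are hypothetical, written out only to show the mechanics; the `package NAME { ... }` form assumes Perl 5.14 or later:

```perl
use strict;
use warnings;

# Hypothetical exception class, for illustration only.
package My::Error::File {
    use overload '""' => sub { "Cannot open '$_[0]{name}': $_[0]{reason}" },
                 fallback => 1;
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub throw { my $class = shift; die $class->new(@_) }
    sub name  { return $_[0]{name} }
}

my $caught;
eval {
    My::Error::File->throw(name => "input.txt", reason => "no such file");
    1;
} or do {
    $caught = $@;                        # grab $@ right away
};

if (ref $caught && $caught->isa("My::Error::File")) {
    print "file problem with ", $caught->name, "\n";   # structured access
}
print "Error: $caught\n";                # naive catchers still get readable text
```

In real code a module such as Exception::Class would supply the class plumbing; the overload is what lets the object behave like a plain message when interpolated.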
  • 194. Section 13.10. Volatile Error Messages • Relying on catchers to parse text is dangerous • Throwing an object isolates text changes • Additional components can be added later • Likely won’t break existing code
  • 195. Section 13.11. Exception Hierarchies • Use exception objects when exceptions are related • Give them different classes, but common inheritance • Catchers can easily test for categories of exceptions • Or distinguish specific items when it matters
  • 196. Section 13.12. Processing Exceptions • Test the most specific exception first • For example, log file error, before generic file error
  • 197. Section 13.13. Exception Classes • Use Exception::Class for nice exceptions • Exception::Class-based exceptions capture: • caller • current filename, linenumber, package • user data • Stringify nicely for naive displays (customized if needed) • Create hierarchies for classified handling • Consider “Throwable” role for Moo or Moose
  • 198. Section 13.14. Unpacking Exceptions • Grab $@ early in your catcher • Then parse and act on it • One of your parsing steps may alter $@ accidentally • Best to capture it first! • Or consider Try::Tiny, which puts $@ safely into block-scoped $_
  • 199. Chapter 14. Command-Line Processing • The great homecoming • In the beginning, everything was command-line • Perl still works well here • But you should follow some standards and conventions
  • 200. Section 14.1. Command-Line Structure • Enforce a single consistent command-line structure • GNU tools • Need I say more? • Every tool, slightly different • Pick a standard, stick with it • Especially amongst a tool suite • If you use -i for input here, use it there too
  • 201. Section 14.2. Command-Line Conventions • Use consistent APIs for arguments • Require a flag in front of every arg, except filenames • Use - prefix for single-letter flags • Use -- prefix for longer flags • Provide long flag aliases for every single-letter flag • Always allow “-” as a filename • Always use “--” to indicate “end of args”
  • 202. Section 14.3. Meta-options • Standardize your meta options • --usage • --help • --version • --man
  • 203. Section 14.4. In-situ Arguments • If possible, allow same file for both input and output • foo_command -i thisfile -o thisfile • Obviously might require some jiggery • Or IO::InSitu from the CPAN:
 my $in_name = ....;
 my $out_name = ....;
 my ($in, $out) = open_rw($in_name, $out_name);
 while (<$in>) { print $out “$.: $_” }
  • 204. Section 14.5. Command-Line Processing • Standardize on a single approach to command-line processing • Look at Getopt::Long (core) • For a more Damianized approach, Getopt::Declare • Spell out your usage, and the args follow! • Manpage is longer than most program manpages
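A minimal Getopt::Long sketch along the lines of the conventions above: long flags with single-letter aliases, and “--” ending the options. The option names and the sample argument list are hypothetical; a real program would parse `@ARGV`:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Made-up command line standing in for @ARGV.
my @args = ("-i", "in.txt", "--output", "out.txt", "--verbose", "--", "extra");

my %opt = (verbose => 0);
GetOptionsFromArray(
    \@args,
    "input|i=s"  => \$opt{input},     # long name with single-letter alias
    "output|o=s" => \$opt{output},
    "verbose|v"  => \$opt{verbose},
) or die "Bad options\n";

print "input=$opt{input} output=$opt{output} verbose=$opt{verbose}\n";
print "remaining: @args\n";
```

Anything after “--” is left in the array untouched, which is where filenames would normally end up.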
  • 205. Section 14.6. Interface Consistency • Ensure interface, run-time, and docs are consistent • Usage messages should match what the code does • Simple with Getopt::Declare! • Help docs should match what the manpage says • If Getopt::Declare wasn’t enough... • ... consider Getopt::Euclid • Constructs your parser from POD • Then the manpage definitely matches the code! • Again, suitably Damianized • Requires many “aha!” moments to fully understand
  • 206. Section 14.7. Interapplication Consistency • Factor out common CLI items to shared modules • If every tool has -i and -o, don’t keep rewriting that • Getopt::Euclid “pod modules” can provide common code • Also permits readers to learn it once • Instead of having to recognize it each time • “This program implements L<Our::Standard::CLI>.”
  • 207. Chapter 15. Objects • Used by practically every large practical program • Even used by many little programs • “There’s more than one way to implement it” • Hash-based? Array-based? Inside-out? • Using some framework, or hand-coded? • So what’s best? • Here we go...
  • 208. Section 15.1. Using OO • Make OO a choice, not a default • Don’t introduce needless objects • Especially over-complex frameworks • You probably don’t need a different class for every token
  • 209. Section 15.2. Criteria • Use objects if you have the right conditions • Large systems • Obvious large structures • Natural hierarchy (polymorphism and inheritance) • Many different operations on the same data • Same or similar operations on related data • Likely to add new types later • Operator overloading can be used logically • Implementations need to be hidden • System design is already OO • Large numbers of other programmers use your code
  • 210. Section 15.3. Pseudohashes • Don’t • “We’re sorry” • Removed in 5.10 anyway (long time ago!)
  • 211. Section 15.4. Restricted Hashes • Don’t • Not really restricted • For every lock, there’s a means to unlock • Inside-out objects handle most of the needs here
  • 212. Section 15.5. Encapsulation • Don’t bleed your guts • Provide methods for all accessors • If you want to provide hashref access • Provide a hashref overload • Prepare to pay the price for such access • Consider inside-out objects to ensure guts are private
  • 213. Section 15.6. Constructors • Give every constructor the same standard name • And that’s “new”, duh • Says Damian • Or, just use what’s natural • After all, DBI->connect does it. • Or rewrite that as DBI->new({ connect => [...] }) • Not likely any time soon
  • 214. Section 15.7. Cloning • Don’t clone in your constructor • $instance->new should be an error • If you want an “object of the same type as...”, use ref:
 my $similar = (ref $instance)->new(...); • If you want a proper clone, write a clone/copy routine:
 my $clone = $instance->clone; • Note: please stop cargo-culting:
 sub new {
 my $self = shift;
 my $class = ref $self || $self; # bad
 ...
 }
  • 215. Section 15.8. Destructors • Inside out objects need destructors • Or they leak • And be sure to destroy all the parent classes too:
 sub DESTROY {
 my $deadbody = shift;
 ... destroy my attributes here ...
 for (our @ISA) {
 my $can = $_->can("DESTROY");
 $deadbody->$can if $can;
 }
 } • Or use “NEXT” in the CPAN
  • 216. Section 15.9. Methods • Methods are subroutines • Except methods can be same-named as built-ins • They’ll never be confused with subroutines
  • 217. Section 15.10. Accessors • Provide separate read and write accessors • Same-named accessors might have spooky non-behavior:
 $person->name("new name");
 $person->rank("new rank");
 $person->serial_number("new number"); # read only! • Of course, these double-duty routines should abort • But they often don’t • Also, differently named accessors are easier to grep for • Help understand places a value could change
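A minimal sketch of separate read and write accessors, assuming a hypothetical Person class and a get_/set_ naming scheme (neither comes from the slides). The read-only attribute simply has no setter, so an attempted write dies instead of being silently ignored:

```perl
use strict;
use warnings;

package Person;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless {
        name          => $args{name},
        serial_number => $args{serial_number},
    }, $class;
}

# Read accessors
sub get_name          { my ($self) = @_; return $self->{name} }
sub get_serial_number { my ($self) = @_; return $self->{serial_number} }

# Write accessor, provided only where writing is allowed
sub set_name {
    my ($self, $new_name) = @_;
    die "set_name requires a defined name\n" unless defined $new_name;
    $self->{name} = $new_name;
    return $self;
}

package main;

my $person = Person->new(name => 'Fred', serial_number => 42);
$person->set_name('Barney');
print $person->get_name, "\n";

# There is no set_serial_number, so this write fails loudly:
my $write_ok = eval { $person->set_serial_number(99); 1 };
print "write to serial_number rejected\n" unless $write_ok;
```

Searching the code base for "set_name" now finds every place the name can change.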
  • 218. Section 15.11. Lvalue Accessors • Don’t use lvalue accessors • Require returning direct access to a variable • No place to do error checking • Or require returning tied var • Slow • Really slow • Like, slower than just a method call
  • 219. Section 15.12. Indirect Objects • Indirect object syntax is broken • Bad: my $item = new Thing; • Good: my $item = Thing->new; • No chance of mis-parse • Mis-parse has bitten some Very Smart People • If you think you’re smarter than some Very Smart People • ... let us know how that’s working out for you
  • 220. Section 15.13. Class Interfaces • Provide optimal interface, not minimal one • Consider common operations and implement them • As class author, you can optimize internally • Class users have to use your published interface • Worse, they might repeat the code, or build layers • Resulting in more wrappers or repeated code!
  • 221. Section 15.14. Operator Overloading • Overload only for natural-mapping operator • Don’t just pick operators and map them • You’ll get bit by precedence, most likely • Or leave one out, and really confuse users • Don’t make the C++ mistake: “>>” meaning write-to
  • 222. Section 15.15. Coercions • If possible, provide sane coercions: • Boolean (for tests) • Numeric • String • If your object is “logically empty”, then: if ($your_object) { ... } should return false else true • And “$your_object” shouldn’t provide a hex address • Overloaded numerics and strings are also used in sorting
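One way to provide these coercions is the core "overload" pragma. This is a sketch only; the Basket class and its "items" attribute are hypothetical:

```perl
use strict;
use warnings;

package Basket;    # hypothetical class, for illustration only

use overload
    'bool' => sub { my ($self) = @_; return @{ $self->{items} } ? 1 : 0 },
    '""'   => sub { my ($self) = @_; return 'Basket(' . join(', ', @{ $self->{items} }) . ')' },
    '0+'   => sub { my ($self) = @_; return scalar @{ $self->{items} } },
    fallback => 1;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}

package main;

my $empty = Basket->new;
my $full  = Basket->new(qw(apple pear));

print "empty basket is false\n" unless $empty;   # boolean coercion
print "full basket: $full\n";                    # string coercion
print "item count: ", $full + 0, "\n";           # numeric coercion
```

With "fallback => 1", Perl derives operators such as "==" and "eq" from these three conversions, so sorting by number or by string works without further overloads.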
  • 223. Chapter 16. Class Hierarchies • Objects get their power from inheritance • Designing inheritance well takes practice
  • 224. Section 16.1. Inheritance • Don’t manipulate @ISA directly • Prefer “use parent”: use parent qw(This That Other); • Nice side-effect of loading those packages automatically • Or use a framework that establishes this directly • Replaces the heavier-weight “use base”
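A small sketch of "use parent", assuming hypothetical Animal and Dog classes kept in one file. The "-norequire" flag is needed only because the parent is not in its own .pm file here:

```perl
use strict;
use warnings;

package Animal;    # hypothetical base class
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub speak { my ($self) = @_; return ref($self) . ' makes a sound' }

package Dog;       # hypothetical derived class
# -norequire because Animal is defined in this same file. With Animal in
# its own Animal.pm, plain "use parent 'Animal'" also loads the file.
use parent -norequire, 'Animal';
sub speak { my ($self) = @_; return $self->SUPER::speak() . ': Woof' }

package main;

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";
print "Dog isa Animal\n" if $dog->isa('Animal');
```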
  • 225. Section 16.2. Objects • Use inside-out objects • Said Damian back in the day • Avoids attribute collisions • Encapsulates attributes securely • These days, somewhat discouraged • Makes objects a bit too opaque • Still useful for hostile subclassing though
  • 226. Section 16.3. Blessing Objects • Never use one-argument bless • Blessing into “the current package” breaks subclassing:
 package Parent;
 sub new { bless {} }
 package Child;
 use base qw(Parent);
 package main;
 my $kid = Child->new;
 print ref $kid; # Parent?? • That should have been:
 package Parent; sub new { bless {}, shift }
  • 227. Section 16.4. Constructor Arguments • Pass constructor args as a hashref • Gives you key/value pairs easily • Broken hashref caught at the caller, not “new” • “Odd number of elements...” • Derived class constructor can pick out what it wants • Pass the remaining items up to base-class constructor
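A sketch of hashref constructor arguments, assuming hypothetical Widget and Button classes. The derived constructor removes the key it owns and passes the rest up:

```perl
use strict;
use warnings;

package Widget;    # hypothetical base class

sub new {
    my ($class, $arg_ref) = @_;
    my %args = %{ $arg_ref || {} };
    return bless { label => $args{label} }, $class;
}
sub label { my ($self) = @_; return $self->{label} }

package Button;    # hypothetical derived class
use parent -norequire, 'Widget';

sub new {
    my ($class, $arg_ref) = @_;
    my %args   = %{ $arg_ref || {} };          # copy; the caller's hash is untouched
    my $action = delete $args{action};         # pick out what this class wants
    my $self   = $class->SUPER::new(\%args);   # pass the remaining items up
    $self->{action} = $action;
    return $self;
}
sub action { my ($self) = @_; return $self->{action} }

package main;

my $button = Button->new({ label => 'OK', action => 'submit' });
print $button->label, ' / ', $button->action, "\n";
```

If a caller writes a malformed argument list such as { label => 'OK', 'action' }, the "Odd number of elements" warning points at the caller's line, because the hash is built there.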
  • 228. Section 16.5. Base Class Initialization • Distinguish initializers by class name • To avoid collisions:
 my $thing = Child->new(
 child => { this => 3, that => 5 },
 parent => { other => 7 }); • Says Damian • I say it reveals too much of the hierarchy • Gets complex if Parent/Child are refactored • Use logical names for attributes, not class-based names
  • 229. Section 16.6. Construction and Destruction • Separate construction, initialization, and destruction • Multiple inheritance breaks if every class does its own bless • Instead, all classes should separate bless from initialize • Suggested name “BUILD” • Constructor can call base constructor, then all BUILD routines in hierarchy • Similar strategy for DESTROY • Or just use Moose or Moo
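A hand-rolled sketch of separating bless from initialization, for illustration only. The Base and Derived classes and the build_order accessor are hypothetical; Moose, Moo, and Class::Std each provide their own version of this:

```perl
use strict;
use warnings;
use mro;    # core since 5.10; provides mro::get_linear_isa

package Base;    # hypothetical root class: the only place that blesses

sub new {
    my ($class, $arg_ref) = @_;
    my $self = bless {}, $class;

    # Call BUILD in each class that defines its own, base classes first
    for my $pkg (reverse @{ mro::get_linear_isa($class) }) {
        no strict 'refs';
        next unless defined &{"${pkg}::BUILD"};
        my $build = \&{"${pkg}::BUILD"};
        $self->$build($arg_ref);
    }
    return $self;
}

sub BUILD       { my ($self) = @_; push @{ $self->{built} }, 'Base' }
sub build_order { my ($self) = @_; return @{ $self->{built} } }

package Derived;    # hypothetical subclass: initializes, never blesses
use parent -norequire, 'Base';

sub BUILD { my ($self) = @_; push @{ $self->{built} }, 'Derived' }

package main;

my $object = Derived->new({});
print join(' -> ', $object->build_order), "\n";
```

The loop looks in each package's own symbol table rather than using can(), so an inherited BUILD is not called twice.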
  • 230. Section 16.7. Automating Class Hierarchies • Build standard class infrastructure automatically • Getting accessors and privacy correct can take special care • Use inside-out objects for safety and best development speed • Class::Std or Object::InsideOut
  • 231. Section 16.8. Attribute Demolition • Use Class::Std or Object::InsideOut to ensure deallocation of attributes • Calls DESTROY in all the right places
  • 232. Section 16.9. Attribute Building • Initialize and verify attributes automatically • Again, Class::Std or Object::InsideOut to the rescue • Calls BUILD in all the right places • Or attribute hashes can be annotated for the common case
  • 233. Section 16.10. Coercions • Specify coercions as attributed methods • Again, provided with Class::Std/Object::InsideOut • For numeric context:
 sub count : NUMERIFY { return ... } • For string context:
 sub as_string : STRINGIFY { return ... } • For boolean context:
 sub is_true : BOOLIFY { return ... }
  • 234. Section 16.11. Cumulative Methods • Use attributed cumulative methods instead of SUPER • Class::Std/Object::InsideOut provide CUMULATIVE:
 sub items : CUMULATIVE { ... } • When called in a list context, all calls are made • Result is the concatenated separate lists • Scalar context similar • Result is the string concatenation of all calls • No need for “SUPER::” • Works even with multiple inheritance
  • 235. Section 16.12. Autoloading • Don’t use AUTOLOAD • Effectively creates a runtime interface • Not a compile time interface • Must be prepared for • Methods you don’t want to handle • DESTROY • Calling “NEXT” for either of those is tricky
  • 236. Chapter 17. Modules • Reusable collections of subroutines and their data • Unit of exchange for the CPAN • Unit of testability internally
  • 237. Section 17.1. Interfaces • Design the module’s interface first • A bad interface makes a module unusable • Think about how your module will be used • Name things from the user’s perspective • Not from how it’s implemented • Create examples, or even “use cases” for your interface • Keep the examples around for your documentation!
  • 238. Section 17.2. Refactoring • Place original code inline • Place duplicated code in a subroutine • Place duplicated subroutines in a module
  • 239. Section 17.3. Version Numbers • Use 3-part version numbers • But vstrings are broken • Didn’t get them right until 5.8.1 • Deprecating them for 5.10! • Use the “version” module from the CPAN: use version; our $VERSION = qv('1.0.3'); • Also supports “development versions”: use version; our $VERSION = qv('1.5_12');
  • 240. Section 17.4. Version Requirements • Enforce version requirements programmatically • Enforce Perl version:
 use 5.008; # 5.8 or greater, only • Enforce package versions:
 use List::Util 1.13 qw(max); • Use “use only” for more precise control • Get “only” from the CPAN
  • 241. Section 17.5. Exporting • Export carefully • Your default export list can’t change in the future • Might break something if new items added • Where possible, only on request • Flooding the namespace is counterproductive • Especially with common names, like “fail” or “display”
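A sketch of export-on-request with the core Exporter module. The Report::Utils module and its functions are hypothetical, and the BEGIN block only simulates a separate Report/Utils.pm so the example fits in one file:

```perl
use strict;
use warnings;

# In real code this package would live in its own Report/Utils.pm.
BEGIN {
    package Report::Utils;    # hypothetical module
    use Exporter 'import';

    our @EXPORT    = ();                             # nothing by default
    our @EXPORT_OK = qw(format_line format_header);  # only on request

    sub format_line   { my ($label, $value) = @_; return "$label: $value" }
    sub format_header { my ($title) = @_; return "== $title ==" }

    $INC{'Report/Utils.pm'} = __FILE__;   # mark the module as already loaded
}

# The caller names exactly what it wants in its namespace
use Report::Utils qw(format_line);

print format_line('total', 42), "\n";
print Report::Utils::format_header('Summary'), "\n";   # still reachable by full name
```

Because the default export list is empty, adding new functions later cannot collide with names in existing callers.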
  • 242. Section 17.6. Declarative Exporting • Export declaratively • Perl6::Export::Attrs module:
 package Test::Utils; use Perl6::Export::Attrs;
 sub ok :Export(:DEFAULT, :TEST, :PASS) { ... }
 sub pass :Export(:TEST, :PASS) { ... }
 sub fail :Export(:TEST) { ... }
 sub skip :Export() { ... } # can leave () off • Now @EXPORT will contain only “ok” (:DEFAULT) • And @EXPORT_OK will contain all 4 • And the :TEST and :PASS groups will be set up:
 use Test::Utils qw(:PASS); # ok, pass
  • 243. Section 17.7. Interface Variables • Export subroutines, not data • Yes, let’s create a module, but then reintroduce global variables • OK, that’s a bad idea • Variables have only get/set interface • And very little checking (unless you “tie”) • Export subroutines to give more control • And more flexibility for the future
  • 244. Section 17.8. Creating Modules • Build new module frameworks automatically • The classic: h2xs • Modern: • ExtUtils::ModuleMaker • Module::Starter
  • 245. Section 17.9. The Standard Library • Use core modules when possible • See “perldoc perlmodlib” • Lots and lots of things • The core gets bigger over time • See Module::CoreList to know when • Command line “corelist”
  • 246. Section 17.10. CPAN • Use CPAN modules where feasible • Smarter people than you have hacked the code • Or at least had more time than you have • Reuse generally means higher quality code • Also means communities can form • Questions answered • Pool of talent available • Someone else might fix bugs • metacpan.org has the best interface
  • 247. Chapter 18. Testing and Debugging • We all write perfect software! • In our dreams... • Every code has one bug or more • The trick is... where? • Testing and debugging for the win
  • 248. Section 18.1. Test Cases • Write tests first • Turn your examples into tests • Create the tests before coding • Run the tests, which should fail • Code (correctly!) • Now the tests should succeed • Tests are also documentation • So, comment your tests
  • 249. Section 18.2. Modular Testing • Use Test::More • Testing testing testing! • Don’t code your tests ad-hoc • Use the proven Perl framework • “perldoc Test::Tutorial”
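A minimal Test::More sketch. The max_of function is a hypothetical stand-in for the code under test; in practice it would be imported from the module being tested:

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical function under test
sub max_of {
    my @numbers = @_;
    return if !@numbers;              # nothing to compare
    my $max = shift @numbers;
    for my $n (@numbers) {
        $max = $n if $n > $max;
    }
    return $max;
}

is( max_of(3, 9, 4), 9,  'finds the maximum of several values' );
is( max_of(-5),      -5, 'a single value is its own maximum' );
is( max_of(2, 2),    2,  'duplicates are handled' );

# Capture in scalar context first: a bare "return" yields an empty list,
# which would otherwise vanish from the argument list of is()
my $none = max_of();
is( $none, undef, 'empty input gives undef' );
```

Each test carries a description, so the test file doubles as documentation of the intended behavior.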
  • 250. Section 18.3. Test Suites • For local things, create additional Test::* mixins • Use Test::Harness as a basis for additional tests • New harness is being developed • Separates reporting from detecting
  • 251. Section 18.4. Failure • Write test cases to make sure that failures actually fail • This is psychologically hard sometimes • Have to out-trick yourself • Good task for pairing with someone else
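A sketch of testing that a failure really fails. The parse_age function and its error messages are hypothetical:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test: must reject bad input by dying
sub parse_age {
    my ($input) = @_;
    die "age is required\n" unless defined $input;
    die "age must be a non-negative integer\n" unless $input =~ /\A\d+\z/;
    return $input + 0;
}

is( parse_age('30'), 30, 'valid input is accepted' );

# Run the failing case under eval and save the error immediately
my $lived = eval { parse_age('thirty'); 1 };
my $error = $@;

ok( !$lived, 'non-numeric input is rejected' );
like( $error, qr/non-negative integer/, 'and the message says why' );
```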
  • 252. Section 18.5. What to Test • Test the likely and the unlikely • Some ideas: • Minimum and maximum, and near those • Empty strings, multiline strings, control characters • undef and lists of undef, “0 but true” • Inputs that might never be entered • Non-numeric data for numeric parameters • Missing arguments, extra arguments • Using the wrong version of a module • And every bug you’ve ever encountered • The best testers have the most scars
  • 253. Section 18.6. Debugging and Testing • Add new test cases before debugging • A bug report is a test case! • Add it to your test suite • (You do have a test suite, right?) • Then debug. • You’ll know the bug is gone when the test passes • And you’ll never accidentally reintroduce that bug!
  • 254. Section 18.7. Strictures • Use strict • Barewords in the wrong place:
 my @months = (jan, feb, mar, apr, may, jun,
 jul, aug, sep, oct, nov, dec); • Variables that are probably typos:
 my $bamm_bamm = 3; .... $bammbamm; • Evil symbolic references:
 $data{fred} = "flintstone";
 ...
 $data{fred}{age} = 30; • You just set $flintstone{age} to 30!
  • 255. Section 18.8. Warnings • Use warnings • “use warnings” catches most beginner mistakes • Beware “-w” on the command line • Unless you want to debug someone else’s mistakes when you use a module
  • 256. Section 18.9. Correctness • Never assume that a warning-free compile implies correctness • Oh, if life were only that simple! • You have to be more clever at debugging than you were at coding
  • 257. Section 18.10. Overriding Strictures • Reduce warnings for troublesome spots • Add “no warnings” in a block:
 {
 no warnings 'redefine';
 ...;
 } • The block will narrow down the influence • Note that this has lexical influence • Not dynamic influence
  • 258. Section 18.11. The Debugger • Learn a subset of the debugger: • Starting and exiting • Setting breakpoints • Setting watchpoints • Step into, out of, around • Examining variables • Getting the methods of a class/instance • Getting help
  • 259. Section 18.12. Manual Debugging • Use serialized warnings when debugging “manually” • Reserve “warn” for “if debugging” • You shouldn’t be using warn for anything else • Automatically goes to STDERR • Or use Log::Log4perl with debug levels
  • 260. Section 18.13. Semi-Automatic Debugging • Consider “smart comments” rather than warn • As in, “Smart::Comments”:
 use Smart::Comments;
 ...
 for my $result (@some_messy_array) {
 ### $result
 } • Supports assertions too:
 ### check: ref $result
 ### require: $result->foo > 3 • No delivery penalty • Just don’t include Smart::Comments!
  • 261. Chapter 19. Miscellanea • Sure, it’s gotta fit somewhere • Might as well be at the end
  • 262. Section 19.1. Revision Control • Use revision control • Essential for team programming • But even good for one person • Great for sharing updates • Helpful for “when did it break” • And more importantly “who broke it” • Also prevents potential loss from hardware failures and human failures
  • 263. Section 19.2. Other Languages • Integrate non-Perl code with Inline:: modules • Relatively simple • Compared to raw XS codewriting • Caches nicely • Can be shipped pre-expanded with a distro • Can be installed per-user as needed
  • 264. Section 19.3. Configuration Files • Consider Config::Std • Simple to use:
 use Config::Std;
 read_config '~/.demorc' => my %config;
 $config{Interface}{Disclaimer} = 'Whatever, dude!'; • Allow user changes to be rewritten for the next run:
 write_config %config; • Whitespace and comments are preserved! • Config::General: all that and XML blocks, heredocs, includes
  • 265. Section 19.4. Formats • Don’t • Consider Perl6::Form instead
  • 266. Section 19.5. Ties • Don’t • Fairly inefficient • Can be spooky to the naive • Every tied operation can be replaced with an explicit method call
  • 267. Section 19.6. Cleverness • Don’t • Example:
 my $max_result =
 [$result1=>$result2]->[$result2<=$result1]; • Can you spot the bug? • Yeah, it’s min, not max • Better:
 use List::Util qw(max);
 $max_result = max($result1, $result2);
  • 268. Section 19.7. Encapsulated Cleverness • If you must be clever, at least hide it well! • Worked for the Wizard of Oz for many years • “Pay no attention to the man behind the curtain!” • Easy-to-understand interfaces better than scary comments • Be clever, and hide it
  • 269. Section 19.8. Benchmarking • Don’t optimize until you benchmark • Get the code right first • Then make it go faster, if necessary • Use the tools • Benchmark • Devel::DProf • Devel::SmallProf • Devel::NYTProf
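A sketch of the core Benchmark module comparing two ways to find a maximum. The two subroutine names and the random test data are made up for the example:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(max);

# Hypothetical test data: 1000 random integers
my @numbers = map { int rand 1000 } 1 .. 1000;

# Two candidate implementations (names are illustrative)
sub max_by_sort      { return (sort { $b <=> $a } @_)[0] }
sub max_by_list_util { return max(@_) }

# A negative count means "run each for at least that many CPU seconds"
cmpthese(-1, {
    sort_based => sub { my $m = max_by_sort(@numbers) },
    list_util  => sub { my $m = max_by_list_util(@numbers) },
});
```

cmpthese prints a table of rates and relative percentages. The numbers vary by machine, so measure on the hardware that matters.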
  • 270. Section 19.9. Memory • Measure data before optimizing it • Use Devel::Size for details:
 size(\%hash) # measures overhead (Devel::Size takes a reference)
 total_size(\%hash) # measures overhead + data • Overhead is often surprising
  • 271. Section 19.10. Caching • Look for opportunities to use caches • Subroutines that return the same result • Example: SHA digest of data:
 BEGIN {
 my %cache;
 sub sha1 {
 my $input = shift;
 return $cache{$input} ||= do {
 ... expensive calculation here ...
 };
 }
 }
  • 272. Section 19.11. Memoization • Automate subroutine caching via Memoize • No, that’s not a typo • No need to rewrite the same cache logic • Simpler example:
 use Memoize;
 memoize('sha1');
 sub sha1 {
 ... expensive calculation here ...
 } • Lots of options (defining key, time-based cache)
  • 273. Section 19.12. Caching for Optimization • Benchmark any caching • Cache is always a trade-off • time vs space • time to compute vs time to fetch/store cached value • time to compute vs time to figure out cache is fresh • Caching isn’t necessarily a win
  • 274. Section 19.13. Profiling • Don’t optimize apps: profile them • Find the hotspots that are slow, and fix them • Use Devel::DProf for subroutine-level visibilities • Use Devel::SmallProf for statement-level
  • 275. Section 19.14. Enbugging • Carefully preserve semantics when refactoring • The point of refactoring is not to break things
  • 276. Bonus • Damian isn’t the only one with good ideas
  • 277. Use Perl::Critic • Automatically test your code against various suggestions made here • Can be configured to require or ignore guidelines • Consider it “lint for Perl”
  • 278. In summary • Learn from the ones who have the most scars • Buy the book... • Damian could use the money!