Getting started with Perl XS and Inline::C
 


Perl XS is notoriously ugly, but it's also useful. The dirt can be swept under the carpet to a large

    Presentation Transcript

    • “We should forget about the small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” – Donald Knuth
    • “We should forget about the small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” – Donald Knuth, “Structured Programming with go to Statements”
    • Getting Started with XS and Inline::C The eminently palatable approach to Perl XS Dave Oswald daoswald@gmail.com davido@cpan.org
    • What is XS? ● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl. – perldoc perlxs
    • What is XS? ● It's a “complex and awkward intermediary language.” – Simon Cozens (co-author: Extending and Embedding Perl), in Advanced Perl Programming, 2nd Edition.
    • What is XS? ● “Hooking Perl to C using XS requires you to write a shell .pm module to bootstrap an object file that has been compiled from C code, which was in turn generated by xsubpp from a .xs source file containing pseudo-C annotated with an XS interface description. If that sounds horribly complicated, then you have achieved an accurate understanding of the use of xsubpp.” – Damian Conway in Perl Best Practices
    • What is XS? ● “There is a big fat learning curve involved with setting up and using the XS environment.” – Brian “Ingy” Ingerson (Inline POD)
    • What is XS? ● “The Cognitive Load is high, which is counter-indicated for producing cleanly implemented, bug-free code.” – Dave Oswald
    • “...and then he said "btw, do you know C++?" and I do, though I prefer to work in perl, and he said "good, cuz these log files might be terrabytes big, and in that case we're going to need c++"” – reyjyar (http://perlmonks.org/?node=38117)
    • “Tell him you'll write a prototype in Perl, and if that turns out not to be fast enough, you'll recode the critical parts in C, thanks to Perl's sophisticated interlanguage binding and runtime profiling tools.” – Randal Schwartz (http://perlmonks.org/?node=38180)
    • The Binary Search (pure Perl prototype)
        sub binsearch (&$\@) {
            my ( $code, $target, $aref ) = @_;
            my ( $min, $max ) = ( 0, $#{$aref} );
            no strict 'refs';    # Symbolic ref abuse follows.
            while ( $max > $min ) {
                my $mid = int( ( $min + $max ) / 2 );
                local ( ${caller() . '::a'}, ${caller() . '::b'} )
                    = ( $target, $aref->[$mid] );
                if ( $code->( $target, $aref->[$mid] ) > 0 ) {
                    $min = $mid + 1;
                }
                else {
                    $max = $mid;
                }
            }
            local ( ${caller() . '::a'}, ${caller() . '::b'} )
                = ( $target, $aref->[$min] );
            return $min if $code->( $target, $aref->[$min] ) == 0;
            return;    # Not found.
        }
    • The XS rewrite List::BinarySearch::XS
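For comparison with the Perl prototype, here is a minimal standalone sketch of the same search over a plain C array of ints. It is not the code of List::BinarySearch::XS; the names `cmp_fn`, `cmp_int`, and `binsearch_int` are hypothetical, and the comparator callback stands in for the Perl code block.

```c
#include <stddef.h>

/* Hypothetical comparator type: returns <0, 0, or >0, like Perl's <=>. */
typedef int (*cmp_fn)(int target, int element);

/* Three-way integer comparison, the C analogue of { $a <=> $b }. */
int cmp_int(int target, int element) {
    return (target > element) - (target < element);
}

/* Search a sorted array. Returns the index of the first element that
   compares equal to target, or -1 if there is none. */
long binsearch_int(cmp_fn cmp, int target, const int *data, size_t len) {
    size_t min = 0, max;

    if (len == 0)
        return -1;                  /* nothing to search */
    max = len - 1;

    while (max > min) {
        size_t mid = min + (max - min) / 2;
        if (cmp(target, data[mid]) > 0)
            min = mid + 1;          /* target sorts after data[mid] */
        else
            max = mid;              /* target is at or before data[mid] */
    }
    return cmp(target, data[min]) == 0 ? (long)min : -1;
}
```

As in the prototype, the loop narrows to a single candidate and only then tests for equality, so duplicates resolve to the lowest matching index.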
    • A “Shortest Path” Use Inline::C to prototype the code. Paste output (with minor modifications) into an XS framework for release.
    • Ingy got “fed up with writing XS.” – Advanced Perl Programming, 2nd edition.
    • What IS Inline::C
        ● Inline::C was created by Ingy döt Net
        ● Embed C in your Perl application
        ● Inline::C creates XS bindings, making it available to your application.
        ● Inline::C creates XS code, but you mostly don't need to care how.
        ● On first run it compiles the C source code.
        ● Subsequent runs use the previously compiled dynamic library.
        ● Compiled code is cached unless a change is made to the C source.
        ● Modules build on install, so no delay for end user once installed.
    • Inline::C: “Hello World!” ● From the Inline::C-Cookbook (on CPAN), Hello World in a HERE doc.
        use Inline C => <<'END_C';
        void greet() {
            printf("Hello, world\n");
        }
        END_C

        greet();
    • Another simple example... ● Also from the Inline::C-Cookbook, on CPAN
        perl -e 'use Inline C=>q/void greet(){printf("Hello, world\n");}/; greet'
        ● “A concept is only valid in Perl if it can be shown to work on one line.” – Inline::C POD
    • What does that look like in XS?
        C:\Users\Dave\_Inline\build\e_101c> more e_101c.xs
        #include "EXTERN.h"
        #include "perl.h"
        #include "XSUB.h"
        #include "INLINE.h"

        void greet() {
            printf("Hello, world\n");
        }

        MODULE = e_101c  PACKAGE = main

        PROTOTYPES: DISABLE

        void
        greet ()
            PREINIT:
            I32* temp;
            PPCODE:
            temp = PL_markstack_ptr++;
            greet();
            if (PL_markstack_ptr != temp) {
                /* truly void, because dXSARGS not invoked */
                PL_markstack_ptr = temp;
                XSRETURN_EMPTY; /* return empty stack */
            }
            /* must have used dXSARGS; list context implied */
            return; /* assume stack size is correct */
    • The __C__ segment
        ● At the end of a Perl program, add an __END__ or __DATA__ tag, and on a subsequent line add a __C__ tag.
        ● Everything after that will be treated as C code.
        ● An example:
        use strict;
        use warnings;
        use Inline C => 'DATA';    # 'DATA' is actually the default.

        greet();

        __DATA__
        __C__
        void greet() {
            printf("Hello world!\n");
        }
    • Batteries Included: Basic Data Types.
        ● Inline::C already knows how to deal with passing basic C data types as parameters and return values.
        ● See perl/lib/ExtUtils/typemap for the full list.
        ● Let's pass C ints into and out of a function.
        use Inline 'C';

        my ( $apples, $oranges ) = ( 10, 5 );
        my $total_fruit = add( $apples, $oranges );
        print "$apples apples plus $oranges oranges equals "
            . "$total_fruit pieces of fruit.\n";

        __END__
        __C__
        int add( int first, int second ) {
            return first + second;
        }
    • Returning a string is almost as easy (ignoring Unicode for now)
        use Readonly;
        use Inline C => 'DATA';

        Readonly my $string => return_string();
        print $string;

        __DATA__
        __C__
        char* return_string() {
            return "Hello world!\n";
        }
    • It's not going to stay that easy forever
        ● If your data type isn't in the typemap file, it doesn't get converted automatically.
        ● You may create additional typemaps of your own.
        ● If you try to pass a type that isn't auto-converted, you will get a strange error message at compile-time.
        int clength( char const *str ) { … }
        Use of inherited AUTOLOAD for non-method main::clength() is deprecated at simple params2.pl line 11.
        ● This and other cryptic messages will taunt you often.
    • A quick definition of terms
        ● Perl has the following containers:
            – SV = Scalar Value
            – AV = Array Value: Contains Scalars.
            – HV = Hash Value: Contains Scalars.
        ● SV's can contain one (or often more) of the following:
            – IV = Integer Value (or pointer)
            – UV = Unsigned Int
            – NV = double
            – PV = string
            – CV = A coderef (subref)
            – RV = a pointer to another SV (SV*)
          (Source: perlguts)
        ● I'm afraid it's time to read perldoc perlguts.
            – perlguts introduces the components that comprise the Perl API
            – The functions documented in perlguts will be your means of manipulating Perl data in C.
    • The four scenarios ● These four calling scenarios are documented in Inline::C
        ● Simple: Fixed number of params, all types built into the typemap:
            int Foo( int arg1, char* arg2, SV* arg3 )
            – All arguments and single return value are specified in the typemap, so you just write your pretty little C subroutine and life is good. The conversions are automatic.
        ● Return a list or “nothing”. Parameter list fixed.
            void Foo( int arg1, char* arg2, SV* arg3 )
            – Either you really want to return nothing, or you want to build the return value yourself and push it onto “The Stack.” Either way you have to be explicit.
        ● Variable length parameter list, simple return.
            char* Foo( SV* arg1, ... )
            – Pop arguments off of “The Stack.”
        ● Variable length return, variable length param list.
            void Foo( SV* arg1, ... )
            – Void return and unfixed number of args: Combine 2nd and 3rd techniques.
    • The First Scenario ● The first situation passes basic data types (or none), and returns a single basic data type. Parameter list is fixed length.
        use Inline C => 'DATA';

        print add_one( 10 ), "\n";

        __DATA__
        __C__
        int add_one( int arg ) {
            return arg + 1;
        }
    • The Stack
        ● The stack is Perl's internal means of passing parameters to, and returning values from, subroutines.
        ● INLINE.h defines a set of convenient macros used to manipulate the stack. These are sugar for less convenient XS macros.
            Inline_Stack_Vars
            Inline_Stack_Items
            Inline_Stack_Item(i)
            Inline_Stack_Reset
            Inline_Stack_Push(sv)
            Inline_Stack_Done
            Inline_Stack_Return(n)
            Inline_Stack_Void
        ● Examples are better than definitions...
    • The Second Scenario ● Returning a list. (C doesn't do this by nature.)
        __C__
        void Foo( int arg1, char* arg2, SV* arg3 ) {
            int i;
            int max = arg1;  /* assumption: the slide left max unset; here arg1 sets the list length */
            Inline_Stack_Vars;
            Inline_Stack_Reset;
            for (i = 0; i < max; i++)
                Inline_Stack_Push(sv_2mortal(newSViv(i)));  /* mortalize so Perl frees each SV */
            Inline_Stack_Done;
        }
        ● The cookbook and perlguts explain other macros and functions that create and manage new SV's dynamically.
    • The Third Scenario ● Variable length argument list. Returning a single basic type. ● Variable length lists must pass at least one parameter.
        use Inline C => 'DATA';

        my $count = greet( qw/ George John Thomas James James / );
        print "That's the first $count presidents.\n";

        __DATA__
        __C__
        int greet(SV* name1, ...) {
            Inline_Stack_Vars;
            int i;
            for (i = 0; i < Inline_Stack_Items; i++)
                printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
            return i;
        }
    • The Third scenario, part 2 ● Variable length arg list, void function (no return value). ● When manipulating the stack, if there's no return value you must specify that explicitly.
        use Inline C => 'DATA';

        greet( qw/ George John Thomas James James / );

        __DATA__
        __C__
        void greet(SV* name1, ...) {
            Inline_Stack_Vars;
            int i;
            for (i = 0; i < Inline_Stack_Items; i++)
                printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
            Inline_Stack_Void;    /* Explicitly specify no RV */
        }
    • The Fourth Scenario
        ● Variable length argument list, return multiple values.
        ● Use “The Stack” both to read the parameters, and to push the return values.
            – Read items off the stack and optionally modify them with Inline_Stack_Item().
                ● Inline_Stack_Item() is a setter and getter.
            – Call Inline_Stack_Reset if you plan to push to the stack.
                ● Resets stack pointer to the beginning of the stack.
            – Inline_Stack_Push() as needed.
            – Inline_Stack_Done
    • “...no longer a dark, gray bird, ugly and disagreeable to look at, but a graceful and beautiful swan.” – Hans Christian Andersen The Ugly Duckling
    • Well, it's still ugly... It's C... and Perl... and Internals ;)
        ● When coding for Inline::C (and XS, for that matter), you may access Perl's containers: SV's, AV's, HV's, etc.
        ● C's native data types “do less”, but “do it faster”
        ● Perl's containers “do more” but are slower.
        ● Weigh the tradeoffs:
            – It's often easier to write generic algorithms that don't care what Perl's SV's contain, rather than extracting and manipulating the SV's contents.
            – If you're going to implement the “do more” functionality anyway, just use the Perl containers unless you need non-Perl portability.
            – If you're building a structure that will immediately get passed back, build it with Perl's containers.
            – Use basic C data types in tight loops or for non-Perl portability.
    • An aside: Duff's Device: A loop...
        do {                        /* count > 0 assumed */
            *to = *from++;          /* Note that the 'to' pointer is NOT incremented */
        } while (--count > 0);
    • An aside: Duff's Device: A loop...unrolled
        send(to, from, count)
        register short *to, *from;
        register count;
        {
            register n = (count + 7) / 8;
            switch (count % 8) {
            case 0: do { *to = *from++;
            case 7:      *to = *from++;
            case 6:      *to = *from++;
            case 5:      *to = *from++;
            case 4:      *to = *from++;
            case 3:      *to = *from++;
            case 2:      *to = *from++;
            case 1:      *to = *from++;
                    } while (--n > 0);
            }
        }
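The K&R-style listing above writes every value to one fixed output location. As a rough sketch in standard C, here is a memory-to-memory variant of the same unrolling. Note the assumptions: the destination pointer is incremented (unlike the original), and the name `duff_copy` is hypothetical.

```c
#include <stddef.h>

/* Copy `count` shorts from `from` to `to` with Duff's eight-way unrolling.
   Like the original, this assumes count > 0. */
void duff_copy(short *to, const short *from, size_t count) {
    size_t n = (count + 7) / 8;     /* passes through the do-while loop */

    switch (count % 8) {            /* jump into the body to absorb the remainder */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The `switch` handles the `count % 8` leftover copies on the first pass, and every later pass runs all eight assignments, so the loop test executes once per eight elements instead of once per element.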
    • “If your code is too slow, you must make it faster. If no better algorithm is available, you must trim cycles.” – Tom “Duff's Device” Duff comp.lang.c, Aug 29, 1988
    • The Sieve of Eratosthenes
        ● Problem: Find all primes less than or equal to the integer 'n'
        ● One Efficient Method: Sieve of Eratosthenes
            – Uses a sieve to flag outcasts and retain candidates.
            – O(n log log n) time complexity.
            – The sieve may be implemented as a bit vector for memory efficiency if the implementation is computationally efficient, though often it's not. Otherwise, a simple array.
        ● A pure function that lends itself well to computational benchmarking.
        ● We will look at Pure Perl, Inline::C, and Inline::CPP (C++).
    • The Sieve of Eratosthenes (Pure Perl)
        sub pure_perl {
            my $top = ( $_[0] // $Bench::input ) + 1;
            return [] if $top < 2;
            my @primes = (1) x $top;
            my $i_times_j;
            for my $i ( 2 .. sqrt $top ) {
                if ( $primes[$i] ) {
                    for ( my $j = $i; ( $i_times_j = $i * $j ) < $top; $j++ ) {
                        undef $primes[$i_times_j];
                    }
                }
            }
            return [ grep { $primes[$_] } 2 .. $#primes ];
        }
    • The Sieve targeting Inline::C (with XS macros)
        SV* il_c_eratos_primes_av ( int search_to ) {
            AV* av = newAV();
            bool* primes = 0;
            int i;
            if( search_to < 2 )
                return newRV_noinc( (SV*) av );
            Newxz( primes, search_to + 1, bool );
            if( ! primes )
                croak( "Failed to allocate memory.\n" );
            for( i = 2; i * i <= search_to; i++ )
                if( !primes[i] ) {
                    int j;
                    for( j = i; j * i <= search_to; j++ )
                        primes[i*j] = 1;
                }
            av_push( av, newSViv(2) );
            for( i = 3; i <= search_to; i += 2 )
                if( primes[i] == 0 )
                    av_push( av, newSViv( i ) );
            Safefree( primes );
            return newRV_noinc( (SV*) av );
        }
    • The Sieve targeting Inline::CPP
        #include <vector>

        SV* pure_il_cpp() {
            int search_to = SvIV( get_sv( "Bench::input", 0 ) );
            AV* av = newAV();
            if( search_to < 2 )
                return newRV_noinc( (SV*) av );
            std::vector<bool> primes( search_to + 1, 0 );
            for( int i = 2; i * i <= search_to; i++ )
                if( ! primes[i] )
                    for( int k, j = i; (k = i * j) <= search_to; j++ )
                        primes[ k ] = 1;
            av_push( av, newSViv(2) );
            for( int i = 3; i <= search_to; i += 2 )
                if( ! primes[i] )
                    av_push( av, newSViv(i) );
            return newRV_noinc( (SV*) av );
        }
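The Inline::C version above spends one `bool` per candidate, while the C++ version leans on `std::vector<bool>` to pack flags into bits. A plain C bit vector might look like the following sketch. It is standalone (no Perl API), counts primes rather than returning them, and the name `count_primes_bitsieve` is hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* Count the primes <= n using one bit per candidate.
   A set bit marks a composite. Returns -1 if allocation fails. */
long count_primes_bitsieve(unsigned long n) {
    unsigned char *composite;
    unsigned long i, j;
    long count = 0;

    if (n < 2)
        return 0;

    /* calloc zeroes the vector, so every candidate starts out "prime". */
    composite = calloc(n / CHAR_BIT + 1, 1);
    if (!composite)
        return -1;

    for (i = 2; i * i <= n; i++) {
        if (composite[i / CHAR_BIT] & (1u << (i % CHAR_BIT)))
            continue;                               /* i is already flagged */
        for (j = i * i; j <= n; j += i)             /* flag multiples of i */
            composite[j / CHAR_BIT] |= (unsigned char)(1u << (j % CHAR_BIT));
    }

    for (i = 2; i <= n; i++)
        if (!(composite[i / CHAR_BIT] & (1u << (i % CHAR_BIT))))
            count++;

    free(composite);
    return count;
}
```

The sieve costs roughly n/8 bytes instead of n, at the price of a shift and a mask on every access, which is the tradeoff the sieve slide hints at.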
    • Off to the races...
        ● Benchmarking
            – All subs are wrapped in Perl sub-wrappers to bind calling parameters for the benchmarks.
            – Benchmark at 2, and 1,000,000 for 10 seconds each.
        ● The C versions find 3,000,000 primes in less than 1/20th of a second.
        ● C version tested up to 650 million primes (several seconds).
            – Beyond that I ran out of memory using a sieve of ints.
        ● C++ version performs well over a billion primes.
            – A bit vector conserves memory.
    • Benchmark Results
        Comparing:
            il_c      - Inline C.
            il_cpp    - Inline::CPP.
            pure_perl - Pure Perl.
        Comparison time: 10 seconds.

        Input parameter value of 2
                          Rate  pure_perl  pure_il_c  pure_il_cpp
        pure_perl   400035/s         --       -51%         -55%
        pure_il_c   820171/s       105%         --          -7%
        pure_il_cpp 881808/s       120%         8%           --

        Input parameter value of 1000000
                      Rate  pure_perl  il_cpp  pure_il_c
        pure_perl 1.48/s         --    -98%       -98%
        il_cpp    61.2/s       4027%     --       -17%
        il_c      73.6/s       4866%    20%         --
    • XS and Inline::C; not just speed
        ● Unit testing of C code and libraries with Perlish tools.
        ● Linking to external C libraries.
        ● Extending Perl's interface to its internals
            – Scalar::Util, Hash::Util, etc.
        ● Building on the strengths of each language in complex ways.
            – Extending Perl with C/XS
            – Embedding Perl in C/C++
    • An example from Perl Testing: A Developer's Notebook
        BEGIN { chdir 't' if -d 't'; }
        use strict;
        use warnings;
        use Test::More tests => 6;
        use Inline C => Config => LIBS => '-lm', ENABLE => 'AUTOWRAP';
        Inline->import( C => <<END_HEADERS );
        double fmax( double, double );
        double fmin( double, double );
        END_HEADERS

        is( fmax(  1.0,  2.0 ),  2.0, 'fmax() should find maximum of two values' );
        is( fmax( -1.0,  1.0 ),  1.0, '... and should handle one negative' );
        is( fmax( -1.0, -7.0 ), -1.0, '... or two negatives' );
        is( fmin(  9.3,  1.7 ),  1.7, 'fmin() should find minimum of two values' );
        is( fmin(  2.0, -1.0 ), -1.0, '... and should handle one negative' );
        is( fmin( -1.0, -6.0 ), -6.0, '... or two negatives' );
        ● Ian Langworth; Chromatic. Perl Testing: A Developer’s Notebook (Kindle Locations 5471-5474).
    • C library functions
        ● You may use Inline::C to call functions from C libraries
        ● You may need to write Inline::C wrappers, or pure Perl wrappers.
        ● Inline::C offers AUTOWRAP for trivial wrappers.
    • Linking to an external library
        # From the Inline::C-Cookbook POD:
        print get('http://www.axkit.org');

        use Inline C => Config => LIBS => '-lghttp';
        use Inline C => <<'END_OF_C_CODE';

        #include <ghttp.h>

        char *get(SV* uri) {
            SV* buffer;
            ghttp_request* request;                    // A data type from libghttp.
            buffer = NEWSV(0,0);
            request = ghttp_request_new();
            ghttp_set_uri(request, SvPV(uri, PL_na));  // function call from libghttp.
            ghttp_set_header(request, http_hdr_Connection, "close");
            ghttp_prepare(request);                    // and a few others...
            ghttp_process(request);
            sv_catpv(buffer, ghttp_get_body(request));
            ghttp_request_destroy(request);
            return SvPV(buffer, PL_na);
        }

        END_OF_C_CODE
    • AUTOWRAP => 1
        ● Inline::C can automatically generate the function wrappers to link functions in a library with Perl
        ● Also from the Inline::C-Cookbook
        package MyTerm;
        use Inline C => Config =>
            ENABLE => AUTOWRAP =>
            LIBS   => "-lreadline -lncurses -lterminfo -ltermcap ";
        use Inline C => q{ char * readline(char *); };

        package main;
        my $x = MyTerm::readline("xyz: ");
    • Perl's power tools. ● XS can call Perl subroutines by symbol, or subref. ● XS can accept subrefs as params, or return subrefs. (Callbacks, Currying) ● XS functions have access to package globals. ● XS functions have access to the lexical pad. ● XS functions can create lexical blocks (create closures, for example).
    • Practical approaches toward objects
        ● Build in Perl, optimize in C
            – Build all the methods in Perl
            – Those object methods that stand out in profiling may be rewritten in Inline::C.
            – 90% of the benefit of Inline::C with only 10% of the fuss.
        ● Inline::CPP
            – C++ objects become Perl objects.
            – Public data members get accessors and are exposed to Perl.
            – Public methods are exposed to Perl.
            – Private data members and methods aren't exposed to Perl.
            – C++ constructors, destructors, inheritance, and so on.
    • An object with Inline::CPP
        use Inline CPP => <<'END';
        int doodle() { return 42; }
        class Foo {
          public:
            Foo() : data(0) { std::cout << "creating a Foo()" << std::endl; }
            ~Foo() { std::cout << "deleting a Foo()" << std::endl; }
            int get_data() { return data; }
            void set_data( int a ) { data = a; }
          private:
            int data;
        };
        END
    • An Object with C++ (How Perl sees it)
        sub main::doodle { return 42; }

        package main::Foo;
        sub new      { print "creating a Foo()\n"; bless {}, shift }
        sub DESTROY  { print "deleting a Foo()\n" }
        sub get_data { my $o = shift; $o->{data} }
        sub set_data { my $o = shift; $o->{data} = shift }
    • Distribution
        ● An Inline::C dependency
            – C code gets compiled at installation time, so end user never has a “long wait” at startup.
            – Math::Prime::FastSieve is a proof-of-concept here.
        ● InlineX::C2XS
            – Converts Inline::C code to an XS module, with no Inline::C dependency.
        ● Or you can copy the XS output from Inline::C and paste it into an XS module framework.
            – List::BinarySearch::XS is an example of this approach.
    • The making of My::Module with an Inline::C dependency.
        ● h2xs -PAXn My::Module
        ● Modify the Module.pm file:
        package My::Module;
        $VERSION = '1.23';
        use base 'Exporter';
        @EXPORT_OK = qw(cfoo cbar);
        use strict;
        use Inline C => 'DATA', VERSION => '1.23', NAME => 'My::Module';
        1;
        __DATA__
        __C__
        // C code here...
        ● The NAME and VERSION parameters are required; NAME must match the package name, and VERSION must match $VERSION.
    • The making of My::Module (almost done) ● In Makefile.PL: ● Change: use ExtUtils::MakeMaker ● To: use Inline::MakeMaker ● perl Makefile.PL ● make dist ● Remember: The C code is compiled on install; user never sees a delay during startup or runtime.
    • The Inline and XS communities ● inline@lists.perl.org ● perl-xs@lists.perl.org
    • Other Inline Attractions (continued)
        ● Inline::C++
            – I took over maintenance from Neil Watkiss in November 2011 after it had gone unmaintained since 2003.
            – It was only passing on about 10% of CPAN smoke-testers.
            – Now it's passing about 90% of CPAN smoke-testers.
            – We still have some challenges.
            – Contributors (code, FAIL-test-systems, etc.) welcome.
        ● Inline::Python
            – “Inline::Python is extremely important for the company I work for which is why I can spend quite some time on it. So it's future is very secure :)” – Stefan Seifert: Current maintainer, in an email message to the Inline mailing list on 12-5-11.
    • CPR – Just for fun...
    • “The best compliment I've gotten for CPR is when my ActiveState coworker Adam Turoff said, ‘I feel like my head has just been wrapped around a brick’.” – Brian “Ingy” Ingerson, Pathologically Polluting Perl (perl.com), Feb 2001
    • The Brick: Inline::CPR ● Inline C Perl Run ● Running Perl from a C program (quick and dirty embedding)
        #!/usr/local/bin/cpr
        int main(void) {
            printf("Hello World, I'm running under Perl version %s\n",
                   CPR_eval("use Config; $Config{version}"));
            return 0;
        }
        ● I think that's enough on that topic. ;)
    • Conclusions: Why C, Inline::C, and XS? ● An application needs to get at functionality built into existing C libraries. ● Windowing and graphic systems. ● Specialized C libraries. ● Operating-system libraries. ● Other “tool reuse.” If a C library exists, why rewrite it in Perl? ● Getting “Close to the metal.” ● An application or extension has a bottleneck and no better algorithm exists. ● Unit testing of C code, using simple Perl tools. Expose C functions to the TAP harness. ● Inline::C in particular makes these transitions less complex, which facilitates writing and testing more robust code.
    • “You can sometimes write faster code in C, but you can always write code faster in Perl. Since you can use each from the other, just combine their strengths as you see fit. (And tell the dragons we said "Hi.")” – Larry Wall; Tom Christiansen; Jon Orwant. Programming Perl, 3rd Edition
    • Resources: ● Advanced Perl Programming, 2nd Edition – An excellent introduction to Inline::C and XS. ● Inline, Inline::C, Inline::C-Cookbook, Inline::Struct, Inline::CPP POD, on CPAN ● perlguts, perlcall, perlapi – Must read. ● perlxs, perlxstut, perlmod, perlembed – Unavoidable. ● Perl Best Practices – Discusses Inline::C as a Best Practice where XS is needed. ● Perl Cookbook, 2nd Edition – Several Inline::C examples. ● Programming Perl, 3rd Edition – XS-centric, but still applicable toward learning Inline::C. ● http://www.perl.com/pub/2001/02/inline.html “Pathologically Polluting Perl” ● Mailing Lists: inline@perl.org and perl-xs@perl.org.