Writing and using PHP streams and sockets


Slides from my tutorial at Zendcon 2011 on PHP streams, sockets and filters

Published in: Technology
  • Who am I, and why should you listen to me? Windows builds; Cairo, PHP-Gtk, Win\\Gui. Using PHP since the PHP 4 beta. How I got involved in the PHP community and why you should too. How I got interested in streams: the huge template debate. There was a lot more to it, but it was basically the argument that PHP is a templating language, plus the desire to have some really cool features using streams. (1 minute.) Streams are one of the most powerful features of PHP and one of the most underused. Bring your laptop and take an in-depth look at PHP’s streams layer. We'll examine what they are and how you can use (and abuse) them. Write your own classes to access APIs such as Flickr without any curl, use filters for manipulating data without altering an existing templating system, play with all sorts of PHP transports from FTP to SSH. Learn how to create your own stream wrappers and filters, and even build a socket server. Learn about the differences between socket connections and regular streams. And learn it hands-on, actually writing and running code. Please bring something with PHP 5.3 installed and runnable (CLI is good enough). This will have coding exercises!
  • That’s the QR code for joind.in; you should be able to point your phone at it and find it right away. So what do you want to learn about today? Have you heard of streams, or ever used them? Get some general feedback on what people think about them and whether they have ever touched them. Some thoughts on the availability of how-tos and such. Make everyone get up.
  • Quick computer science lesson. Originally done with magic numbers in Fortran; C and Unix standardized the way it worked. On Unix and related systems based on the C programming language, a stream is a source or sink of data, usually individual bytes or characters. Streams are an abstraction used when reading or writing files, or communicating over network sockets. The standard streams are three streams made available to all programs. Who else uses them? Most languages descended from C have the “files as streams” concept and ways to extend the IO functionality beyond merely files, which allows them all to be merged together. Great way to standardize the way data is grabbed and used. Questions on who has used streams in other languages.
  • PHP streams are a very, very big topic; you could spend days talking about them. Since this is a tutorial you’re going to have a VERY sore brain. Ask questions, do not wait… I have kids and am VERY used to interruptions and tangents. You won’t get anything out of this talk unless you participate; this is a tutorial, not a lecture. Please get PHP 5.3 on your system. CLI is fine, just some way to run the code you’re about to see. Pass around CDs with code/PHP for Windows on them. I will make you get up and move around. This is not because I hate you, this is because it’s early and your brains are toast. Find a buddy for coding; there are prizes there as well. Format of the talk is: Basics, code; Advanced, code; Quiz. Breaks: we will break on time even if I’m in the middle of a slide, since you don’t want to wait in line. Breaks will be denoted by some of the most annoying, errr, fun slides you’ve ever seen. 9:00 - 12:00. Breaks will be at 9:45 (10 minutes), 10:30 (15 minutes) and 11:15 (10 minutes). Get up, meet and greet (5 minute mark for talk).
  • I say “blame” and not “pat on the back” because while the idea is pretty sound, the implementation has some, well, quirks. Also, this was one of the least documented and least used corners of PHP. Streams were introduced in PHP 4.3 to unify the methods for working on files, sockets, and other similar resources. There are lots of other people who have dabbled their fingers in the streams code, and lots of cool features related to it.
  • So what’s the basic stream that everyone uses? It’s not even stdin or stdout; it’s the most basic of all, file://. Who has ever used fopen or include? Any of those calls are actually using the streams layer in PHP. A stream is a way of generalizing file, network, data compression, and other operations which share a common set of functions and uses. The PHP manual states that a stream is a resource object which exhibits streamable behavior. That is, it can be read from or written to in a linear fashion. So if you have data you want to read or write in a linear fashion, streams are for you! Should be at the 5 minute mark (maybe later).
  • Streams are a huge underlying component of PHP. Streams were introduced with PHP 4.3.0. They are old, but underuse means they can have rough edges… so TEST, TEST, TEST. But they are more powerful than almost anything else you can use. Why is this better? Handling lots and lots of data in small chunks lets you process large volumes without maxing out memory and CPU.
  • Any good extension will use the underlying streams API to let you use any kind of stream; for example, cairo does this. The stuff to work with PHP streams is spread across at least two portions of the manual, plus appendixes for the built-in transports/filters/context options. It’s very poorly arranged, so be sure to take the time to learn where to look in the manual; there should be three main places. What doesn’t use streams? chmod, touch and some other very file-specific functionality, lazy/bad extensions, and extensions with issues in the libraries they wrap around.
  • All input and output comes into PHP. It gets pushed through a streams filter, then through the streams wrapper. During this point the stream context is available for the filter and wrapper to use. Streams themselves are the “objects” coming in. Wrappers are the “classes” defining how to deal with the stream.
  • What is streamable behavior? We’ll get to that in a bit. Resource: a resource is a special variable, holding a reference to an external resource. This is something PHP uses to point to an underlying C thingy. A wrapper tells the stream how to behave: how to “speak” http, or to the file system, or whatever. Your context lets you set options that are made available to your wrappers and filters. IF YOU REMEMBER NOTHING ELSE FROM THIS TALK, REMEMBER THESE!
  • How the schemes for the streams filters work. The php stream is “magic” because it lets you tack on filters for things that don’t leave a file pointer lying around (include, file_get_contents)Basically this is a version of a URI – only not quite. Because we really can’t seem to make anything truly standard in PHP
  • Some notes: file_get_contents and its cousin stream_get_contents are your fastest, most efficient way if you need the whole file. file(blah) is going to be the best way to get the whole file split by lines. Both are going to stick the whole file into memory at some point. For very large files, and to help with memory consumption, use fgets and fread instead.
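To make the trade-off concrete, here is a minimal sketch of the four reading styles side by side. The temporary file and its contents are made up for the illustration; only the function names come from the notes above.

```php
<?php
// Hypothetical sample file, created only for this sketch.
$path = tempnam(sys_get_temp_dir(), 'stream');
file_put_contents($path, "one\ntwo\nthree\n");

// Whole file as one string: simple, but the entire file sits in memory.
$all = file_get_contents($path);

// Whole file as an array of lines: also held fully in memory.
$lines = file($path, FILE_IGNORE_NEW_LINES);

// Line by line: only the current line is held at any time.
$count = 0;
$fp = fopen($path, 'r');
while (($line = fgets($fp)) !== false) {
    $count++;
}
fclose($fp);

// Fixed-size chunks: useful for binary data or very long lines.
$bytes = 0;
$fp = fopen($path, 'rb');
while (!feof($fp)) {
    $chunk = fread($fp, 8192);
    if ($chunk === false) {
        break;
    }
    $bytes += strlen($chunk);
}
fclose($fp);

unlink($path);
```

The last two loops are the ones to reach for when the file may be larger than the memory you are willing to spend on it.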
  • Error handling is a major pain with the file system code in PHP, and with streams code. There is no way to deal with “expected errors” and the code is noisy with warnings. Your best bet is to have an error handler registered to deal with warnings when using file system stuff. You can also use some of the SPL file classes, such as SplFileObject, to do some of the filesystem functions in an object-oriented way. Flock issues are numerous. Everything has to be using PHP’s flock; you can’t use locking in Python alongside PHP’s flock and expect it to work. Yes, it works on Windows, but it does NOT work on threaded servers (cgi yes, mod_php no). Some versions of Linux have broken flock implementations. It is not going to work on network shares. Remember, a wrapper is going to define the PHP streams stuff it supports. Some don’t implement the functionality needed for things like mkdir or is_dir, and some aren’t seekable.
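One way to tame the warnings is to register a handler only around the call that may fail and turn the warning into an exception. This is a sketch of that idea, not the only way to do it; the helper name openOrThrow and the missing path are invented for the example.

```php
<?php
// Hypothetical helper: converts warnings raised by fopen() into exceptions,
// and always restores the previous error handler afterwards.
function openOrThrow($path, $mode)
{
    set_error_handler(function ($severity, $message, $file, $line) {
        throw new ErrorException($message, 0, $severity, $file, $line);
    }, E_WARNING);

    try {
        $fp = fopen($path, $mode);
    } catch (Exception $e) {
        restore_error_handler();
        throw $e;
    }

    restore_error_handler();
    return $fp;
}

// A path that should not exist, so the open is expected to fail.
$missing = sys_get_temp_dir() . '/no-such-dir-' . uniqid() . '/missing.txt';

try {
    openOrThrow($missing, 'r');
    $result = 'opened';
} catch (ErrorException $e) {
    $result = 'failed';
}
```

Scoping the handler to a single call keeps the rest of the application's error handling untouched.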
  • A context is a set of parameters and wrapper-specific options which modify or enhance the behavior of a stream. There is really only one parameter that can be set for streams, and that is the notification callback. So you can assign a callback to your stream that is called when things happen: check for the notify constant coming in (the possible values are available in the manual), then do something (logging maybe) depending on what is happening with the stream. Options are really the more useful of the two. Basically they’re arrays of settings that are then made available. If you need additional data inside a custom stream wrapper, contexts are going to be the way to go.
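A small sketch of the notification parameter follows. The callback name logStreamEvent is made up, and the callback only fires for wrappers that support notifications (http and ftp do), so nothing is logged here; the snippet just attaches the callback and reads it back.

```php
<?php
// Hypothetical callback; PHP passes these six arguments to a notifier.
function logStreamEvent($code, $severity, $message, $messageCode, $bytesTransferred, $bytesMax)
{
    switch ($code) {
        case STREAM_NOTIFY_CONNECT:
            error_log('connected');
            break;
        case STREAM_NOTIFY_PROGRESS:
            error_log("received $bytesTransferred of $bytesMax bytes");
            break;
        case STREAM_NOTIFY_FAILURE:
            error_log("failure: $message");
            break;
    }
}

// Attach the callback as the context's "notification" parameter.
$context = stream_context_create();
stream_context_set_params($context, array('notification' => 'logStreamEvent'));

// Read the parameters back to confirm what the wrapper will see.
$params = stream_context_get_params($context);
```

Passing $context as the context argument of fopen() or file_get_contents() on an http:// URL is what would actually trigger the callback.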
  • It’s all in the context. All the http functionality is controlled by the context settings. If you go to php.net there’s an appendix with all the context settings available for http (and https) requests: method, header, user_agent, content, proxy, request_fulluri, max_redirects, protocol_version, timeout and ignore_errors are just a few available. This may seem VERY backward, but content goes in the context and then you do a file_get_contents, even when POSTing or PUTting data (which feels a bit odd, but it’s the way the http stream is implemented).
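As a sketch of the “content goes in the context” idea, the following builds a POST context without sending anything. The URL and form field are placeholders, and the actual request is left commented out so the snippet runs without a network.

```php
<?php
// Hypothetical endpoint and payload, for illustration only.
$url  = 'http://example.com/api';
$body = http_build_query(array('name' => 'value'));

$context = stream_context_create(array(
    'http' => array(
        'method'        => 'POST',
        'header'        => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content'       => $body,  // the request body lives in the context
        'timeout'       => 5,
        'ignore_errors' => true,   // return the body even on 4xx/5xx responses
    ),
));

// Inspect the options the http wrapper would receive.
$options = stream_context_get_options($context);

// The request itself would be sent like this:
// $response = file_get_contents($url, false, $context);
```

The same array shape works for fopen(), so a streamed response can be read in chunks instead of all at once.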
  • Normally PHP has 8K buffers in place. You could always change this with stream_set_write_buffer; this is the pair method for that. get_params is a pair to stream_context_set_params, which, like set write buffer, has been available since the beginning. stream_resolve_include_path acts like `include` and will only resolve the filename against the include path if the path is relative. It makes no sense to resolve already absolute pathnames anyway. The last one is pretty self-explanatory.
  • These three things can royally mess up stream-based code. If you’re writing a system that must run with these completely locked down you may run into some serious issues. open_basedir is just generally buggy. End at 9:15 or so.
  • 9:15 to 9:20 coding exercise
  • So let’s think of a really simple problem here that uses PHP streams. 9:20-9:25 for show and tell.
  • So now we’re going to take a look at all the shiny C-level transport stuff available for the system. This is stuff not only available and built in to PHP by default, but also stuff available with additional extensions. There are several extensions that provide additional built-in transports, streams and filters; we’ll hit the coolest ones. Warning: quite a lot of this stuff is going to be affected by allow_url_include and allow_url_fopen. It’s probably better for security reasons that at least allow_url_include is turned off, but for allow_url_fopen you could really argue it either way. Bottom line: when you have something so powerful that you can replace a regular file include with a remote include from a URL, you need to be extra careful. Security should be your first name, not just your middle name, when dealing with streams and filters.
  • These are the native built-in streams for PHP, no extension necessary. Note that file:// is considered the default and no prefix means “file” to PHP. glob is an odd duck, new since PHP 5.3. data is available since 5.2 and follows RFC 2397. http and ftp are pretty self-explanatory; both have context settings that make them really useful.
  • You do need to note that there are two kinds of “transports”: those you can use with regular streams and those that are sockets. Contrary to popular belief, the https/ftps/ssl and tls stuff is not really built into PHP by default; it’s part of the openssl extension. However, the streams will register even when using openssl as a shared module. Just make sure you’re loading it via .ini and not dl(), or you could get some interesting behaviors. Ssh2 has the majority of its API, especially the sftp API, totally done with streams, which takes a bit of getting used to at first, but it is a very powerful, if rather verbose, way of doing things. It does make it possible to drop in ssh2 stream paths completely transparently where regular file actions used to be. The compression stuff is often taken for granted as well, since most PHP systems have zlib installed, and often bzip too. ogg and expect exist as well and are a little more esoteric; not many people use them, so I don’t have examples for them. Phar is a really, really cool way to package up apps and have them “just run”.
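Two of the less familiar built-ins, data:// and glob://, can be tried without any network access. This is a small sketch; the directory and file names are invented, and it assumes the default allow_url_fopen setting.

```php
<?php
// data:// (RFC 2397) carries the content inside the URL itself.
$text = file_get_contents('data://text/plain;base64,' . base64_encode('hello streams'));

// glob:// is read through the directory functions.
// Hypothetical scratch directory with three files in it.
$dir = sys_get_temp_dir() . '/glob-demo-' . uniqid();
mkdir($dir);
touch("$dir/a.txt");
touch("$dir/b.txt");
touch("$dir/c.log");

$found = array();
$handle = opendir("glob://$dir/*.txt");
while (($entry = readdir($handle)) !== false) {
    $found[] = basename($entry);
}
closedir($handle);
sort($found);

// Clean up the scratch directory.
array_map('unlink', glob("$dir/*"));
rmdir($dir);
```

Only the two .txt files match the pattern, so c.log never shows up in $found.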
  • https://www.x.com/docs/DOC-1632 An example of how to do instant payment notifications at PayPal. A lot of the script is chopped out (for example, building the headers and request data), but the basics are: build your data, then open your connection with fsockopen and use fputs to send the data. Why do we do the craziness with feof? Because the socket can hang forever! If a connection opened by fsockopen() wasn't closed by the server, feof() will hang. Hanging in feof is very, very bad. You have to be careful in general with feof. It returns TRUE if the file pointer is at EOF or an error occurs (including socket timeout); otherwise it returns FALSE.
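A defensive read loop for this situation might look like the sketch below. The helper name readAllWithTimeout is made up, the socket calls are shown only as comments with a placeholder host, and the demonstration runs against an in-memory stream so it needs no network.

```php
<?php
// Hypothetical helper: read until EOF, a timeout, or a read error,
// whichever comes first, instead of trusting feof() alone.
function readAllWithTimeout($fp, $chunkSize = 4096)
{
    $data = '';
    while (!feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        $meta  = stream_get_meta_data($fp);
        if ($chunk === false || $meta['timed_out']) {
            break; // give up rather than spin or hang
        }
        $data .= $chunk;
    }
    return $data;
}

// Typical use against a socket (placeholder host, not run here):
// $fp = fsockopen('ssl://example.com', 443, $errno, $errstr, 10);
// stream_set_timeout($fp, 10);   // read timeout, separate from the connect timeout
// fputs($fp, $request);
// $response = readAllWithTimeout($fp);

// Self-contained demonstration on an in-memory stream.
$fp = fopen('php://memory', 'w+');
fwrite($fp, "HTTP/1.1 200 OK\r\n\r\nVERIFIED");
rewind($fp);
$response = readAllWithTimeout($fp);
fclose($fp);
```

The timeout passed to fsockopen() only covers connecting; stream_set_timeout() is what bounds each read afterwards.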
  • First we create the connection and auth; you can also use keyfiles with ssh2. Then we’re going to use exec to clear out the old files first. Notice the stream set blocking: we’re saying “wait until this sucker is done and you get something back”. Then we clean up after ourselves. There is an ssh2.exec:// stream, but it makes little sense here, since we’re not going to use the output of the exec for anything. Then we do an is_dir on the remote and a mkdir, and copy our files over. Note the use of the actual resource in the URL; that’s some really dirty magic under the covers in PHP to make that work.
  • Some things to note about phars: they aren’t affected by allow_url_fopen (they can NEVER work on remote files). Creating or modifying phar archives requires system-level disabling of the phar.readonly php.ini setting. If you’re using phar without the executable stuff (just for zip or tar archives) that limitation is not in place. Phars are really quite awesome. Note that phar is on by default for 5.3. openssl can be used for signing; zlib and bzip can be used for compression, but both of these are optional for phar.
  • Pipes are like sockets but only go one way. stdin, stdout and stderr are all pipes. proc_open and popen do pipes. STDIN/OUT/ERR are some really nifty magic in PHP. We’ll get to them later, but they’re easy ways to access and write to standard pipes from PHP. proc_open is popen on steroids: you can do multiple pipes, suppress that ugly black box on Windows, and perform other black magic. You DO have to be careful if you are using this stuff. Remember Windows is going to act quite a bit differently in some cases; proc_open and popen are interesting on there at best.
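Here is a rough sketch of proc_open with three pipes. It assumes a Unix-like shell and that PHP_BINARY points at a CLI binary; the child command (a PHP one-liner that uppercases its stdin) is made up for the demonstration.

```php
<?php
// Child process: another PHP interpreter that uppercases whatever it reads.
$code = 'echo strtoupper(file_get_contents("php://stdin"));';
$cmd  = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);

$spec = array(
    0 => array('pipe', 'r'), // child reads its stdin from this pipe
    1 => array('pipe', 'w'), // child writes its stdout to this pipe
    2 => array('pipe', 'w'), // child writes its stderr to this pipe
);

$proc = proc_open($cmd, $spec, $pipes);

fwrite($pipes[0], 'hello pipes');
fclose($pipes[0]); // closing our end signals EOF to the child

$output = stream_get_contents($pipes[1]);
$errors = stream_get_contents($pipes[2]);
fclose($pipes[1]);
fclose($pipes[2]);

$exit = proc_close($proc);
```

Each pipe is one-directional, which is why stdin and stdout need separate entries in the descriptor spec.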
  • php://stdin, php://stdout and php://stderr allow direct access to the corresponding input or output stream of the PHP process. The stream references a duplicate file descriptor, so if you open php://stdin and later close it, you close only your copy of the descriptor--the actual stream referenced by STDIN is unaffected. Note that PHP exhibited buggy behavior in this regard until PHP 5.2.1. It is recommended that you simply use the constants STDIN, STDOUT and STDERR instead of manually opening streams using these wrappers.
  • php://output allows you to write to the output buffer mechanism in the same way as print() and echo(). NOTE: you can’t attach a filter to, say, output and then echo and expect it to magically work; instead you must fwrite to output. php://input allows you to read raw POST data. It is a less memory intensive alternative to $HTTP_RAW_POST_DATA and does not need any special php.ini directives. php://input is not available with enctype="multipart/form-data". php://stdin and php://input are read-only, whereas php://stdout, php://stderr and php://output are write-only. php://fd is new in 5.3.6 and allows direct access to the given file descriptor. For example, php://fd/3 refers to file descriptor 3.
  • php://filter is a kind of meta-wrapper designed to permit the application of filters to a stream at the time of opening. This is useful with all-in-one file functions such as readfile(), file(), and file_get_contents() where there is otherwise no opportunity to apply a filter to the stream prior to the contents being read. So you’re getting a filter and sticking it on a resource: read filter chain, write filter chain. Any filter lists which are not prefixed specifically by read= or write= will be applied to both the read and write chains (as appropriate). The php://memory wrapper stores the data in memory. php://temp behaves similarly, but uses a temporary file for storing the data when a certain memory limit is reached (the default is 2 MB). The php://temp wrapper takes the following 'parameters' as parts of its 'path': /maxmemory:<number of bytes> (optional). This parameter allows changing the default value for the memory limit (when the data is moved to a temporary file). So you can actually manage the memory limit for when the data moves to a temp file.
  • This is actually from Ciaran McNulty’s blog, and it was such a perfect example of how PHP streams can make your life easier. What if later you need to copy that file from ssh, or it’s compressed with a different algorithm? That’s a lot of code tweaks, whereas in the second version simply changing the stream is enough to fix it. 9:25: first large chunk of brain dumping done.
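The filter caveat is easy to see in a short sketch. Output buffering is used here only so the result can be inspected; the strings are arbitrary.

```php
<?php
ob_start(); // capture output so the sketch can look at it afterwards

$out = fopen('php://output', 'w');
stream_filter_append($out, 'string.toupper', STREAM_FILTER_WRITE);

fwrite($out, "filtered\n");   // goes through the stream, so the filter applies
fclose($out);

echo "not filtered\n";        // echo never touches the stream handle

$captured = ob_get_clean();
```

The filter belongs to the handle returned by fopen(), not to PHP's output in general, which is why echo is unaffected.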
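A short sketch of both wrappers follows. The temporary file and the 1024-byte limit are arbitrary choices for the example; string.toupper and string.rot13 are built-in filters.

```php
<?php
// Hypothetical scratch file for the demonstration.
$path = tempnam(sys_get_temp_dir(), 'filter');
file_put_contents($path, 'hello filter');

// Apply a filter while reading, without ever holding a file handle.
$upper = file_get_contents("php://filter/read=string.toupper/resource=$path");

// Apply a filter while writing.
file_put_contents("php://filter/write=string.rot13/resource=$path", 'hello');
$rot = file_get_contents($path);
unlink($path);

// php://temp keeps data in memory until it exceeds maxmemory (1024 bytes here),
// then transparently spills to a temporary file.
$tmp = fopen('php://temp/maxmemory:1024', 'w+');
fwrite($tmp, str_repeat('x', 4096));
rewind($tmp);
$size = strlen(stream_get_contents($tmp));
fclose($tmp);
```

From the caller's point of view the spill to disk is invisible: the same handle keeps working either side of the limit.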
  • 9:35 to 9:40 coding exercise
  • So let’s think of a really simple problem here that uses PHP streams. 9:45 to 9:50 show and tell. THEN: time for the first break (10 minutes).
  • Custom streams and filters are very useful tools; they let you have all kinds of power over how your application acts. Through filter and stream chaining you can link several types of streams together for some very powerful actions. Userland filters are not constrained by allow_url_fopen, but if they try to use streams internally they will fail. Objects for userland streams may not work the way you expect, so you’ll need to test your stuff pretty carefully. For better or worse, stream contexts are read-only inside a stream wrapper. If you try to change them, very bad things can happen. Start this at 10:00.
  • There should be interfaces for streams. Yes, I have a patch. Yes, I am a bad, bad girl for not getting that patch in yet. There is a small list of functions that you absolutely must implement in order to make a custom stream work. Custom streams allow you to control the behavior of your data, but there are some things you need to know. Why it is a good idea to implement your own interfaces for streams: THERE ARE NO CHECKS ON wrapper_register. The only checks you’ll get are when the call you want to wrap actually takes place (less than useful).
  • These are the basics you should always implement; if you do not, PHP is going to spit warnings. Open up and look at my base stream interface for a quick idea of what this would look like.
  • This is usually the heart of a custom stream wrapper. It’s important that you return true or false. Check options for STREAM_USE_PATH: if that’s set, then you must put the actual path you used into opened_path. What it means is: if the path is relative, search for the resource using the include_path. The mode will be identical to the modes for fopen, so you’ll get a string you’ll need to check. Make sure you’re checking for bad modes! The path can generally be broken up pretty efficiently with parse_url, since the general format of a PHP stream path is identical to a URL ;) If you support contexts, $this->context will have the data that is being set/passed around inside your current stream context. The constructor of your class IS called every time before stream_open!
  • You’ll generally want to keep track of the current position of your stream pointer in your class. Generally you do this with a position class property. For fread, you’ll want to update that with the number of bytes that are actually read, not necessarily the number requested by fread. Warning! If you return an empty string PHP is going to treat it as “false”, AND returning false does not necessarily indicate eof, so be careful when you implement this. However, you MUST make it return false or '' at some point, or it’ll loop forever and NEVER call eof for file_get_contents or fread/fgets.
  • If you don’t have enough room for all the data that you’re sent, write as much as possible. If the return value is greater than the length of data, E_WARNING will be emitted and the return value will be truncated to its length.
  • Feof is kind of a magical thing. It’ll end up being called all over the place when you least expect it! And then not when you want it to be called (ick). Notice that the constructor is NOT called before an eof. This is called with feof, file_get_contents, fread: anything that might need to know if we’re all done with data. So this should probably be the second thing you implement in your stream wrapper. Just like stream_open, you need to return true or false and can access any context information, but it has no parameters passed.
  • You need to put your cleanup stuff here, because if you put it in a PHP destructor you can get some really, really odd behavior. Note this method doesn’t give you anything (no params) and you shouldn’t return anything (well, you can, but it won’t do you any good). This is a bit broken though! If you don’t implement stream_close it doesn’t throw a warning, unlike the other methods. It’s generally safest to always implement stream_close.
  • Although technically you might not need to implement this, if you do any file_get_contents or file_put_contents calls in your code this WILL be called. Note the data being sent back is used for other stuff. The mode stuff can get really annoying; check out the Unix man page for how your wrapper should be handling mode stuff to get it to do what you want.
  • file_get_contents is going to call this! And just about everything else under the sun, so you really need to have it implemented. Notice there are slightly different semantics between the two: url_stat is going to give you a path and optionally flags to deal with. STREAM_URL_STAT_LINK: for resources with the ability to link to other resources (such as an HTTP Location: forward, or a filesystem symlink). This flag specifies that only information about the link itself should be returned, not the resource pointed to by the link. This flag is set in response to calls to lstat(), is_link(), or filetype(). STREAM_URL_STAT_QUIET: if this flag is set, your wrapper should not raise any errors. If this flag is not set, you are responsible for reporting errors using the trigger_error() function during stating of the path.
  • SOME NOTES: so why two stats? One works on an already opened stream; one is given only a path and needs to determine information from that. That means stream_open will NOT have been called when this is called!
  • Generally you need to implement both if your stream is going to have any kind of seekable behavior. Note that you need both! I’ll repeat that again, you must have both! One will tell the wrapper where you are, one will tell the wrapper to move your position in the stream.
  • Upon success, streamWrapper::stream_tell() is called directly after calling streamWrapper::stream_seek(). If streamWrapper::stream_tell() fails, the return value to the caller function will be set to FALSE. Not all seek operations on the stream will result in this function being called. PHP streams have read buffering enabled by default (see also stream_set_read_buffer()) and seeking may be done by merely moving the buffer pointer.
  • You’ll need to have some way inside your class/wrapper to keep track of where you are. stream_tell is almost always simply returning the value of that position pointer; don’t try to make it any more complicated than that.
  • These are all items that are going to be very specific to the type of stream you are implementing. Not every stream will need or want them, but they can be very useful.
  • Flushing again: you generally only need to implement this if you’re doing some buffering or caching in your custom stream layer. So why would you need to implement flushing? This is most useful if you’re doing some kind of storage in your file, waiting for a decent size to send, waiting for more data to write, etc. This should also clear any caching you have going related to stat.
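Pulling the pieces above together, here is a minimal sketch of a userland wrapper that keeps “files” in a static array. The class name MemoryWrapper and the mem:// scheme are invented for the illustration, and the mode handling is deliberately simplistic (only r, w and a). It is meant to show the shape of the required methods, not a production wrapper.

```php
<?php
class MemoryWrapper
{
    public $context;                  // filled in by PHP when a context is passed
    private static $files = array();  // name => contents
    private $name;
    private $position = 0;

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        $this->name = parse_url($path, PHP_URL_HOST);
        $kind = substr($mode, 0, 1);
        if ($kind === 'r') {
            if (!isset(self::$files[$this->name])) return false;
        } elseif ($kind === 'w') {
            self::$files[$this->name] = '';
        } elseif ($kind === 'a') {
            if (!isset(self::$files[$this->name])) self::$files[$this->name] = '';
            $this->position = strlen(self::$files[$this->name]);
        } else {
            return false; // reject modes this sketch does not handle
        }
        return true;
    }

    public function stream_read($count)
    {
        $chunk = (string) substr(self::$files[$this->name], $this->position, $count);
        $this->position += strlen($chunk); // bytes actually read, not requested
        return $chunk;
    }

    public function stream_write($data)
    {
        $current = self::$files[$this->name];
        $end = $this->position + strlen($data);
        self::$files[$this->name] = substr($current, 0, $this->position)
            . $data . (string) substr($current, $end);
        $this->position = $end;
        return strlen($data);
    }

    public function stream_eof()
    {
        return $this->position >= strlen(self::$files[$this->name]);
    }

    public function stream_tell()
    {
        return $this->position;
    }

    public function stream_seek($offset, $whence)
    {
        $length = strlen(self::$files[$this->name]);
        if ($whence === SEEK_SET) $target = $offset;
        elseif ($whence === SEEK_CUR) $target = $this->position + $offset;
        elseif ($whence === SEEK_END) $target = $length + $offset;
        else return false;
        if ($target < 0) return false;
        $this->position = $target;
        return true;
    }

    public function stream_stat()
    {
        return array('size' => strlen(self::$files[$this->name]));
    }

    public function url_stat($path, $flags)
    {
        // Called with only a path: stream_open has NOT run at this point.
        $name = parse_url($path, PHP_URL_HOST);
        if (!isset(self::$files[$name])) return false;
        return array('mode' => 0100644, 'size' => strlen(self::$files[$name]));
    }

    public function stream_close()
    {
        // Nothing to release in this sketch; real cleanup would go here.
    }
}

stream_wrapper_register('mem', 'MemoryWrapper');

file_put_contents('mem://notes', 'hello wrapper');
$contents = file_get_contents('mem://notes');

$fp = fopen('mem://notes', 'r');
fseek($fp, 6);
$tail = fread($fp, 7);
$pos = ftell($fp);
fclose($fp);
```

Note that url_stat works from the path alone while stream_stat works on the opened instance, which is exactly the split between the two stat methods described above.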
  • These are pretty self explanatory as well
  • Implementing these is REALLY going to depend on what you’re doing, how you’re implementing it, and what you want to do with your stream. Locking (portable advisory file locking): this method is called in response to flock(), file_put_contents() (when flags contains LOCK_EX), stream_set_blocking(), and when closing the stream (LOCK_UN).
  • If your wrapper does not support any directory functionality – do NOT define these in your class
  • A bitwise mask of values, such as STREAM_MKDIR_RECURSIVE. - this is a rather undocumented constant that you need to dig out of the options
  • public bool streamWrapper::rmdir ( string $path , int $options ). options: a bitwise mask of values, such as the completely undocumented STREAM_MKDIR_RECURSIVE.
  • path: specifies the URL that was passed to opendir(). Note: the URL can be broken apart with parse_url(). options: whether or not to enforce safe_mode (0x04).
  • This method is called in response to closedir(). Any resources which were locked, or allocated, during opening and use of the directory stream should be released.
  • Should return string representing the next filename, or FALSE if there is no next file. Note: The return value will be casted to string.
  • This method is called in response to rewinddir(). Should reset the output generated by streamWrapper::dir_readdir(). i.e.: The next call to streamWrapper::dir_readdir() should return the first entry in the location returned by streamWrapper::dir_opendir().
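The four directory methods fit together as in the sketch below. The ListingWrapper class, the listing:// scheme and the fixed list of names are all invented for the example; a real wrapper would use the path passed to dir_opendir to decide what to list.

```php
<?php
class ListingWrapper
{
    public $context;
    private $entries = array();
    private $index = 0;

    public function dir_opendir($path, $options)
    {
        // Hypothetical fixed listing; a real wrapper would inspect $path here.
        $this->entries = array('alpha.txt', 'beta.txt', 'gamma.txt');
        $this->index = 0;
        return true;
    }

    public function dir_readdir()
    {
        if ($this->index >= count($this->entries)) {
            return false; // no next file
        }
        return $this->entries[$this->index++];
    }

    public function dir_rewinddir()
    {
        $this->index = 0; // next dir_readdir() returns the first entry again
        return true;
    }

    public function dir_closedir()
    {
        $this->entries = array(); // release whatever opendir allocated
        return true;
    }
}

stream_wrapper_register('listing', 'ListingWrapper');

// scandir() drives opendir/readdir/closedir for us.
$names = scandir('listing://anything');

// The same calls by hand, including a rewind.
$dh = opendir('listing://anything');
$first = readdir($dh);
readdir($dh);
rewinddir($dh);
$again = readdir($dh);
closedir($dh);
```

Because only the dir_* methods are defined, this wrapper supports listing but not fopen(), which matches the advice to leave out the methods your wrapper cannot honor.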
  • STREAM_OPTION_BLOCKING (the method was called in response to stream_set_blocking()); STREAM_OPTION_READ_TIMEOUT (the method was called in response to stream_set_timeout()); STREAM_OPTION_WRITE_BUFFER (the method was called in response to stream_set_write_buffer()). Can be STREAM_CAST_FOR_SELECT when stream_select() is calling stream_cast() or STREAM_CAST_AS_STREAM when stream_cast() is called for other uses. Should return the underlying stream resource used by the wrapper, or FALSE. This “finishes out” file functionality by allowing wrappers for basically everything that touches the file system. This makes things like chown, chmod, chgrp and touch work properly, which is pretty cool actually because it makes
  • So let’s think of a really simple problem here that uses PHP streams. 10:00 – 10:15 coding time.
  • So let’s think of a really simple problem here that uses PHP streams. 10:15 to 10:20 show and tell.
  • So enough of the runaround. We’ve learned what PHP streams and filters are, we’ve learned how to use them, we’ve learned how to use built-in streams, filters, and transports, and how to create custom streams and filters. Now let’s take a look at a couple of use cases for PHP streams. A use case in software engineering and systems engineering is a description of a system’s behavior as it responds to a request that originates from outside of that system. In other words, a use case describes "who" can do "what" with the system in question. The use case technique is used to capture a system's behavioral requirements by detailing scenario-driven threads through the functional requirements. So what is a use case and why are we bothering? So many times I hear “these are awesome, but when would I ever use it?”
  • Theoretical company wants to store lots of data. It needs a way to transparently change where and how that data is stored. We have to keep it flexible enough that when the next guy down the line changes his mind as to HOW the data is stored, minimal changes will be necessary.
  • This also filters over into using an entire template system based on streams instead of really abstract object orientation. This allows the implementation to be very simple, and to get complex behavior you add in custom filters and streams. Talk about Zend Framework’s stream wrapper and phpBB’s templating using it as well.
  • So this is actually a clever idea I found in a piece of code by Wez Furlong - mtrack (basically a trac clone in PHP)
  • https://github.com/mikey179/vfsStream Found this while poking around the internet, interesting idea. Known limitations: realpath() does not work with any URLs other than pure filenames. chmod(), chown() and chgrp() cannot be used in conjunction with vfsStream URLs due to limitations of stream wrappers, which do not support changing file modes, owners or owner groups. Update: probably this can be fixed with PHP 5.4, which adds support for this to stream wrappers. touch() does not work with any URLs other than pure filenames. Workaround: fclose(fopen($file, 'a')); ext/zip seems not to support userland stream wrappers, so it cannot be used in conjunction with vfsStream. is_executable() on a vfsStream directory always returns false; this is a problem with PHP itself, see the comments on the is_executable() manual page.
  • This is a long talk, so ask now! I’ll either say “I’ll talk about that in filters” or “I’ll talk about that with sockets” or answer it ;)
  • And in comes part 2, starting at 10:45. Filters are actually incredibly powerful things.
  • A filter is a final piece of code which may perform operations on data as it is being read from or written to a stream. Any number of filters may be stacked onto a stream. Custom filters can be defined in a PHP script using stream_filter_register() or in an extension using the API Reference in Working with streams. To access the list of currently registered filters, use stream_get_filters(). Stream data is read from resources (both local and remote) in chunks, with any unconsumed data kept in internal buffers. When a new filter is prepended to a stream, data in the internal buffers, which has already been processed through other filters will not be reprocessed through the new filter at that time. This differs from the behavior of stream_filter_append(). Filters are nice for manipulating data on the fly – but remember you’ll be getting data in chunks, so your filter needs to be smart enough to handle that
  • Filters can be appended or prepended – and attached to READ or WRITENotice that stream_filter_prepend and append are smart – if you opened with the r flag, by default it’ll attach to read, if you opened with the w flag, it will attach to writeNote: Stream data is read from resources (both local and remote) in chunks, with any unconsumed data kept in internal buffers. When a new filter is prepended to a stream, data in the internal buffers, which has already been processed through other filters will not be reprocessed through the new filter at that time. This differs from the behavior of stream_filter_append(). Note: When a filter is added for read and write, two instances of the filter are created. stream_filter_prepend() must be called twice with STREAM_FILTER_READ and STREAM_FILTER_WRITE to get both filter resources.
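Attaching and removing built-in filters can be tried on an in-memory stream. This is a sketch; the strings are arbitrary, and string.rot13 and string.toupper are built-in filters.

```php
<?php
$fp = fopen('php://temp', 'w+');

// Attach to the write chain only. The returned resource can be handed
// to stream_filter_remove() later.
$rot13 = stream_filter_append($fp, 'string.rot13', STREAM_FILTER_WRITE);
fwrite($fp, "hello\n");          // stored as rot13 text

stream_filter_remove($rot13);
fwrite($fp, "hello\n");          // stored unchanged

// Attach a read filter: data is uppercased as it is read back.
rewind($fp);
stream_filter_append($fp, 'string.toupper', STREAM_FILTER_READ);
$result = stream_get_contents($fp);
fclose($fp);
```

Passing STREAM_FILTER_READ or STREAM_FILTER_WRITE explicitly avoids relying on the mode-based default when a stream is opened for both.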
  • Well it may look like manipulating data in a variable is preferable to the above. But the above is just a simple example. Once you add a filter to a stream it basically hides all the implementation details from the user. You will be unaware of the data being manipulated in a stream.And also the same filter can be used with any stream (files, urls, various protocols etc.) without any changes to the underlying code.Also multiple filters can be chained together, so that the output of one can be the input of another.The filters need an input state and an output state. And they need torespect the the fact that number of requested bytes does not necessarilymean reading the same amount of data on the other end. In fact the outputside does generally not know whether less, the same amount or more input isto be read. But this can be dealt with inside the filter. However thefilters should return the number input vs the number of output filtersalways independently. Regarding states we would be interested if reachingEOD on the input state meant reaching EOD on the output side prior to therequested amount, at the requested amount or not at all yet (more dataavailable).
  • The string filters are… well, slightly less than useful. The only really useful thing about them is that you can use some clever PHP magic to transparently stick them on a file. Really limited in usefulness. What would make them more useful? However, if you’re writing a PHP extension and need to see how the filter stuff works under the hood, grab /ext/standard/filters.c and you’ll see some great examples.
  • The convert filters are also not really useful. What does that star mean? It means there’s a whole bunch of convert options available, but it’s all linked into one wildcard filter; however, the open in the PHP source checks for these four items. These probably aren’t useful in most of your work, unless you’re doing some evil mail stuff. This is most useful for seeing how a filter handles a wildcard situation: you can register and deal with your own wildcard filters if you want. dechunk does exactly what it says and deals with chunked encoding. The consumed one is a bit of an odd duck: it eats the data and basically throws it away.
  • There are some extensions with some filters, and oddly enough they tend to be much more useful: on the fly compression (a filter is actually more useful than a stream in some cases), on the fly encryption, on the fly iconv conversion! 11:05: 5 minutes for “what are streams”, now for “how to make streams even cooler”.
  • With the bucket brigade exercise, 15 minutes to get to here (11:00). 10-minute write-a-filter exercise, 5 minutes of show and tell.
  • 11:15: wrap up filter exercises. Show code, talk about solution.
  • So just like streams, you have the capability of doing your own implementation of PHP filters
  • So for custom filters you’re going to extend the php_user_filter class. This class predates any internal PHP classes, so it predates the naming conventions (and of course we NEVER EVER want to break BC). Note, however, that the methods are camelcased, not underscored, because you know everything needs to be screwy and inconsistent. I really want a nice abstract class for this… So you extend the class and then use stream_filter_register to register it. Remember you can “wildcard” your user filter to do multiple filters in one.
  • This method is called during instantiation of the filter class object. If your filter allocates or initializes any other resources (such as a buffer), this is the place to do it. Your implementation of this method should return FALSE on failure, or TRUE on success. When your filter is first instantiated, and yourfilter->onCreate() is called, a number of properties will be available as shown in the table below.
  • filtername: A string containing the name the filter was instantiated with. Filters may be registered under multiple names or under wildcards. Use this property to determine which name was used. params: The contents of the params parameter passed to stream_filter_append() or stream_filter_prepend(). stream: The stream resource being filtered. May be available only during filter calls when the closing parameter is set to FALSE.
  • This method is called upon filter shutdown (typically, this is also during stream shutdown), and is executed after the flush method is called.
  • PSFS_PASS_ON: Filter processed successfully with data available in the out bucket brigade. PSFS_FEED_ME: Filter processed successfully, however no data was available to return. More data is required from the stream or prior filter. PSFS_ERR_FATAL (default): The filter experienced an unrecoverable error and cannot continue. Note that you CANNOT change that signature: you MUST accept consumed by reference. Closing tells you if the filter
  • Sometimes a picture is worth a thousand words. This is how data is handled in filters: the bucket brigade is the line of people passing those buckets full of water (or in our case data) to the fire.
  • So, you get a bucket (using stream_bucket_make_writeable). The name is a misnomer: what it is actually doing is grabbing a bucket from the brigade… it’s the handoff. So we get the handoff and then we do something to it. Buckets are actually a little object with a datalen and data inside. Then when we’re done with it, we append or prepend it to the out brigade (we’re done with the bucket, we filled it with water, we’re passing it off). Note that you can also create an entirely NEW bucket and whack it into the brigade if you want! You’ll most definitely want to keep track of your current state in your filter class. Remember you can rip data out of the bucket, store it in an internal property… and then say “feed me” to get more data.
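The handoff described above can be sketched as a tiny user filter. The class name ShoutFilter and the filter name demo.shout are made up for illustration; the bucket functions, the PSFS_* constants and php_user_filter are the real API. This version does no cross-bucket buffering, so it only suits transformations that work chunk by chunk.

```php
<?php
// Hypothetical example filter: upper-cases everything passing through it.
class ShoutFilter extends php_user_filter
{
    // $consumed MUST be accepted by reference, as the notes point out.
    public function filter($in, $out, &$consumed, $closing): int
    {
        // Take each bucket off the incoming brigade (the "handoff")...
        while ($bucket = stream_bucket_make_writeable($in)) {
            $bucket->data = strtoupper($bucket->data); // ...do something to it...
            $consumed += $bucket->datalen;             // ...report what we used...
            stream_bucket_append($out, $bucket);       // ...and pass it down the line.
        }
        return PSFS_PASS_ON;
    }
}

// Register under an illustrative name, then attach it to the read chain.
stream_filter_register('demo.shout', 'ShoutFilter');

$fp = fopen('php://memory', 'w+');
fwrite($fp, "pass the bucket");
rewind($fp);
stream_filter_append($fp, 'demo.shout', STREAM_FILTER_READ);

$shouted = stream_get_contents($fp);
fclose($fp);

echo $shouted, PHP_EOL;
```

A filter that needs whole words or whole documents would stash bucket data in a property and return PSFS_FEED_ME until it has enough to emit.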
  • Filters are probably the very least used thing in PHP, not because they aren’t useful, but because people don’t see a solution for them. In fact this is true for streams and filters: I see a LOT of people using layers and layers of Object Oriented programming and adapter patterns and decorators and intercepting filters and blah blah blah, when all they really need is a simple filter on a stream.
  • http://jeremycook.ca/2011/03/20/easy-file-encryption/ I’m building an app where I need to encrypt files uploaded by a user to add an extra security layer. I was initially thinking of using stream_filter_register() to create my own stream filter as the files are read and written. If you’re not familiar with the concept of stream filters in PHP, they’re a very powerful feature. By attaching a filter to a stream you can perform various operations on data as it is being read from or written to a stream. Once the filter is defined and attached to the stream, this is done completely transparently. Anyway, coming back to my problem of encrypting files, I did a quick search on Google and there didn’t seem to be an easy way of doing this. I then came across this gem in the PHP manual. PHP has a number of built in stream filters and one of them is an encryption filter. Providing you have the mcrypt extension installed, encrypting and decrypting files is as easy as registering a stream filter on a stream! Pasted below is the example code from the PHP manual.
  • 11:25: start back up. Bucket brigade. Any built in filter. By default, stream_filter_append() will attach the filter to the read filter chain if the file was opened for reading (i.e. File Mode: r, and/or +). The filter will also be attached to the write filter chain if the file was opened for writing (i.e. File Mode: w, a, and/or +). STREAM_FILTER_READ, STREAM_FILTER_WRITE, and/or STREAM_FILTER_ALL can also be passed to the read_write parameter to override this behavior. Order in which they’re added to the filter list, AND append will always be reprocessed, prepend will not.
  • This is your last chance to get questions in; here are the filter resource links.
  • Sockets and streams are “interchangeable” to a point in PHP. You can ONLY use available registered transports. Transports cannot be created at the PHP level, only at the C level. But they can be added via PHP extensions; some extensions that have transports include the ssl extension.
  • What is streamable behavior? We’ll get to that in a bit. Protocol: a set of rules used by computers to communicate with each other across a network. Resource: a special variable holding a reference to an external resource. Talk about resources in PHP and about general protocols; get a list from the audience of protocols they can name (yes, http is a protocol). A socket is a special type of stream; pound this into their heads. A socket is an endpoint of communication to which a name can be bound. A socket has a type and one associated process. Sockets were designed to implement the client-server model for interprocess communication. In PHP, a wrapper ties the stream to the transport, so your http wrapper ties your PHP data to the http transport and tells it how to behave when reading and writing data.
  • Sockets are just like "worm holes" in science fiction. When things go into one end, they (should) come out of the other, according to the unix sockets faq. In fact, if you are using Unix, sockets actually are files. Anyway, if sockets are these worm holes and we can use them just like files with our fwrite, fread and friends, we can have fun.
  • Some quick notes. Blocking is like sleeping: it’s synchronous, and nothing is going to happen until it’s done. You can use stream_set_blocking to get around this, which means reads and writes will fail instead of blocking. You have to check the return values from fread and fwrite and, if they’re zero, try again (send the data again). You’ll need internal buffering, and feof has no meaning. This is buggy as hell under Windows, particularly with processes; it works really quite well with sockets and gives mixed results with streams, so your mileage may vary. You can use stream_set_timeout: it defaults to 60 seconds, and after that it sets “timed_out” in the meta data and returns an empty string/zero. Timeouts are really only useful with one socket. Blocking is a PAIN to get working correctly but very useful when doing a lot of things at once. stream_select is also buggy on Windows, especially with processes (the processes stuff with PHP on Windows is … icky). It does timeouts and blocking, basically: it tells you when what you want to do will NOT block. feof does NOT MEAN CONNECTION CLOSED. It means either a read failed and the buffer is empty, OR the buffer is empty and there is no data within the timeout. You’re moving data across the black hole, so do yourself a favor and do it in little chunks. It will make the world a better place. And while stream_get_meta_data has some awesome information, don’t be poking at it; it’s for information purposes only.
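A rough sketch of the "ask stream_select whether a read would block" idea from these notes. It uses stream_socket_pair so that it needs no network; picking the socket domain by operating system is an assumption made here so the sketch has a chance of running on Windows as well, and the 0.2 second and 1 second timeouts are arbitrary.

```php
<?php
// Unix-domain pairs are not available on Windows, so fall back to INET there.
$domain = (DIRECTORY_SEPARATOR === '\\') ? STREAM_PF_INET : STREAM_PF_UNIX;
$pair = stream_socket_pair($domain, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
if ($pair === false) {
    throw new RuntimeException('Could not create a socket pair');
}
list($writer, $reader) = $pair;

// Reads on $reader now return immediately instead of sleeping.
stream_set_blocking($reader, false);

// Nothing has been written yet, so select times out and reports 0 ready streams.
$read = [$reader];
$write = null;
$except = null;
$readyBefore = stream_select($read, $write, $except, 0, 200000);

// Send a small chunk, then ask again: now the read will NOT block.
fwrite($writer, "ping");
$read = [$reader];
$readyAfter = stream_select($read, $write, $except, 1);
$data = fread($reader, 8192);

fclose($writer);
fclose($reader);

echo "before: $readyBefore, after: $readyAfter, data: $data", PHP_EOL;
```

Note that stream_select rewrites the arrays it is given, which is why $read is rebuilt before the second call.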
  • Avoid the old sockets extension unless you really, really know what you’re doing. Most of the things you used to need the sockets extension for, you no longer do. Those last two functions, stream_socket_server and stream_socket_client, make doing a client/server relationship really easy with much less code. It’s sometimes hard to find examples on the stream_socket stuff, since most of the old stuff on the internet still uses the sockets extension. Don’t follow their lead; take the time to read the PHP documentation and use the new APIs.
  • By default sockets are going to assume tcp, since that’s a pretty standard way of doing things. Notice that we have to do things the old fashioned way just for this simple http request: sticking our headers together, making sure stuff gets closed. However, if you can’t use allow_url_fopen, this is a way around it. A dirty, dirty way, but there you have it. Remember, allow_url_fopen only stops “drive-by” hacking.
  • Internet Domain sockets expect a port number in addition to a target address. In the case of fsockopen() this is specified in a second parameter and therefore does not impact the formatting of transport URL. With stream_socket_client() and related functions as with traditional URLs however, the port number is specified as a suffix of the transport URL delimited by a colon. unix:// provides access to a socket stream connection in the Unix domain. udg:// provides an alternate transport to a Unix domain socket using the user datagram protocol. Unix domain sockets, unlike Internet domain sockets, do not expect a port number. In the case of fsockopen() the portno parameter should be set to 0.
  • 11:35 to here, home stretch. 10-minute code exercise and 5-minute show and tell. Figure out IP beforehand and make sure my solution is running. Pick one or the other: the server will be running on my system for testing for client creators; the client can literally be two lines.
  • Wrap up at 11:50, run the rest to the end. Flip to demo code in Komodo.
  • You shouldn’t ever learn about something in PHP without the why as well as the what and how. So while I’ve focused quite a bit on what and how, I’m also trying to focus on why: use cases that make sense.
  • It requires a “transport”, a specific type of PHP stream wrapper that talks to the network. tcp, udp, unix, udg. stream_socket_server and stream_socket_client. Has to be done through a C level extension.
  • This is your last chance to get questions in; here are the socket resource links. The rfc there is for the tcp/ip model. The unix domain sockets page is pretty nice if you want to figure out exactly how the unix sockets work. One thing you do have to remember is that unix sockets are not available and won’t work on Windows.
  • Do some review here with questions for the audience. What are streams? What are sockets? What are filters? Do you have ideas for places you would now use them in your own code? Can you think of some standard libraries that should be built? What frameworks or libraries have good tools for this stuff?
  • So how do you spread the word and help make streams, sockets and filters something to be used more “in the wild”? First of all, the documentation can always use a hand; visit edit.php.net for an online docbook editor to add stuff to the manual. Code samples and blog posts and articles are also a good, good thing. But most of all, get your own applications using it and showcasing the cool opportunities for this kind of stuff. Put in an s3 stream (I dare someone to get it working properly with wordpress, oy), pick any open source project and get it fully stream compatible; the results might surprise you.
  • These are things that I want to have in PHP. The biggest thing is the interfaces for streams: even if they’re only “helpers” in 5.3 and/or 5.4, a future where you can easily see that something is, say, seekable would be great. Also some heavy duty testing would be great, especially for extensions and edge cases. What else would you like to see? I’m kind of curious.
  • Let’s wrap it up and wake up
  • Writing and using php streams and sockets

    1. 1. Definitions: Idea originating in the 1950s; Standard way to get input and output; A source or sink of data. Who uses them: C (stdin, stderr, stdout); C++ iostream; Perl IO; Python io; Java; C#
    2. 2.  We will write code (in pairs) There will be a quiz You get out what you put in – participate! This is really 3 talks in one - Streams Filters Sockets
    3. 3. These are the people you can blame for the PHP streams implementation
    4. 4. Abstracting I/O
    5. 5.  Access input and output generically Can write and read linearly May or may not be seekable Comes in chunks of data
    6. 6.  EVERYTHING include/require _once stream functions file system functions many other extensions
    7. 7. [Diagram labels: ALL IO; Stream Contexts; Stream Filter; Stream Wrapper]
    8. 8.  Stream › Resource that exhibits a flow or succession of data Wrapper › Tells a stream how to handle specific protocols and encodings Context › A set of parameters and options to tell a stream (or filter) how to behave
    9. 9. Scheme: the name of the wrapper to be used (file, http, https, ftp, etc.). Target: depends on the wrapper; filesystem uses a string path name, ssh2 uses a PHP resource. Examples: home/bar/foo.txt; file:///home/bar/foo.txt; http://www.example.com/foo.txt; ftp://user:pass@ftp.example.com/foo.txt; php://filter/read=string.toupper|string.rot13/resource=http://www.example.com
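To make the scheme/target split concrete, here is a small sketch of the php://filter form, where the target is itself another stream. A temporary file stands in for the example.com URL so the sketch runs without network access; the file prefix and contents are arbitrary.

```php
<?php
// A throwaway local file plays the role of the wrapped resource.
$path = tempnam(sys_get_temp_dir(), 'flt');
file_put_contents($path, "hello streams");

// scheme = php, target = filter chain plus the resource to read through it.
$url = 'php://filter/read=string.toupper|string.rot13/resource=' . $path;
$filtered = file_get_contents($url);

unlink($path);

echo $filtered, PHP_EOL;
```

The filters run left to right, so the text is upper-cased first and rot13-encoded second.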
    10. 10.  flock wrapper limitations non-existent pointers (infinite loops can and will happen) error handling
    11. 11.  Parameters Options Modify or enhance a stream stream_context_set_param stream_context_set_option stream_context_create
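A quick sketch of building a context with the functions listed on this slide. The option values (timeout, header, user agent string) are placeholders; nothing is sent over the network here, the context is only created and inspected.

```php
<?php
// Options are grouped by wrapper name; these values are illustrative only.
$context = stream_context_create([
    'http' => [
        'method'  => 'GET',
        'timeout' => 5,
        'header'  => "Accept: application/xml\r\n",
    ],
]);

// Options can also be changed after the context exists.
stream_context_set_option($context, 'http', 'user_agent', 'streams-demo');

// Read everything back to confirm what a stream would see.
$options = stream_context_get_options($context);
print_r($options);
```

The context would then be handed to a stream call, for example as the third argument of file_get_contents() or the fourth argument of fopen().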
    12. 12.  stream_set_read_buffer stream_context_get_params stream_resolve_include_path stream_supports_lock
    13. 13.  open_basedir allow_url_fopen allow_url_include
    14. 14. 1. Get a feed from a webservice (I’m using flickr) - http://www.flickr.com/services/feeds/ 2. No curl installed, allow_url_fopen is on 3. Display however you would like
    15. 15. Grab the feed as xml, pop it through simplexml, loop and display the output to a webpage
    16. 16. Streams available by default
    17. 17.  file:// http:// ftp:// data:// glob://
    18. 18. SSL: https://, ftps://, ssl://, tls://. Phar: phar://. Zlib: compress.zlib://, zlib://. SSH: ssh2.shell://, ssh2.exec://, ssh2.tunnel://, ssh2.sftp://, ssh2.scp://. Bzip: compress.bz2://
    19. 19.  Pipes STDIN, STDOUT, STDERR proc_open popen
    20. 20.  php://stdin php://stdout php://stderr
    21. 21.  php://output php://input php://fd
    22. 22.  php://filter (5.0.0) php://memory (5.1.0) php://temp (5.1.0)
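A short sketch of php://temp, which behaves like php://memory until it grows past a threshold and then spills to a temporary file. The 1024-byte limit and the 4096-byte payload are arbitrary numbers chosen so the spill-over happens.

```php
<?php
// Keep up to 1 KB in memory, then transparently switch to a temp file.
$fp = fopen('php://temp/maxmemory:1024', 'w+');

fwrite($fp, str_repeat('x', 4096)); // larger than the in-memory limit
rewind($fp);

$length = strlen(stream_get_contents($fp));
fclose($fp);

echo $length, PHP_EOL;
```

The calling code never needs to know which backing store was used, which is the whole appeal.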
    23. 23. 1. Talk to a command line program on your system (I’m using hg) 2. Make it do something and read the output into a variable
    24. 24. Talk to hg, get the data, make an array of branches
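One way the pipe exercise might look. Since hg cannot be assumed to be installed, this sketch launches the PHP CLI binary itself as a stand-in command that upper-cases whatever arrives on stdin; the branch names are made up. With a real hg you would swap in its command line and parse its actual output.

```php
<?php
// Stand-in for "hg branches": a child PHP process that echoes stdin in upper case.
$command = escapeshellarg(PHP_BINARY) . ' -r '
    . escapeshellarg('echo strtoupper(stream_get_contents(STDIN));');

$spec = [
    0 => ['pipe', 'r'], // child's stdin  (we write to it)
    1 => ['pipe', 'w'], // child's stdout (we read from it)
    2 => ['pipe', 'w'], // child's stderr
];

$process = proc_open($command, $spec, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Could not start the child process');
}

// Pipes are ordinary streams: fwrite, fclose and stream_get_contents all work.
fwrite($pipes[0], "default\nstable\n");
fclose($pipes[0]); // signals end of input to the child

$output = stream_get_contents($pipes[1]);
fclose($pipes[1]);
fclose($pipes[2]);
proc_close($process);

// Turn the raw output into an array, one entry per line.
$branches = array_values(array_filter(array_map('trim', explode("\n", $output))));
print_r($branches);
```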
    25. 25. Userland Streams
    26. 26.  There are no interfaces Implement as though there were an interface Seekable is optional Flushable is optional Directory support is optional and there’s even more available
    27. 27.  stream_open stream_read stream_write stream_eof stream_close stream_stat
    28. 28. stream_open. Code: fopen; file_get_contents. Information: Return true or false; $this->context will have any context metadata
    29. 29. stream_read. Code: fread; fgets; file_get_contents; etc… Information: Return string data or false; $this->context will have any context metadata
    30. 30. stream_write. Code: fwrite; file_put_contents. Information: get in a string of data to deal with; return how many bytes you wrote
    31. 31. stream_eof. Code: feof; file_get_contents; fread; etc… Information: Return true or false; $this->context will have any context metadata
    32. 32. stream_close. Code: fclose; file_get_contents. Information: Don’t return anything; any cleanup should go here
    33. 33. stream_stat. Code: fstat; file_get_contents; file_put_contents
    34. 34.  fstat calls stream_stat and should always be implemented EVERYTHING ELSE uses url_stat Good idea to do both Return an array of data identical to stat()
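The basic methods from the previous slides can be sketched as a toy wrapper that keeps its data in a static array. The class name MemoryWrapper and the demo:// scheme are invented for this illustration, writes simply append, and url_stat is left out, so functions such as file_exists() will not work against it.

```php
<?php
// Hypothetical wrapper: demo://name reads and writes strings held in memory.
class MemoryWrapper
{
    public $context;              // PHP fills this in with the stream context
    private static $store = [];   // name => contents
    private $key;
    private $pos = 0;

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        $this->key = substr($path, strlen('demo://'));
        if ($mode[0] === 'w') {
            self::$store[$this->key] = '';        // truncate on write
        } elseif (!isset(self::$store[$this->key])) {
            return false;                         // nothing to read
        }
        $this->pos = 0;
        return true;
    }

    public function stream_read($count)
    {
        $chunk = (string) substr(self::$store[$this->key], $this->pos, $count);
        $this->pos += strlen($chunk);
        return $chunk;
    }

    public function stream_write($data)
    {
        self::$store[$this->key] .= $data;        // append-only in this sketch
        return strlen($data);                     // bytes written
    }

    public function stream_eof()
    {
        return $this->pos >= strlen(self::$store[$this->key]);
    }

    public function stream_close()
    {
        // Nothing to release here; real wrappers clean up connections etc.
    }

    public function stream_stat()
    {
        return ['size' => strlen(self::$store[$this->key])];
    }
}

stream_wrapper_register('demo', 'MemoryWrapper');

$written = file_put_contents('demo://greeting', 'hello wrapper');
$readBack = file_get_contents('demo://greeting');

echo $readBack, PHP_EOL;
```

Because there is no interface to implement, PHP just looks for methods with these names on whatever class is registered.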
    35. 35. url_stat. Information: Called for just about everything. Code: fileperms(); fileinode(); filesize(); fileowner(); filegroup(); fileatime(); filemtime(); filectime(); filetype(); is_writable(); is_readable(); is_executable(); is_file(); is_dir(); is_link()
    36. 36.  stream_seek stream_tell
    37. 37. stream_seek. Code: fseek(). Information: isn’t always called, because PHP streams have read buffering on by default (stream_set_read_buffer() in 5.3.3+)
    38. 38. stream_tell. Code: fseek(). Information: This one is ALWAYS called to get the current position, even if stream_seek is NOT
    39. 39.  stream_flush rename unlink stream_lock
    40. 40. stream_flush. Code: fflush(). Information: return true if data was stored, OR there was no data to store; otherwise return false
    41. 41. rename. Code: called on rename. Information: no real caveats, works as described
    42. 42. unlink. Code: called on unlink. Information: works just as described
    43. 43. stream_lock. Code: flock(); file_put_contents() with LOCK_EX; stream_set_blocking(); on stream close
    44. 44.  mkdir rmdir dir_closedir dir_opendir dir_readdir dir_rewinddir
    45. 45. mkdir. Code: mkdir(). Information: watch for the recursive flag in the options bitmask
    46. 46. rmdir. Code: rmdir(). Information: watch for the recursive flag in the options bitmask
    47. 47. dir_opendir. Code: opendir(). Information: option is “enforce safe mode” (which you should just ignore)
    48. 48. dir_closedir. Code: closedir(). Information: all resources locked, allocated, or created should be released
    49. 49. dir_readdir. Code: readdir(). Information: should return a string filename or false; the return value is cast to a string!
    50. 50. dir_rewinddir. Code: rewinddir(). Information: should reset the output generated by readdir; the next call should retrieve the first entry
    51. 51.  stream_set_option() 5.3 stream_cast() 5.3 stream_metadata() 5.4 › chown() › chmod() › chgrp() › touch()
    52. 52. Create a custom wrapper for any kind of network or shared memory storage you want (except s3, which is way overdone). Implement at least the basics; extra points for more features
    53. 53. Wrapper for wincache (btw, porting this to apc would be a breeze). Does basics + seek + flush (since we’re caching) + rename + unlink. No directories
    54. 54. Use Case land – when streams make sense
    55. 55.  Data in s3 Data locally during development Easy switch out if alternative storage is ever desired Storing image files
    56. 56.  Existing Zend Framework Code Register the s3:// wrapper Use a configuration setting for the stream to use for all images on the system
    57. 57.  Store and edit template files in a database Have the snappiness of including from disk Minimal Configuration
    58. 58.  db:// stream simple stream wrapper that looks for the template in the db, and writes it to the filesystem before returning the data The cached location is FIRST in the include path, so if it fails, the db stream gets hit
    59. 59.  Talk to mercurial (hg binary) Talk to git communicates via command line Use pipes to get data out into format for use in system
    60. 60.  Use proc_open to keep a pipe to the binary going Pass commands through stdin pipe as necessary Abstract this out to other binaries that are used by the system
    61. 61.  Test filesystem related functionality in simpletest or phpunit without actually touching the filesystem
    62. 62.  A stream wrapper that mocks all wrappable file system functionality that can be used with any testing system
    63. 63. 1. Name 3 built in PHP streams. 2. What is a Context? A Wrapper? A Stream? 3. How do you identify a stream? 4. Name two extensions that provide additional PHP streams.
    64. 64.  http://php.net/streams http://php.net/filesystem http://ciaranmcnulty.com/blog/2009/04/simplifying-file-operations-using-php-stream-wrappers
    65. 65. Because everyone needs to rot13 a file they open on the fly
    66. 66.  Performs operations on stream data Can be prepended or appended (even on the fly) Can be attached to read or write When a filter is added for read and write, two instances of the filter are created.
    67. 67.  Data has an input and output state When reading in chunks, you may need to cache in between reads to make filters useful Use the right tool for the job
    68. 68.  string filters › string.rot13 › string.toupper › string.tolower › string.strip_tags
    69. 69.  convert filters › convert.*  base64-encode  base64-decode  quoted-printable-encode  quoted-printable-decode dechunk › decode remote HTTP chunked encoding streams consumed › eats data (that’s all it does)
    70. 70.  bzip2.compress and bzip2.decompress convert.iconv.* zlib.inflate and zlib.deflate mcrypt.* and mdecrypt.*
    71. 71.  Given a list of bad words (we’ll use the 7 bad words), write a filter to remove them from text Given a commonly misspelled word, fix the spelling in the text
    72. 72.  I only did the bad words one, and created an extremely naïve regex that stripped the “bad” words from whatever text it finds. It buffers the entirety of the text before doing the replacement – this could be done by looking for word boundaries and doing it piecemeal to improve performance
    73. 73. Manipulate data on the fly
    74. 74.  Extend an internal class php_user_filter It’s not abstract… Yes that’s a horrible name Remember this pre-dates php 5.0 decisions Note the method names are camelcased…
    75. 75. onCreate. Code: basically a constructor. Information: Called every time PHP needs a new filter (on every stream); return true or false
    76. 76.  php_user_filter › $this->filtername › $this->params › $this->stream
    77. 77. onClose. Code: basically a destructor. Information: no return
    78. 78. filter. Information: MUST return PSFS_PASS_ON, PSFS_FEED_ME or PSFS_ERR_FATAL. Code: You get buckets of data and do stuff to them
    79. 79.  $in and $out are “bucket brigades” containing opaque “buckets” of data You can only touch buckets and brigades with the stream_bucket_* functions You get a bucket using stream_bucket_make_writeable
    80. 80.  Given a list of bad words (we’ll use the 7 bad words), write a filter to remove them from text Given a commonly misspelled word, fix the spelling in the text
    81. 81.  I only did the bad words one, and created an extremely naïve regex that stripped the “bad” words from whatever text it finds. It buffers the entirety of the text before doing the replacement – this could be done by looking for word boundaries and doing it piecemeal to improve performance
    82. 82. Use Case land – when filters make sense
    83. 83.  Need to encrypt files uploaded by a user to add an extra security layer Encryption needs to happen before it goes “on the wire” Don’t want to have encryption for all files – don’t need overhead of adapter system
    84. 84.  Use mcrypt’s stream filtering Attach the mcrypt filter to the write stream when uploading data Attach the mcrypt filter to the read stream when downloading the data
    85. 85.  Upload documents to remote storage taking least amount of space possible in transmission
    86. 86.  Use PHP’s compression filters (bz2, zlib) in conjunction with PHP’s compression streams to crush every last bit of space out of the files
    87. 87. 1. What term is used to describe data handling in filters? 2. Name two built in filters you should be using and why. 3. What is the default mode filters are appended with? 4. What is the difference between appending and prepending a filter?
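A sketch of the compression idea using the zlib filters, assuming the zlib extension is loaded. A php://temp stream stands in for the remote storage, and the repeated sample string is just something that compresses well.

```php
<?php
// Assumes ext/zlib is available; php://temp plays the part of remote storage.
$original = str_repeat("compress me ", 200);

$fp = fopen('php://temp', 'w+');

// Compress on the way in.
$deflate = stream_filter_append($fp, 'zlib.deflate', STREAM_FILTER_WRITE, ['level' => 9]);
fwrite($fp, $original);
stream_filter_remove($deflate); // flushes whatever the filter still holds

// Look at the raw stored bytes to see how much space was saved.
rewind($fp);
$compressed = stream_get_contents($fp);

// Decompress on the way out.
rewind($fp);
stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_READ);
$restored = stream_get_contents($fp);
fclose($fp);

printf("%d bytes stored for %d bytes of text\n", strlen($compressed), strlen($original));
```

The same two filter calls would work unchanged on a file, an FTP upload or a custom wrapper, which is the point of doing this at the stream level.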
    88. 88.  http://php.net/streams http://php.net/filters
    89. 89. Putter about the network with me
    90. 90.  Socket › Bidirectional network stream that speaks a protocol Transport › Tells a network stream how to communicate Wrapper › Tells a stream how to handle specific protocols and encodings
    91. 91.  Network Stream, Network Transport, Socket Transport Slightly different behavior from a file stream Bi-directional data
    92. 92.  Sockets block › stream_set_blocking › stream_set_timeout › stream_select feof means “connection_closed”? huge reads or writes (think 8K) stream_get_meta_data is READ ONLY
    93. 93.  New APIs in streams and filesystem functions are replacements Extension is old and not always kept up to date (bit rot) Extension is very low level stream_socket_server stream_socket_client
    94. 94.  tcp udp unix udg SSL extension › ssl › sslv2 › sslv3 › tls
    95. 95.  Given two files of quotes, write a server with PHP streams methods to return random quotes, optionally allow client to pick which file to retrieve quotes from Given a simple tcp server that will return a random quote – make a client that pulls quotes for users
    96. 96.  socket_server.php – uses stream_socket_server and stream_socket_accept socket_client.php – uses stream_socket_client and a bit of loop magic to make a little “cli” app
    97. 97. Use Case land – when sockets make sense
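This is not the socket_server.php or socket_client.php from the talk, just a compressed sketch of the same exercise with both ends in one script so it can run on its own. The quotes are placeholders, and binding to port 0 on 127.0.0.1 (letting the operating system pick a free port) is a convenience chosen here.

```php
<?php
// Placeholder quotes; the exercise reads them from files instead.
$quotes = ['Streams are underused.', 'Filters are too.'];

// Server side: listen on a loopback port chosen by the OS.
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
if ($server === false) {
    throw new RuntimeException("Server failed: $errstr ($errno)");
}
$address = stream_socket_get_name($server, false); // local "ip:port"

// Client side: connect to whatever address the server ended up with.
$client = stream_socket_client('tcp://' . $address, $errno, $errstr, 5);
if ($client === false) {
    throw new RuntimeException("Client failed: $errstr ($errno)");
}

// Server accepts the pending connection and sends one random quote.
$connection = stream_socket_accept($server, 5);
fwrite($connection, $quotes[array_rand($quotes)] . "\n");
fclose($connection);

// Client reads it like any other stream.
$quote = trim(fgets($client));
fclose($client);
fclose($server);

echo $quote, PHP_EOL;
```

In a real server the accept call would sit inside a loop, and the client would run as a separate script pointed at the server's address.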
    98. 98.  Send email using an external smtp server with special requirements for both authentication and encryption
    99. 99.  Simple smtp class, using PHP sockets to communicate with the server and pass the mail on, complete with tls encryption
    100. 100.  Push data from a PHP script into flash player without any additional extensions or tools
    101. 101.  http://ria.dzone.com/articles/php-and-flex-sockets Use sockets to talk directly to the flash player
    102. 102. 1. How is a socket different from a stream? 2. What transports does PHP provide by default? 3. Name the two most commonly used stream socket functions. 4. How can a user add additional transports to PHP?
    103. 103.  http://php.net/streams http://php.net/transports http://wikipedia.org/wiki/Unix_domain_socket http://tools.ietf.org/html/rfc1122
    104. 104. Does your brain feel like mush yet?
    105. 105. Streams: Stream › Resource that exhibits a flow or succession of data; Wrapper › Tells a stream how to handle specific protocols and encodings; Context › A set of parameters and options to tell a stream (or filter) how to behave. Filters: Performs operations on stream data; Can be prepended or appended (even on the fly); Can be attached to read or write
    106. 106. Processes (pipes): pipes are like sockets but only go one way; stdin, stdout, stderr are all pipes; proc_open and popen do pipes; You can use stream functions on pipes, can’t do custom pipes. Sockets: Socket › Bidirectional network stream that speaks a protocol; Transport › Tells a network stream how to communicate; Wrapper › Tells a stream how to handle specific protocols and encodings
    107. 107.  http://edit.php.net Blog posts Articles Cool apps using streams, sockets and filters properly
    108. 108.  Set of interfaces for streams Improved base filter class Wrappers for chmod and touch More tests! Any you have?
    109. 109.  http://emsmith.net http://joind.in/3747 auroraeosrose@gmail.com IRC – freenode – auroraeosrose #php-gtk #coapp and others