So I’ve got this nifty little RSS parser doohickey, Magpie. In the name of lowering the learning curve, and weaning PHP programmers away from their previously available hackish solutions, I tried to make it as simple to use, and as “PHP-like”, as possible. That means fetching the remote feed, parsing it, and caching it have been rolled into one convenient step. Now that HTTP conditional GETs are all the rage, I’m adding them to Magpie. (I’ve had an ugly implementation lying around for a while, but it’s not even worth checking into CVS.)

PHP as Web Client

But how the hell does one do web automation with PHP? I feel like no one has ever taken this problem on before in PHP. Or at least no one on the web is talking about it somewhere I can find it. How does one get at If-Modified-Since, Last-Modified, and Etag? Where is LWP or urllib2 for PHP?
You can get at the response headers from fopen() via the array $http_response_header, which is magically instantiated behind the scenes (because PHP does that kind of thing). I wondered whether stuffing some values into an array named $http_request_header would work. (No, it doesn’t.)
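For what it’s worth, here is a rough sketch of digging Last-Modified and ETag out of that array. The function name parse_response_headers() is made up for illustration (it is not part of PHP or Magpie), and the sketch assumes PHP 7 or later and that the array holds one raw header line per element, status line included.

```php
<?php
// Hypothetical helper (not a PHP built-in): turn raw header lines, in the
// shape PHP puts into $http_response_header, into a status code plus a
// lowercased name => value map.
function parse_response_headers(array $lines): array
{
    $parsed = ['status' => null, 'headers' => []];
    foreach ($lines as $line) {
        // Status line, e.g. "HTTP/1.1 200 OK". If redirects were followed
        // there may be several; start over so the last response wins.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $parsed['status'] = (int) $m[1];
            $parsed['headers'] = [];
            continue;
        }
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // not a "Name: value" line, skip it
        }
        // Header names are case-insensitive, so normalize to lowercase.
        $name = strtolower(trim(substr($line, 0, $pos)));
        $parsed['headers'][$name] = trim(substr($line, $pos + 1));
    }
    return $parsed;
}
```

After a successful fopen() on an http:// URL, something like parse_response_headers($http_response_header) should give you ['headers']['last-modified'] and ['headers']['etag'] to stash in the cache next to the feed.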

PHP Cookbook

The PHP Cookbook has a tantalizing Chapter 11 entitled “Web Automation”, with rule 11.1 being “Fetching URLs with GET”, and rule 11.4 “Fetching URLs with Headers”, sounding just about perfect. And it’s supposed to come out this November, which could be only 9 days away, but I’m impatient. I’m going to stop by Quantum today, to see if they have one of their looks-like-someone-snuck-it-out-the-back-door-and-photo-copied-it O’Reilly specials.

Rolling Your Own

So barring the deliverance from on high by O’Reilly it appears that the only way I’m going to get these features right now is to roll my own using fsockopen() and hand-packed headers. Have I mentioned that PHP is sadly deficient in tools?
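A minimal sketch of what the fsockopen() route might look like, assuming PHP 7.1 or later. Both function names, build_conditional_get() and fetch_raw(), are invented for this example. It speaks HTTP/1.0 on purpose so the server won’t answer with chunked encoding, and it ignores redirects, HTTPS, and anything resembling robust error handling.

```php
<?php
// Hypothetical helper: hand-pack a conditional GET. $lastModified and $etag
// are the values saved from the previous response's Last-Modified and ETag
// headers; either may be null if the server never sent one.
function build_conditional_get(
    string $host,
    string $path,
    ?string $lastModified = null,
    ?string $etag = null
): string {
    $lines = [
        "GET $path HTTP/1.0",
        "Host: $host",
        "Connection: close",
    ];
    if ($lastModified !== null) {
        $lines[] = "If-Modified-Since: $lastModified";
    }
    if ($etag !== null) {
        $lines[] = "If-None-Match: $etag";
    }
    // Header lines end in CRLF; a blank line ends the request.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

// Hypothetical helper: send a prepared request over a raw socket and
// return everything the server sends back (headers and body together).
function fetch_raw(string $host, string $request, int $port = 80, float $timeout = 10.0): string
{
    $fp = fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        throw new RuntimeException("connect failed: $errstr ($errno)");
    }
    fwrite($fp, $request);
    $response = stream_get_contents($fp);
    fclose($fp);
    return $response === false ? '' : $response;
}
```

If the first line of what comes back says 304, the cached copy is still good and there is nothing to parse. Otherwise, save the new Last-Modified and ETag values along with the feed for next time.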

update, 10/25: So the word on the street is “just use sockets”. The answer rolls off the mailing lists and newsgroups with the polish and weariness of a frequently asked question. No one suggested it, but I’m also intrigued by Snoopy, the web client class for PHP. I think I’ll start by rolling my own, and loop back to Snoopy when I have time to do benchmarks.

fyi: ended up using Snoopy, very happy with it.

The Impenetrable Importance of Culture

For me the hardest part of working with languages I’m less familiar with (Python and PHP, for example) rather than those I’m more comfortable with (Perl or Java) is not syntax, it’s culture. For all of Perl’s much-vaunted “There is More Than One Way To Do It”, I know the proper way to do things and the proper tool to reach for, and if I don’t, I have ways of finding out, largely through internal calculation based on my understanding of the Perl reputation landscape. It is that information which is opaque to me in other languages, especially in PHP, where the vast number of practitioners are novices.