So I’ve got this nifty little RSS parser doohickey, Magpie. In the name of lowering the learning curve, and weaning PHP programmers away from their previously available hackish solutions, I tried to make it as simple to use and as “PHP-like” as possible, meaning that fetching the remote feed, parsing it, and caching it have been rolled into one convenient step. Now that HTTP conditional GETs are all the rage, I’m adding them to Magpie. (I’ve had an ugly implementation lying around for a while, but it’s not even worth checking into CVS.)
PHP as Web Client
But how the hell does one do web automation with PHP? I feel like no one has ever taken this problem on before in PHP. Or at least no one on the web is talking about it somewhere I can find it. How does one get at the ETag? Where is LWP or urllib2 for PHP?
You can get at the response headers from fopen() via the array $http_response_header, which is magically instantiated behind the scenes (because PHP does that kind of thing). I wonder, if I stuffed some values into an array named $http_request_header, would it work? (No, it doesn’t.)
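For what it's worth, pulling a single header such as ETag out of that array could look roughly like the sketch below. The helper name find_header() is made up for illustration (it is not part of PHP or Magpie), and the type declarations assume PHP 7.1 or later. The sample array only imitates the shape of $http_response_header, which is a flat list of raw header lines.

```php
<?php
// Hypothetical helper (not a PHP built-in): look up one header value in a
// list of raw header lines shaped like $http_response_header.
function find_header(array $headerLines, string $name): ?string
{
    foreach ($headerLines as $line) {
        // Split "Name: value" on the first colon only; the status line
        // ("HTTP/1.1 200 OK") has no colon and is skipped.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null; // header not present
}

// Hard-coded sample standing in for a real response.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/xml',
    'ETag: "abc123"',
];
$etag = find_header($sample, 'etag'); // header names are case-insensitive

// Live usage (assumes allow_url_fopen is on and the URL is reachable):
// $fh = fopen('http://example.com/feed.rss', 'r');
// $etag = find_header($http_response_header, 'ETag');
```

This only covers reading headers; it does nothing about sending If-None-Match back on the next request, which is the missing half.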
The PHP Cookbook has a tantalizing Chapter 11 entitled “Web Automation”, with rule 11.1 being “Fetching URLs with GET”, and rule 11.4 “Fetching URLs with Headers”, sounding just about perfect. And it’s supposed to come out this November, which could be only 9 days away, but I’m impatient. I’m going to stop by Quantum today, to see if they have one of their looks-like-someone-snuck-it-out-the-back-door-and-photo-copied-it O’Reilly specials.
Rolling Your Own
So, barring deliverance from on high by O’Reilly, it appears that the only way I’m going to get these features right now is to roll my own using fsockopen() and hand-packed headers. Have I mentioned that PHP is sadly deficient in tools?
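A rough sketch of what “fsockopen() and hand-packed headers” might look like for a conditional GET follows. Both function names are invented for this example, the port is assumed to be 80, and the request is sent as HTTP/1.0 so the response does not arrive chunked. It is not what Magpie actually does, and it assumes PHP 7.1 or later for the type declarations.

```php
<?php
// Hypothetical helper: hand-pack the request text for a conditional GET.
function build_conditional_get(
    string $host,
    string $path,
    ?string $etag = null,
    ?string $lastModified = null
): string {
    $lines = [
        "GET $path HTTP/1.0",
        "Host: $host",
    ];
    if ($etag !== null) {
        // Send back the ETag value from the previous response.
        $lines[] = "If-None-Match: $etag";
    }
    if ($lastModified !== null) {
        // Send back the Last-Modified date from the previous response.
        $lines[] = "If-Modified-Since: $lastModified";
    }
    $lines[] = 'Connection: close';
    // Lines are separated by CRLF; an empty line ends the header block.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

// Hypothetical helper: send the request over a raw socket and return the
// raw response (status line, headers, and body), or null on failure.
function conditional_get(
    string $host,
    string $path,
    ?string $etag = null,
    ?string $lastModified = null
): ?string {
    $sock = @fsockopen($host, 80, $errno, $errstr, 10); // 10 second timeout
    if ($sock === false) {
        return null;
    }
    fwrite($sock, build_conditional_get($host, $path, $etag, $lastModified));
    $response = stream_get_contents($sock);
    fclose($sock);
    return $response === false ? null : $response;
}

// Example request text (no network needed to build it).
$request = build_conditional_get('example.com', '/feed.rss', '"abc123"');
```

If the server answers with a 304 Not Modified status line, the cached copy is still good and there is nothing to re-parse; anything else gets treated as a fresh feed.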
update, 10/25: So the word on the street is “just use sockets”; the answer rolls off the mailing lists and newsgroups with the polish and weariness of a frequently asked question. No one suggested it, but I’m also intrigued by Snoopy, the web client class for PHP. I think I’ll start by rolling my own, and loop back to Snoopy when I have time to do benchmarks.
fyi: ended up using Snoopy, very happy with it.