{"id":187,"date":"2008-08-14T11:19:31","date_gmt":"2008-08-14T19:19:31","guid":{"rendered":"http:\/\/www.curlybrace.com\/words\/?p=187"},"modified":"2008-08-14T15:09:14","modified_gmt":"2008-08-14T23:09:14","slug":"craigslist-blocks-yahoo-pipes","status":"publish","type":"post","link":"https:\/\/www.curlybrace.com\/words\/2008\/08\/craigslist-blocks-yahoo-pipes\/","title":{"rendered":"Craigslist Blocks Yahoo Pipes"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2008\/08\/craigslist_brokenheart_yahoopipes.png\" alt=\"Craigslist has no love for Yahoo Pipes\" title=\"Craigslist has no love for Yahoo Pipes\" width=\"500\" height=\"163\" class=\"size-full wp-image-201\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2008\/08\/craigslist_brokenheart_yahoopipes.png 553w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2008\/08\/craigslist_brokenheart_yahoopipes-300x98.png 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>Craigslist is one of the greatest sites in the world, and the entire Bay Area seems to revolve around it.  Sadly, Craigslist&#8217;s search facility is extremely bad, seemingly only capable of searching within a price range and neighborhood.  Craigslist supplies RSS feeds, but this still means I have to sift through a lot of information in order to find what I&#8217;m looking for.<\/p>\n<p><a href=\"http:\/\/pipes.yahoo.com\">Yahoo Pipes<\/a> provides a way to filter and manipulate RSS feeds.  It&#8217;s very visual, and relatively easy to use.  This would be an excellent tool to prune down my Craigslist RSS feeds.<\/p>\n<p>Unfortunately, Craigslist recently began blocking Yahoo Pipes.  Perhaps someone wrote an overly popular pipe that put a tremendous load on Craigslist&#8217;s servers, or perhaps Craigslist thinks it will somehow lose income by allowing Pipes.  
Either way, it sucks.<\/p>\n<p>The workaround I&#8217;ve employed is to mirror the base Craigslist search on my own server, then feed the Yahoo Pipe from that.<\/p>\n<p>This requires you to have a server which:<\/p>\n<ol>\n<li>Is accessible over HTTP.<\/li>\n<li>Provides <a href=\"http:\/\/en.wikipedia.org\/wiki\/Vixie_cron\">cron<\/a>, or some other method of running a script at regular intervals.<\/li>\n<li>Has <a href=\"http:\/\/curl.haxx.se\/\">curl<\/a>, <a href=\"http:\/\/www.gnu.org\/software\/wget\/\">wget<\/a>, or another HTTP-content-fetching utility.<\/li>\n<\/ol>\n<h3>Mirroring the RSS Feed<\/h3>\n<p>First, create an appropriate directory structure.  For example:<\/p>\n<blockquote>\n<pre>mkdir -p ~\/public_html\/feeds<\/pre>\n<\/blockquote>\n<p>Next, test out <tt>curl<\/tt> or a similar content-fetching application on a Craigslist RSS feed URL.  Don&#8217;t forget to put quotes around the URL: search URLs usually contain <tt>&amp;<\/tt>, which an unquoted shell command line treats as the background operator, cutting the URL short:<\/p>\n<blockquote>\n<pre>curl \"http:\/\/feedUrl\" --output ~\/public_html\/feeds\/yourFile.xml<\/pre>\n<\/blockquote>\n<p>Examine the content of the file and make sure that it&#8217;s the expected XML.  If the file is very small and contains text to the effect of &#8220;this URL has moved&#8221;, then you may have forgotten to surround the URL with double quotes.<\/p>\n<h3>Creating the Yahoo Pipe<\/h3>\n<p><img decoding=\"async\" src=\"http:\/\/farm4.static.flickr.com\/3287\/2762771299_13a5bdda67_m.jpg\" align=\"right\" style=\"margin-left: 15px\" \/><br \/>\nTo fetch this mirrored RSS feed, use the &#8220;Fetch Data&#8221; source and give it the URL of your freshly fetched file.<\/p>\n<p>If the pipe can&#8217;t read the feed, verify the permissions on the containing directory hierarchy on your server.  On *nix boxes, make sure the execute bit is set on the directory (<tt>chmod a+x ~\/public_html\/feeds<\/tt>).<br \/>\n<br style=\"clear:both;\" \/><\/p>\n<h3>Automating Updates<\/h3>\n<p>Create a script that retrieves every feed you wish to mirror.  
I keep my scripts in <tt>~\/bin<\/tt>, so I placed the following into <tt>~\/bin\/fetch-feeds<\/tt>:<\/p>\n<blockquote>\n<pre>#!\/bin\/bash\n\n# Delete the old mirror first so a failed fetch leaves no stale file behind.\nrm -f ~\/public_html\/feeds\/yourFile.xml\ncurl \"http:\/\/feedUrl\" --output ~\/public_html\/feeds\/yourFile.xml<\/pre>\n<\/blockquote>\n<p>Note that I delete the existing feed mirror before fetching the new one so that any retrieval error will be obvious.  Make the script executable with <tt>chmod +x ~\/bin\/fetch-feeds<\/tt>.<\/p>\n<p>Now, call this script from your <tt>crontab<\/tt> (or Scheduled Tasks on Windows servers):<\/p>\n<blockquote>\n<pre>crontab -e<\/pre>\n<\/blockquote>\n<p>I update my mirror at 7am and 2pm with the following:<\/p>\n<blockquote>\n<pre># Fetch Craigslist feeds at 7am and 2pm:\n0 7,14 * * * ~\/bin\/fetch-feeds<\/pre>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Craigslist is one of the greatest sites in the world, and the entire Bay Area seems to revolve around it. Sadly, Craigslist&#8217;s search facility is extremely bad, seemingly only capable of searching within a price range and neighborhood. 
Craigslist supplies &hellip; <a href=\"https:\/\/www.curlybrace.com\/words\/2008\/08\/craigslist-blocks-yahoo-pipes\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,15],"tags":[],"class_list":["post-187","post","type-post","status-publish","format-standard","hentry","category-internet","category-technology"],"_links":{"self":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/187","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/comments?post=187"}],"version-history":[{"count":14,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/187\/revisions"}],"predecessor-version":[{"id":203,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/187\/revisions\/203"}],"wp:attachment":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/media?parent=187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/categories?post=187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/tags?post=187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}