Simple RSS XML Grabber in Perl
Recently I decided to jump on the Web 2.0 bandwagon and add RSS
XML news headline feeds to some of my more popular websites. As
usual, I could not find a simple script that would let me include
the basic text in my pages. So, as usual, I wrote one.
I figure it will be a handy little script for anyone looking to
include a text-based XML feed from another website. I have kept
it simple, so it will probably need some tweaking to work with
many feeds, but it should be a simple install and fine for most
people.
The script requires LWP::UserAgent, which is part of the LWP
package and should already be on most servers. I did not use any
other modules to parse the XML, although it would probably work
better if I had. I know very little about XML, but using a module
that calls 3 more modules that each call another module to parse
a simple text file just seemed like a pointless waste of CPU
time.
All I wanted was the article title, the link to the full article
and the description of the story. It just doesn't seem that
complicated and I did not want to make it more than it was: a
simple fetch and print using LWP.
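The fetch-and-print idea described above can be sketched roughly as follows. This is not the code from rss.cgi; it is a minimal illustration that assumes a plain RSS feed whose stories sit in <item> blocks, and the sub names (fetch_feed, tag_text, parse_items, render_items) are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fetch the raw XML. LWP::UserAgent is loaded only when a fetch is
# requested, so the parsing subs below can be tried offline.
sub fetch_feed {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);
    return $response->is_success ? $response->decoded_content : '';
}

# Return the text of one tag inside an <item> block, unwrapping a
# CDATA section if present. Returns '' when the tag is missing.
sub tag_text {
    my ($block, $tag) = @_;
    my ($text) = $block =~ m{<$tag\b[^>]*>(.*?)</$tag>}is;
    return '' unless defined $text;
    $text =~ s{^\s*<!\[CDATA\[(.*?)\]\]>\s*\z}{$1}s;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

# Return one hash reference per <item>, holding title, link and
# description. Anything outside <item> blocks is ignored.
sub parse_items {
    my ($xml) = @_;
    my @items;
    while ($xml =~ m{<item\b[^>]*>(.*?)</item>}gis) {
        my $block = $1;
        push @items, {
            title       => tag_text($block, 'title'),
            link        => tag_text($block, 'link'),
            description => tag_text($block, 'description'),
        };
    }
    return @items;
}

# Turn the items into simple HTML. The feed text is printed as-is,
# so this assumes you trust the feed you are including.
sub render_items {
    my @items = @_;
    my $html  = '';
    for my $item (@items) {
        $html .= qq{<p><a href="$item->{link}">$item->{title}</a><br>\n}
               . qq{$item->{description}</p>\n};
    }
    return $html;
}

# Typical use inside a CGI script:
#   print "Content-type: text/html\n\n";
#   print render_items(parse_items(fetch_feed($url)));
```

Regular expressions like these only cope with straightforward feeds, which matches the trade-off described above: less overhead than a full XML parser, at the cost of some tweaking for unusual feeds.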
There is only one variable to configure: the actual URL of the
XML file you want to access. Or use the second version of the
script, which lets you define the XML file dynamically from
multiple pages.
Download the Perl script rss.cgi
See the Sample Web Page Output. We have installed a sample XML
file in this directory and the script is running to access the
file. You can compare the working script to the script you
install on your own website. They should look the same if
everything is working. You won't want to access our feed, since
it will never update. But at least this way we can provide a
fully configured working script.
To install the script, save the file rss.txt and rename it to
rss.pl or rss.cgi so it can run on your website.
Make sure the first line of the program points to your Perl
interpreter.
#!/usr/bin/perl is the default on most servers.
Upload the file in ASCII mode.
Set its permissions with chmod 0755.
Then just access the script using your web browser.
Once you have it working,
replace the test url with the actual url of your desired rss feed
and you can start displaying updated headlines on your own
website.
RSS Fetch and Print - Version 2
A second version of the script is set up to use a query string
to define the URL of the XML file. The advantage of this script
over the simpler version is that one script can do the work for
all the XML files you want to access. You could actually have
hundreds of pages of content, all dynamic, using one simple
script.
To call the script, use any shtml page with the include virtual
tag. You could do the same thing using JSP, ASP or PHP, but since
I only work in Perl, I have no idea how.
<!--#include virtual="rss.cgi?http://bumblebeeware.com/rssxml/test.xml" -->
Just replace the test URL http://bumblebeeware.com/rssxml/test.xml
with the actual URL you want to include in your web page. And
that is it. Build 100 pages, each with the URL of a different XML
file, and you instantly have a huge website with dynamic,
up-to-date content.
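On the script side, everything after the ? in the include tag arrives in the QUERY_STRING environment variable. The following is a rough sketch of how that value could be read and checked before fetching anything. It is not taken from rss2.cgi; the sub name feed_url_from_query and the exact pattern used to accept a URL are assumptions made for this example.

```perl
use strict;
use warnings;

# Return the feed URL carried in the query string, or nothing if the
# value is empty or does not look like a plain http/https URL.
sub feed_url_from_query {
    my ($query) = @_;
    return unless defined $query && length $query;

    # Undo any %XX escaping the server or browser may have applied.
    (my $url = $query) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    # Accept only http:// or https:// followed by a host name, an
    # optional port and an optional path without spaces or quotes.
    return unless $url =~ m{^https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[^\s<>"']*)?\z};
    return $url;
}

# Typical use inside a CGI script:
#   my $url = feed_url_from_query($ENV{'QUERY_STRING'});
#   exit unless defined $url;
```

Rejecting anything that is not an http or https address keeps the script from being pointed at local files or other schemes if your HTTP library supports them.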
Download the Perl script rss2.cgi
I have added a block in the script to prevent it from being
accessed directly using a query string. That stops the script
from being used as a proxy to access outside pages. This is
important to prevent abuse of the script and your website.
You can tighten security further by requiring that the accessing
page is part of your domain. But I am trying to keep the code as
simple as possible so you can expand on it as needed.
You can do that by replacing the lines
if ($ENV{'REQUEST_URI'} =~ /$ENV{'SCRIPT_NAME'}/){
print "Location: http://$ENV{'HTTP_HOST'}\n\n";
exit;}
with something containing the exact URL of the script
$scripturl = "yourdomain.com/rss2.cgi"; # the url of the script on your server
if ("$ENV{'HTTP_HOST'}$ENV{'REQUEST_URI'}" =~ /\Q$scripturl\E/){
print "Location: http://$ENV{'HTTP_HOST'}\n\n";
exit;}
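The same check can be written as a small sub that is easy to try outside a web server. This is only a sketch: the sub name is_direct_request is made up for this example, and it assumes, as the check above does, that for an included script REQUEST_URI holds the address of the including page rather than the script itself. The \Q...\E escapes make the dots in the script URL match literally.

```perl
use strict;
use warnings;

# Return 1 when host + request URI contains the script's own URL,
# meaning the script was requested directly; otherwise return 0.
sub is_direct_request {
    my ($host, $request_uri, $script_url) = @_;
    $host        = '' unless defined $host;
    $request_uri = '' unless defined $request_uri;
    return "$host$request_uri" =~ /\Q$script_url\E/ ? 1 : 0;
}

# Typical use inside a CGI script:
#   if (is_direct_request($ENV{'HTTP_HOST'}, $ENV{'REQUEST_URI'},
#                         "yourdomain.com/rss2.cgi")) {
#       print "Location: http://$ENV{'HTTP_HOST'}\n\n";
#       exit;
#   }
```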
There is plenty of room to customize this script and make it
better. But it is a quick, simple solution, and I have seen
nothing as simple in Perl.