MasterNewMedia
Curated by: Luigi Canali De Rossi
 


Thursday, March 9, 2006

How To Create An RSS Feed From Any Web Page

Sooner or later, and maybe without even knowing the technical terms required to communicate this to someone else, you will want to subscribe to and monitor web sites, information pages, or online catalog sections on an ongoing basis.


You have heard about RSS, webfeeds, Atom and other not-too-clear tech terms describing something that sounded like what you really need now. But even with the best will in the world you wouldn't know how or where to start, given that the pages you have identified do not sport any orange button or icon hinting at a proper RSS feed.

Can you generate an RSS feed for a web page that doesn't have one?

Can anyone do this on her own?

The answer to both is a resounding YES!

Today, thanks to new "HTML scraping" services available to everyone, RSS feeds can be automatically generated for just about any web site, no matter what kind of layout, coding or language it is written in. In some situations, creating a standard RSS feed from a web page that does not have one may take less than a minute, while in other cases, where your needs for customization are higher, you may need to spend a little more time.

Moral of the story: any web page today can be made to generate an RSS feed automatically, either by its owner or, as will increasingly happen, by someone else who wants to be informed in near-real-time of any news and content updates made on it.


HTML scraping, the ability to automatically generate a standard RSS feed from an HTML document (a web page) that does not have one, is a type of service that has been in increasing demand for over two years now.

Early services (e.g. MyRSS) that offered HTML scraping later disappeared or were replaced by other, more profitable ones. Creating an automatic RSS feed from a non-RSS-enabled web page enables a number of truly useful applications, and I am sure that such services will soon enjoy greater marketplace rewards.



FeedYes
FeedYes is the latest entry in this small group of online services that allow anyone to automatically generate an RSS feed for any web page. FeedYes has found a simple and truly effective route to simplify this task while providing a solution good enough to satisfy most needs.

While it is not perfect, it is damn good and fast at doing what it does. It is also rather simple to use, and once you have gone through it once, creating a second feed for another site may take literally only a few seconds.

FeedYes is a three-step process that involves:
a) Providing the URL of the page from which an automatic RSS feed needs to be created.
b) Indicating, among the dynamic links found by FeedYes on the specified URL, which one is the first that refers to the content section you are interested in. (Most web pages have several content sections on the same page, and you probably do not want to create a feed for the comments section or for the most recent articles appearing on the same site.)
c) Indicating, in the updated list of links that FeedYes returns, the last relevant link pertaining to your selected content section.

In this way, FeedYes isolates with good precision (you are the one effectively guiding) the specific content section you are interested in (say the Latest News) and creates an RSS feed for it.
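For readers who like to see the idea in code, here is a minimal Python sketch of the "first link / last link" selection described above. It is not FeedYes's actual implementation, which is not public; the names `LinkCollector`, `links_between` and `to_rss` are hypothetical, and the sketch simply assumes that the content section you want is the run of links between the two you picked.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element, in page order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_between(html, first_href, last_href):
    """Return the links from first_href to last_href, inclusive.

    Raises ValueError if either link is not found on the page.
    """
    parser = LinkCollector()
    parser.feed(html)
    hrefs = [href for href, _ in parser.links]
    start = hrefs.index(first_href)
    end = hrefs.index(last_href, start)
    return parser.links[start:end + 1]


def to_rss(title, page_url, links):
    """Build a bare-bones RSS 2.0 document with one <item> per link."""
    items = "".join(
        f"<item><title>{escape(text)}</title><link>{escape(href)}</link></item>"
        for href, text in links
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<rss version="2.0"><channel><title>{escape(title)}</title>'
        f"<link>{escape(page_url)}</link>"
        f"<description>{escape(title)}</description>{items}</channel></rss>"
    )
```

Calling `links_between(page_html, first, last)` and passing the result to `to_rss(...)` yields a feed containing only the section you marked out. A real service would also need to fetch the page on a schedule and resolve relative links, which this sketch leaves out.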



Feed43
Feed43 is an online service that converts standard web pages or XML documents to RSS feeds. Feed43 does so by extracting snippets of text or HTML by applying specific search patterns to the document from which the feed needs to be extracted. The search patterns help Feed43 understand exactly which content to grab from a page and which not.

This allows for much more precise control over what a feed will contain, at the expense of the ease of use and accessibility of the product itself. For technically savvy users this is in fact an excellent and very reliable approach to RSS feed generation, but Feed43 may scare off non-technical users in a matter of minutes.

In Feed43 the steps required to create a custom RSS feed for a web page that has none are as follows:
a) Identify the web page from which to generate an RSS feed.
b) Create an RSS feed on Feed43 pointing to that web page.
c) Define the required search patterns.
d) Specify the required output templates.
e) Generate the new RSS feed.
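To give a feel for what "search patterns" and "output templates" mean in practice, here is a rough Python sketch of the same two ideas. Regular expressions stand in for Feed43's own pattern syntax, which is not reproduced here, and the function names `extract_items` and `render_items`, as well as the field names used in the example, are hypothetical.

```python
import re
from xml.sax.saxutils import escape


def extract_items(html, pattern):
    """Apply a search pattern to the page and return one dict per match.

    The pattern is an ordinary regular expression whose named groups
    (e.g. (?P<title>...)) mark the snippets to grab.
    """
    return [m.groupdict() for m in re.finditer(pattern, html, re.DOTALL)]


def render_items(records, title_tpl, link_tpl):
    """Fill the output templates with each record and emit RSS <item> elements.

    Templates are str.format strings that refer to the pattern's group names.
    """
    out = []
    for record in records:
        title = escape(title_tpl.format(**record))
        link = escape(link_tpl.format(**record))
        out.append(f"<item><title>{title}</title><link>{link}</link></item>")
    return "".join(out)
```

For a page whose news list looks like `<li><a href="/a1">First</a> <span>Mar 9</span></li>`, a pattern such as `<li><a href="(?P<url>[^"]+)">(?P<title>.*?)</a>\s*<span>(?P<date>.*?)</span></li>` combined with the templates `"{title} ({date})"` and `"http://example.com{url}"` would turn each list entry into one feed item. The trade-off is the one noted above: the pattern is precise, but it has to be rewritten whenever the page layout changes.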

All feeds created with Feed43 are "public", but optionally Feed43 also allows you to protect any newly created RSS feed with a password. The service is free.



FeedFire
FeedFire is the oldest of these HTML-to-RSS services, allowing anyone to automatically create an RSS news feed for any web site that does not have one.

You simply register at FeedFire, input the URL of the page, and FeedFire does the rest for you in a fraction of a second. All that's needed is the FULL URL of the page you would like to have made into RSS. All bandwidth costs to host the new RSS feeds are absorbed by FeedFire.

FeedFire also allows you to sponsor newly created RSS feeds. This can be done by anyone like you and me: not major corporations, but people who are looking for a clever, considered and comprehensively featured service that allows them to add extra reach, exposure, visibility and unique content to others and/or to THEIR own web site.

RSS feeds created and sponsored with FeedFire can also be made private, and used for creating intelligence reports or RSS learning objects or RSS newsmastering channels containing information otherwise inaccessible to others.

Sponsored feeds can be further filtered by allowing the sponsor to select only news items that include, or do not include, specific keywords. It is also possible to customize the number of news items displayed in the sponsored feed, the number of words per news item and even the title and the description of the newly created RSS feed. The varying levels of sponsorship have increasingly higher levels of features and customisation.
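The filtering options described above are easy to picture in code. The following Python sketch is only an illustration of keyword filtering, item limits and word limits applied to a list of feed items; it does not reflect how FeedFire works internally, and the function name `filter_items` and its parameters are hypothetical.

```python
def filter_items(items, include=(), exclude=(), max_items=None, max_words=None):
    """Filter and trim feed items.

    items: list of dicts with 'title' and 'description' keys.
    include: keep only items containing at least one of these keywords.
    exclude: drop items containing any of these keywords.
    max_items: cap on the number of items returned.
    max_words: cap on the number of words kept in each description.
    Matching is a case-insensitive substring test, so "rss" also matches "RSSPECT".
    """
    include = [k.lower() for k in include]
    exclude = [k.lower() for k in exclude]
    kept = []
    for item in items:
        if max_items is not None and len(kept) >= max_items:
            break
        text = f"{item['title']} {item['description']}".lower()
        if include and not any(k in text for k in include):
            continue
        if any(k in text for k in exclude):
            continue
        description = item["description"]
        if max_words is not None:
            description = " ".join(description.split()[:max_words])
        kept.append({**item, "description": description})
    return kept
```

For example, `filter_items(items, include=["rss"], exclude=["spam"], max_items=10, max_words=40)` would keep at most ten RSS-related items that do not mention spam, each with a description trimmed to forty words.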





Readers' Comments    
2010-07-09 09:04:04

Bertram

Thanks for the review, I have found what I was looking for.



2010-02-25 12:34:15

peter

Stay away from feedyes - they are crooks. I upgraded my account via paypal and they have not upgraded it via their site. They have not responded to 5 emails via their own procedures. Use them at your peril.



2009-11-25 10:33:12

JenniC

Nicely maintained blog.

I use biterscripting for extracting info from web page (our own web pages). I further use it to create structured information from it ( RSS in this case ).

You can do a google search on biterscripting. The tool itself is free at biterscripting.com or any download site.

I am not very good in coding, but I will give your approach a try soon.



2009-10-27 15:41:13

Tom

Another good online RSS feed creation service is rssa.at. It hosts the feed for you and provides the website owner with various widgets which allow visitors to sign up to receive email alerts when new content is published.



2009-09-17 12:23:03

may cong cu

Thanks for the review. I am checking them all now.



2009-09-08 11:05:53

James

I think the RSS feed tool (feedyes) is great and now have it working, linking html content from my website to my LinkedIn group.

One question though - all the titles of the links that are produced by the RSS feed have the word "enraquo" following it. How can I get rid of that?

Thanks

James



2009-08-29 20:41:12

Stella Lee

Since this article was written, I've been using a new service, http://www.feedbeater.com to do this. It's more robust in that it checks for differences instead of relying on scraper patterns that break.



2009-04-16 10:32:17

Shahriar Hyder

Here is a post regarding techniques for 'Scraping your way to RSS feeds' albeit in a non-programmatic (layman) way:

http://technosiastic.wordpress.com/2009/04/08/scraping-your-way-to-rss-feeds



2008-12-05 11:30:32

Robin Good

The tool I use to do what you have just described is MySyndicaat.com .



2008-12-05 11:06:13

Andrew

I am looking for something more robust... but still free or cheap. I have a site I scrape with feed43.com that pulls a list of articles and the links to those articles. I want to be able to add content to each RSS item, filled with (part of) the article from the resulting link's page. Is there a web service that has this capability?



2008-10-16 22:34:36

roodie

I've noticed there are both HTML scraper services like page2rss and services that let you manually create RSS feeds, like www.webrss.com. Which is best for the long term? Will the scrapers always work?



2008-07-25 17:28:44

Andreas Beer

So, 1. yesfeed:

-won't work with my pages
-I can't delete my account
-only 14 days for free

2. feed43
-website down

3. feedfire
-slow website
-no clear statements whatsoever on free or not
-asks for phone number on registration, i didn't bother...



2008-07-15 10:56:51

Claus

thanks for the nice article,it helped me a lot



2008-07-14 22:55:43

web man

I would like to suggest a great new site that organizes your RSS feeds.
It employs a bayesian filter for RSS feeds where you can train the filter what you like and
what you don't like. It's free, try it at http://www.filteredrss.com.



2007-10-17 01:21:15

rss feeds

Great article - appreciate you spreading the knowledge!

Another service worth checking out is Runstream.

It's not a scraping service like some of the others listed; it offers an easy editor and a full analytics package for tracking feed usage.



2007-10-01 18:28:07

RRSSSSSgen

You can always try to rely on an external service, but i've found the best thing to be a server side PHP script that creates the RSS feed on your OWN server! check out http://www.xmlhub.com/rssgenr8.php they have good instructs etc.. been using this for over 4 yrs. and still working like a charm



2007-06-20 12:58:37

tenfourzero

www.IrisFeed.com creates RSS AND Atom feeds for any website!

It's so much easier to use than the other services AND it's free AND you don't need to register AND there are no banners ... did I mention it's awesome?!



2007-06-11 04:51:26

hoodooville

Robin, thanks again for all of the goodies you publish!.
Be Smart! Be Independent! Be Good! Thanks for walking your talk.
P.S. Anyone know if one feed is better for photos than the other?



2007-02-10 04:31:42

Drew2020

Feedity (www.feedity.com) works like a charm for me in just two clicks!



2006-05-08 18:15:05

Ryan North

You might also want to check out RSSPECT.com, which generates RSS feeds for any document (HTML, but also PDF or MP3 or anything else) online. It simply checks to see if the document has been updated, and if so, updates the feed. The feeds themselves are less detailed (you have to visit the site you've targeted to see what's changed) but the setup process is easy and it doesn't rely on text parsing, which can be brittle.

(Disclaimer: I wrote the site!)



2006-03-13 17:03:21

webcam

Hi Robin,

I have trouble mixing XML with PHP.
Where can I find info about this?



2006-03-13 13:38:23

Peter Bates

Hi Robin,

Just getting to grips with creating RSS feeds. But there is still one thing I don't fully understand. When using something like "Yesfeed", I create the feed and add the RSS script to my page with the RSS button. Is that all I have to do? Will the RSS feed get automatically updated every time I add new content to that page?

Peter



2006-03-11 18:39:55

Carmen Holotescu

Hi,

A very good presentation.

You can use also MySyndicaat - http://www.mysyndicaat.com to create RSS feeds.

My best,
Carmen



2006-03-10 08:25:12

John Tropea

Wow Robin, I just did a watered down version of this exact post:
http://libraryclips.blogsome.com/2006/03/10/feedyes-to-scraping/



 
posted by Robin Good on Thursday, March 9 2006, updated on Friday, February 26 2010

