Wednesday, 5 September 2007

Republish RSS feeds

There is much debate about whether or not you are allowed to republish the contents of an RSS feed on your site. I revisit this debate every now and again to see if there have been any developments.

In light of Microsoft changing their licensing rules for RSS search results, I thought it would be a good opportunity to revisit the age-old issue of RSS fair usage.

Current state of affairs

If you take a brief look around the internet, it won't be long before you find a site that is displaying the contents of another site's RSS feed. A common reason is that webmasters republish RSS on their sites to give Google the impression that the site is constantly being updated.

Webmasters will often take popular mainstream news feeds from sites such as Yahoo, then parse them into their own website using tools such as Magpie so that the Google bots have something to crawl.

Dig still deeper and you may find sites that scrape RSS feeds and republish them verbatim. Again, this is usually for the financial gain of the webmaster. The webmaster of the scraper site has a source of fresh content that auto-updates, is readable by Google and is usually monetized using AdSense.

The dark side of RSS

Republishing RSS for financial gain in this way sits on the more dubious side of RSS usage. RSS is all about syndication, both for personal consumption and, to a lesser extent, for commercial republishing, so where is the line to be drawn?

Fair usage of RSS

I think we can all agree that consuming an RSS feed in your RSS reader is fine; that is why publishers publish RSS. What is less clear is whether or not someone can take an RSS feed and publish it on their own site.

Why would someone want to republish an RSS feed on their website? The reason is usually financial. Is it right? Everyone is going to have their own answer to that question. In my opinion, as long as clear spiderable links are used, giving clear attribution, then it seems fine. I am sure most webmasters would appreciate the extra traffic, although I would also recommend that you contact the feed's publisher as a courtesy to let them know what you are doing. At least then they can ask you to stop if they feel it is infringing on their copyright.
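As a rough illustration of what "clear spiderable links with attribution" could look like, here is a minimal Python sketch. It assumes a simple RSS 2.0 feed; the sample feed, the example.com URLs and the function name render_with_attribution are made up for the example and are not taken from any real site or tool.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample feed, included only so the sketch runs offline.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>First story &amp; more</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def render_with_attribution(feed_xml):
    """Return an HTML fragment that lists feed items and credits the source."""
    channel = ET.fromstring(feed_xml).find("channel")
    source_title = channel.findtext("title", default="")
    source_link = channel.findtext("link", default="")
    lines = ["<ul>"]
    for item in channel.findall("item"):
        title = html.escape(item.findtext("title", default=""))
        link = html.escape(item.findtext("link", default=""), quote=True)
        # A plain <a href> (no script, no redirect) is a link a crawler can follow.
        lines.append('<li><a href="%s">%s</a></li>' % (link, title))
    lines.append("</ul>")
    # Credit the publisher with a direct link back to the original site.
    lines.append('<p>Source: <a href="%s">%s</a></p>' % (
        html.escape(source_link, quote=True), html.escape(source_title)))
    return "\n".join(lines)


print(render_with_attribution(SAMPLE_FEED))
```

The sketch publishes headlines and links only, not the full text, which is one conservative reading of fair usage; whether that is enough is still up to the feed's publisher.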

Protecting your RSS feed

It looks like more and more bloggers are worrying about content theft. As RSS becomes more mainstream and a valuable metric, RSS news feeds are becoming a more visible target for lazy webmasters.

Just browsing through my feed reader, I have found several feeds with copyright notices appearing at the end of each feed entry. I am sure this sort of practice will become standard soon, but why not get a jump on it and make sure you are protecting your RSS content?
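If your blog software does not add such a notice for you, something along these lines could do it. This is only a sketch in Python, assuming an RSS 2.0 feed held in a string; the function name add_copyright_notice and the notice text are my own placeholders.

```python
import xml.etree.ElementTree as ET


def add_copyright_notice(feed_xml, notice):
    """Append `notice` to the description of every item in an RSS feed."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        desc = item.find("description")
        if desc is None:
            # Items without a description get one that holds only the notice.
            desc = ET.SubElement(item, "description")
        text = desc.text or ""
        # Skip entries that already carry the notice, so reruns are harmless.
        if notice not in text:
            desc.text = (text + " " + notice).strip()
    return ET.tostring(root, encoding="unicode")


# Hypothetical one-item feed, for illustration only.
feed = ('<rss version="2.0"><channel><title>My Blog</title>'
        '<item><title>Post</title><description>Hello.</description></item>'
        '</channel></rss>')
print(add_copyright_notice(feed, "Copyright 2007 Example Blog."))
```

A notice will not stop a determined scraper, but it does travel with every entry that gets republished verbatim.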

Free RSS Resources

Firefox Has A Feed Reader Built In?

Subscribing to and reading a feed with Mozilla’s Firefox browser is really quick and easy.

Just browse your way to a website and look down to the lower right-hand corner of your screen, not where the actual web page is displayed, but just below it in the browser bar at the bottom of the screen. If you see an orange cube there with what looks like radiating radio waves, put your cursor over it. That’s the Firefox Live Bookmarks button.

Reading Feeds With My Yahoo

To use My Yahoo, you need a free Yahoo account. It only takes a minute to set up, and it includes a free Yahoo email address with tons of storage. If you don’t already have an account, you can set it up at the same time that you use the My Yahoo feed reader.

My Yahoo lets you create a customized Yahoo page with all sorts of content from Yahoo and elsewhere on a broad range of topics.

How To Find And Read A Feed

Regardless of which RSS feed reader you use, you have to tell it where the feed is that you want to read, just like you have to tell a web browser where the website is. You give the reader the web address of the feed, also known as the feed URL, just like you’d give the web URL to your web browser. To find feeds, check out my current lists of RSS feed directories and search engines. If you’re looking for the feed from a blog, you can find specific blogs in my current lists of blog directories and search engines.

How Do I Read A Feed?

As an RSS feed consumer, you have a lot of choices, and many of them are free. It’s like the early browser wars. There are lots of competitors offering a core set of features plus their own special enhancements. Some are buggier than others. Some don’t offer the latest features (like audio and video enclosures, which we’ll discuss soon) yet, but they’re working on them. Some run on more OS platforms than others. Some integrate the features of a web browser, an old-style internet news group reader, and an RSS feed reader, while others only display RSS feeds. You’ve got to do a bit of research to find the one that’s best for your needs.

How Do You Generate An RSS Feed?

There are several ways to generate an RSS feed:

1) Directly from your blog software (if you blog), optionally enhanced by FeedBurner or a similar service. This is how I currently generate all of my feeds.

If you haven’t chosen your blog software yet, I highly recommend WordPress. It’s got a great feature set, it’s been around long enough to have a lot of the bugs fixed, lots and lots of loyal users of other blog software have moved to WordPress because it’s so much better, and it’s free, even for commercial use. It also seamlessly supports audio enclosures (podcasting) in your RSS feed.

If you need a place to host your blog and feed, iPowerWeb is by far the best low-cost service I’ve ever seen. Everything just works, they provide great statistics, site features and documentation, lots of storage and email accounts, proper and current PHP and MySQL support (necessary for many blogs including WordPress) and their tech support folks are responsive and follow up to make sure any issue gets resolved. WordPress installs and runs like a dream on iPowerWeb. With some other hosting providers, I’ve had to rewrite PHP code and place files in all sorts of unnatural places on my site to get it to work, due to silly restrictions and limitations of the hosting providers. I’ve tried several low-cost and medium-cost hosting services and now would never use anyone but iPowerWeb.

Which Version of RSS Do I Use?

Just like anything else in the high-tech industry, the standards for RSS are evolving quickly. For now, most programs which generate feeds seem to have settled on RSS 2.0, though RSS 0.9x and RSS 1.x variants are still around. There’s also a similar standard called Atom, which is used by Blogger and some of the other blog services. Most feed readers will understand and display all of these different formats, and will in fact deal with Atom feeds even though the programs are called “RSS” feed readers.
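To give a feel for how a reader can cope with all of these formats, here is a small Python sketch that guesses the format from the root element of the feed. It is a simplification under my own assumptions: the function name detect_feed_format is hypothetical, only Atom 1.0's namespace is recognized, and real readers check a good deal more.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_feed_format(feed_xml):
    """Guess the feed family from the root element of the document."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        # RSS 0.91 through 2.0 share the <rss> root and state a version.
        return "RSS " + root.get("version", "unknown")
    if root.tag == "{%s}RDF" % RDF_NS:
        # RSS 0.90 and RSS 1.x are RDF documents.
        return "RSS (RDF branch)"
    if root.tag == "{%s}feed" % ATOM_NS:
        return "Atom 1.0"
    return "unknown"


print(detect_feed_format('<rss version="2.0"><channel/></rss>'))
print(detect_feed_format('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
```

Note that the check looks at the content of the document, not at the file name or extension.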

AdSense Advertising on Yahoo?

Yahoo plans to offer contextual advertising, similar to Google’s AdSense program. The question is, when? According to observations posted recently on Waxy.org, it may only be a few months from now, with testing already underway. Overture’s name is changing to Yahoo too. Sign up with Yahoo here to stay informed.

Tuesday, 4 September 2007

What is RSS Anyway?

RSS stands for Really Simple Syndication or Rich Site Summary or RDF Site Summary, depending on who you listen to. It’s also confused with RDF (Resource Description Framework), XML (eXtensible Markup Language) and a variety of other related TLAs (Three-Letter Acronyms). RSS is actually a family of web syndication protocols which provide information in XML files known as RSS feeds.

The issue is further confused by some “helpful” RSS feed readers, search engines and directories, which claim that RSS feeds always end with the .rss or .xml extension. NOT! In fact, RSS feeds can have almost any file extension or none at all. The current version of WordPress, a popular and free (free for commercial uses too!) blog software package which I use, uses the .php extension for all of the feeds it generates. I suspect that many other blog packages written in PHP (Hypertext Preprocessor) code also use that extension for their feeds. Feedburner, a great feed re-publishing service, doesn’t use an extension at all, unless you tell it to use a particular one.

Web-Based Feed Readers

Many RSS feed readers are web-based – you don’t actually have to install anything on your desktop, or even use your own system. Just go online in the internet cafe or the library or on your own computer and you can read any feed you want to. My Yahoo, Bloglines, NewsGator and MyFeedster are some of the popular web-based feed readers, and they’re free. Set up an account at no charge and go feed surfing.

If you set up your own RSS feed, and then tell My Yahoo about it, you’ll wind up with a listing in the Yahoo search engine (which would otherwise be much more difficult to get).


Mozilla vs Microsoft

Today, the Firefox web browser and its email companion Thunderbird, which are the open-source successors of Netscape from Mozilla, come with an RSS feed reader built into the 1.0 release. They also run on most OS platforms. Microsoft is playing catch-up. MS Internet Explorer has no RSS feed reader capability. But you just know it will, and soon. Microsoft sees the potential of RSS and how it’s taking hold across the web. They see Mozilla trying to take the new browser market away from them again. When that happens, anyone who’s not taking full advantage of RSS for their business or personal promotion will be left in the dust. Even if Microsoft enforces a new RSS standard, everyone else will support it, because of the sheer size of the Microsoft-based market. That’s just how it is.

Saturday, 1 September 2007

RSS

RSS (which, in its most recent format, stands for "Really Simple Syndication") is a family of Web feed formats used to publish frequently updated content such as blog entries, news headlines or podcasts. An RSS document, which is called a "feed", "web feed", or "channel", contains either a summary of content from an associated web site or the full text. RSS makes it possible for people to keep up with their favorite web sites in an automated manner that's easier than checking them manually.

RSS content can be read using software called a "feed reader" or an "aggregator." The user subscribes to a feed by entering the feed's link into the reader or by clicking an RSS icon in a browser that initiates the subscription process. The reader checks the user's subscribed feeds regularly for new content, downloading any updates that it finds.
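The "check regularly and pick up what is new" step can be sketched in Python as below. This is an assumption-laden illustration, not how any particular reader works: the function names are hypothetical, the feed is passed in as a string instead of being downloaded, and items are identified by guid with link and title as fallbacks.

```python
import xml.etree.ElementTree as ET


def item_key(item):
    """Identify an item by its guid, falling back to its link, then its title."""
    return (item.findtext("guid") or item.findtext("link")
            or item.findtext("title", default=""))


def new_items(feed_xml, seen):
    """Return titles of items not seen before and remember them in `seen`."""
    fresh = []
    for item in ET.fromstring(feed_xml).iter("item"):
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            fresh.append(item.findtext("title", default=""))
    return fresh


# Two hypothetical snapshots of the same feed, taken at different times.
first_poll = ('<rss version="2.0"><channel>'
              '<item><title>A</title><guid>1</guid></item>'
              '</channel></rss>')
second_poll = ('<rss version="2.0"><channel>'
               '<item><title>B</title><guid>2</guid></item>'
               '<item><title>A</title><guid>1</guid></item>'
               '</channel></rss>')

seen = set()
print(new_items(first_poll, seen))
print(new_items(second_poll, seen))
```

A real aggregator would fetch each subscribed URL on a timer and keep the set of seen keys between runs.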

The initials "RSS" are used to refer to the following formats:

  • Really Simple Syndication (RSS 2.0)
  • RDF Site Summary (RSS 1.0 and RSS 0.90)
  • Rich Site Summary (RSS 0.91)

RSS formats are specified using XML, a generic specification for the creation of data formats.
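To make the XML concrete, the following Python sketch builds a minimal RSS 2.0 document with a channel (title, link, description) and a few items. The function name build_rss and the example.com URLs are placeholders chosen for the illustration.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a minimal RSS 2.0 document; `items` is a list of (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # An RSS 2.0 channel requires a title, a link and a description.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


print(build_rss("Example Blog", "http://example.com/", "Posts about RSS",
                [("Hello", "http://example.com/hello")]))
```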

History

Before RSS, several similar formats already existed for syndication, but none achieved widespread popularity or are still in common use today, as most were envisioned to work only with a single service. The basic idea of re-structuring metadata information about web sites has been traced back at least as far as 1995, and the work of Ramanathan V. Guha and others at Apple Computer's Advanced Technology Group developing the Meta Content Framework (MCF). Other early work on XML syndication formats, including RDF, took place at Netscape, Userland Software, and Microsoft. For a more detailed discussion of these early developments, see history of web syndication technology.

RDF Site Summary, the first version of RSS, was created by Ramanathan V. Guha of Netscape in March 1999 for use on the My Netscape portal. This version became known as RSS 0.9.

In July 1999, responding to comments and suggestions, Dan Libby produced a prototype tentatively named RSS 0.91 (RSS standing for Rich Site Summary), that simplified the format and incorporated parts of Dave Winer's Scripting News format. This they considered an interim measure, with Libby suggesting an RSS 1.0-like format through the so-called Futures Document.

In April 2001, in the midst of AOL's acquisition and subsequent restructuring of Netscape properties, a re-design of the My Netscape portal removed RSS/XML support. The RSS 0.91 DTD was removed during this re-design, but in response to feedback, Dan Libby was able to restore the DTD, but not the RSS validator previously in place. In response to comments within the RSS community at the time, Lars Marius Garshol, to whom (co?)authorship of the original 0.9 DTD is sometimes attributed, commented, "What I don't understand is all this fuss over Netscape removing the DTD. A well-designed RSS tool, whether it validates or not, would not use the DTD at Netscape's site in any case. There are several mechanisms which can be used to control the dereferencing of references from XML documents to their DTDs. These should be used. If not the result will be as described in the article."

Effectively, this left the format without an owner, just as it was becoming widely used.

A working group and mailing list, RSS-DEV, was set up by various users and XML notables to continue its development. At the same time, Winer unilaterally posted a modified version of the RSS 0.91 specification to the Userland website, since it was already in use in their products. He claimed the RSS 0.91 specification was the property of his company, UserLand Software. Since neither side had any official claim on the name or the format, arguments raged whenever either side claimed RSS as its own, creating what became known as the RSS fork.

The RSS-DEV group went on to produce RSS 1.0 in December 2000. Like RSS 0.9 (but not 0.91), this was based on the RDF specifications, but was more modular, with many of the terms coming from standard metadata vocabularies such as Dublin Core.

Nineteen days later, Winer released by himself RSS 0.92, a minor and supposedly compatible set of changes to RSS 0.91 based on the same proposal. In April 2001, he published a draft of RSS 0.93 which was almost identical to 0.92. A draft RSS 0.94 surfaced in August, reverting the changes made in 0.93, and adding a type attribute to the description element.

In September 2002, Winer released a final successor to RSS 0.92, known as RSS 2.0 and emphasizing "Really Simple Syndication" as the meaning of the three-letter abbreviation. The RSS 2.0 spec removed the type attribute added in RSS 0.94 and allowed people to add extension elements using XML namespaces. Several versions of RSS 2.0 were released, but the version number of the document model was not changed.

In November 2002, The New York Times began offering its readers the ability to subscribe to RSS news feeds related to various topics. In January, 2003, Winer called the New York Times' adoption of RSS the "tipping point" in driving the RSS format's becoming a de facto standard.

In July 2003, Winer and Userland Software assigned ownership of the RSS 2.0 specification to his then workplace, Harvard's Berkman Center for the Internet & Society.

In January 2005, Sean B. Palmer, Christopher Schmidt, and Cody Woodard produced a preliminary draft of RSS 1.1. It was intended as a bugfix for 1.0, removing little-used features, simplifying the syntax and improving the specification based on the more recent RDF specifications. As of July 2005, RSS 1.1 had amounted to little more than an academic exercise.

In April 2005, Apple Computer released Safari 2.0 with RSS feed capabilities built in. Safari delivered the ability to read RSS feeds, and bookmark them, with built-in search features. Safari's RSS button is a blue rounded rectangle with RSS written inside in white. The favicon displayed defaults to a newspaper icon.

In November 2005, Microsoft proposed its Simple Sharing Extensions to RSS.[14]

In December 2005, the Microsoft IE team and Outlook team announced in their blogs that they would be adopting the feed icon first used in the Mozilla Firefox browser, effectively making the orange square with white radio waves the industry standard for both RSS and related formats such as Atom. In February 2006, Opera Software announced they too would add the orange square in their Opera 9 release.

In January 2006, Rogers Cadenhead relaunched an RSS Advisory Board with a view to continuing the development of the RSS format and resolving ambiguities. In June 2007, the board revised their version of the specification to confirm that namespaces may extend core elements with namespace attributes, as Microsoft has done in Internet Explorer 7. In their view, a difference of interpretation left publishers unsure of whether this was permitted or forbidden. No press account of the differences between the Winer spec and the Cadenhead spec for RSS 2.0 is included in this article's references, though blog searches in May, 2007 found private opinions that the two specs were very similar.

Incompatibilities

As noted above, there are several different versions of RSS, falling into two major branches (RDF and 2.*). The RDF, or RSS 1.* branch includes the following versions:

  • RSS 0.90 was the original Netscape RSS version. This RSS was called RDF Site Summary, but was based on an early working draft of the RDF standard, and was not compatible with the final RDF Recommendation.
  • RSS 1.0 is an open format by the RSS-DEV Working Group, again standing for RDF Site Summary. RSS 1.0 is an RDF format like RSS 0.90, but not fully compatible with it, since 1.0 is based on the final RDF 1.0 Recommendation.
  • RSS 1.1 is also an open format and is intended to update and replace RSS 1.0. The specification is an independent draft not supported or endorsed in any way by the RSS-Dev Working Group or any other organization.

The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions:

  • RSS 0.91 is the simplified RSS version released by Netscape, and also the version number of the simplified version championed by Dave Winer from Userland Software. The Netscape version was now called Rich Site Summary; it was no longer an RDF format, but was relatively easy to use. It remains the most common RSS variant.
  • RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which are mostly compatible with each other and with Winer's version of RSS 0.91, but are not compatible with RSS 0.90. In all Userland RSS 0.9x specifications, RSS was no longer an acronym.
  • RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was proclaimed to be "frozen", but still updated shortly after release without changing the version number. RSS now stood for Really Simple Syndication. The major change in this version is an explicit extension mechanism using XML Namespaces.

For the most part, later versions in each branch are backward-compatible with earlier versions (aside from non-conformant RDF syntax in 0.90), and both versions include properly documented extension mechanisms using XML Namespaces, either directly (in the 2.* branch) or through RDF (in the 1.* branch). Most syndication software supports both branches. Mark Pilgrim's article "The Myth of RSS Compatibility" discusses RSS version compatibility in more detail.

The extension mechanisms make it possible for each branch to track innovations in the other. For example, the RSS 2.* branch was the first to support enclosures, making it the current leading choice for podcasting, and as of mid-2005 is the format supported for that use by iTunes and other podcasting software; however, an enclosure extension is now available for the RSS 1.* branch, mod_enclosure. Likewise, the RSS 2.* core specification does not support providing full-text in addition to a synopsis, but the RSS 1.* markup can be (and often is) used as an extension. There are also several common outside extension packages available, including a new proposal from Microsoft for use in Internet Explorer 7.
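For the enclosure support mentioned above, a podcast client essentially looks for enclosure elements inside each item and reads their url, length and type attributes. Here is a small Python sketch of that idea; the function name list_enclosures and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_enclosures(feed_xml):
    """Return (url, type, length) for every enclosure in an RSS 2.0 feed."""
    found = []
    for item in ET.fromstring(feed_xml).iter("item"):
        for enc in item.findall("enclosure"):
            # length is the file size in bytes, given as a string attribute.
            found.append((enc.get("url"), enc.get("type"), enc.get("length")))
    return found


# Hypothetical podcast feed with one audio episode.
podcast = ('<rss version="2.0"><channel><title>Show</title>'
           '<item><title>Episode 1</title>'
           '<enclosure url="http://example.com/ep1.mp3" length="123456" '
           'type="audio/mpeg"/></item>'
           '<item><title>Text-only post</title></item>'
           '</channel></rss>')
print(list_enclosures(podcast))
```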

The most serious compatibility problem is with HTML markup. Userland's RSS reader, generally considered the reference implementation, did not originally filter out HTML markup from feeds. As a result, publishers began placing HTML markup into the titles and descriptions of items in their RSS feeds. This behavior has become widely expected of readers, to the point of becoming a de facto standard, though there is still some inconsistency in how software handles this markup, particularly in titles. The RSS 2.0 specification was later updated to include examples of entity-encoded HTML; however, all prior plain text usages remain valid.
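The entity-encoding round trip can be seen in a few lines of Python. This is only a sketch: the helper names encode_description, decode_description and strip_markup are hypothetical, and stripping tags is just one of several ways a reader might treat markup in a title.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def encode_description(html_fragment):
    """Serialize an item whose description carries HTML; the XML writer escapes it."""
    item = ET.Element("item")
    ET.SubElement(item, "description").text = html_fragment
    return ET.tostring(item, encoding="unicode")


def decode_description(item_xml):
    """Parse the item; the XML parser turns the entities back into markup."""
    return ET.fromstring(item_xml).findtext("description")


class _TextOnly(HTMLParser):
    """Collects text content and ignores tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(html_fragment):
    """Reduce an HTML fragment to plain text, as a cautious reader might for titles."""
    parser = _TextOnly()
    parser.feed(html_fragment)
    parser.close()
    return "".join(parser.parts)


encoded = encode_description("<b>Breaking</b> news")
print(encoded)
print(decode_description(encoded))
print(strip_markup(decode_description(encoded)))
```

The ambiguity described above arises because a reader cannot tell from the decoded string alone whether the publisher meant markup or literal angle brackets.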

Atom

In reaction to recognized issues with RSS (and because RSS 2.0 is frozen), a third group began a new syndication specification, Atom, in June 2003. Their work was later adopted by the Internet Engineering Task Force (IETF), leading to the publication of a specification (RFC 4287) for the Atom Format in 2005. Work on the Atom Publishing Protocol, a standards-based protocol for posting to publishing tools, is ongoing.

The relative benefits of Atom in comparison to the two RSS branches are a matter of debate within the Web-syndication community. Supporters of Atom claim that it improves on RSS by relying on standard XML features, by specifying a payload container that can handle many different kinds of content unambiguously, and by having a specification maintained by a recognized standards organization. Critics claim that Atom unnecessarily introduces a third branch of syndication specifications, further confusing the marketplace.

Atom aims to define both a syntax and a protocol for updating user blogs and thus goes beyond the simple remit of RSS. While this is appealing to many users, particularly those in the blogging community, it has been met with resistance in the professional community (mainly publishers) due to its lack of extensibility.[15]

For a comparison of Atom 1.0 to RSS 2.0 see Atom Compared to RSS 2.0.

Modules

The primary objective of all RSS modules is to extend the basic XML schema established for more robust syndication of content. This inherently allows for more diverse, yet standardized, transactions without modifying the core RSS specification.

To accomplish this extension, a tightly controlled vocabulary (in the RSS world, "module"; in the XML world, "schema") is declared through an XML namespace to give names to concepts and relationships between those concepts.

Some RSS 2.0 modules with established namespaces:

  • Ecommerce RSS 2.0 Module
  • Media RSS 2.0 Module
  • OpenSearch RSS 2.0 Module
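As a sketch of how a namespaced module element is read alongside the core elements, the Python below pulls a Dublin Core creator out of each item. Dublin Core is used here only because it is a widely known vocabulary with a stable namespace; it is an assumption of the example, not one of the modules listed above, and the sample feed and the function name item_creators are made up.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed that extends RSS 2.0 items with dc:creator.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>Post</title>
      <dc:creator>Jane Example</dc:creator>
    </item>
  </channel>
</rss>"""


def item_creators(feed_xml):
    """Return the dc:creator of each item, or an empty string when it is absent."""
    ns = {"dc": DC_NS}
    return [item.findtext("dc:creator", default="", namespaces=ns)
            for item in ET.fromstring(feed_xml).iter("item")]


print(item_creators(SAMPLE))
```

The core elements (title, link) are untouched; a reader that does not know the namespace can simply ignore the extra element.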

BitTorrent and RSS

The peer-to-peer application BitTorrent has also announced support for RSS. Such feeds (also known as Torrent/RSS-es or Torrentcasts) will allow client applications to download files automatically from the moment the RSS reader detects them (also known as Broadcatching). Most common BitTorrent clients already offer RSS support.