Tuesday, 12 April 2011

Getting full articles in RSS

Most sites practice the annoying habit of truncating their RSS feeds. Their motive is simple: an article snippet forces you to visit the site to read the whole thing, so they get more traffic, and hopefully ad-clicks. Still, it is annoying, especially on my phone, where the sites aren't guaranteed to display well and loading them isn't fast either. Fortunately, there are sites and scripts that will take a given feed and send the complete articles to your favourite reader.

The first (and my preferred) option is Full Text RSS Feed Builder. No secrets here: it really does build full text RSS feeds. Head over to the site, enter the desired feed into the box, and hit "Submit". You'll be given a new feed with articles restored to their full text. When I first tried the site, it didn't seem to work with many feeds, but it now seems to work perfectly with Lifehacker, Wired Science, and Scientific American. It isn't perfect with BBC Science & Environment and Technology feeds, but it only consistently skips the audio and video entries, which I tend not to read.

After experimenting with FTRFB (for want of a better abbreviation) above, I wondered if there were Yahoo! Pipes that would accomplish the same thing. It appears that there are. Just search "full text" and try some of the entries. I haven't tested these as fully. It appears that some pipes work and some don't. Good luck.

Tuesday, 5 April 2011

Finding co-citations on NASA ADS

The NASA Astrophysics Data System (ADS) is a marvellous tool. Not only is it the baseline for astronomical literature searches, it also locates an article in the citation web. It's easy to obtain the list of articles that cite or are cited by the reference in question. All this means that the database can be used for all sorts of other interesting experiments, for which I have a few ideas. Here's the first.

Lately, I've been working closely from two papers. Though they are authored by different groups (that had collaborated before and have since), their content is very similar and they were published back-to-back in the Astrophysical Journal. What became interesting to me was this: what articles cite both papers?

To answer this question, I needed the lists of article identifiers ("bibcodes") for the citation lists of both papers. For each article, I navigated to the database page linked above, clicked "Refereed citations to the article" (or just "Citations to the article", if you prefer all of them), clicked "Select All Records" at the bottom, and requested the records in a custom format of %R. This returned a plain text list of the identifiers of all the refereed citations. I saved these lists to two plain text files. I then concatenated the two files, sorted the result, and picked out the identifiers that occurred twice in the combined list: these are the articles that cite both papers. To get this list of identifiers back into a list of hits, I copied the list into the "Bibliographic Code Query" at this ADS query page.
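For anyone who would rather not do the sort-and-count by hand, here is a minimal Python sketch of the same step. It assumes each saved file holds one bibcode per line, as the %R format returned for me. The function names and file names are my own inventions for illustration, and it uses a set intersection rather than literally concatenating and sorting, which gives the same answer.

```python
def read_bibcodes(path):
    """Return the set of non-empty lines (bibcodes) in a plain text file."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def common_citations(*bibcode_sets):
    """Return a sorted list of the bibcodes present in every given set."""
    if not bibcode_sets:
        return []
    return sorted(set.intersection(*(set(s) for s in bibcode_sets)))


if __name__ == "__main__":
    # Hypothetical file names: replace with wherever the two lists were saved.
    first = read_bibcodes("citations_paper1.txt")
    second = read_bibcodes("citations_paper2.txt")
    for bibcode in common_citations(first, second):
        print(bibcode)
```

The printed list can be pasted straight into the bibliographic code query box. Because the function takes any number of sets, the same sketch should work for three or more papers.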

In short then, all I did was get the lists of citations to each of my two articles, find the common hits, and locate them on ADS. The results revealed a few further papers that were worth going through. In addition, I think this kind of search reveals, from metadata alone, how much these papers have in common. The two articles have 176 and 167 refereed citations, respectively. Of those, 112 papers cite both: 64% and 67%, respectively!
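The overlap fractions are simple enough to check in a couple of lines of Python. This is only the arithmetic on the three counts quoted above; the function name is mine.

```python
def overlap_percent(n_common, n_total):
    """Percentage of a paper's citations that also cite the other paper."""
    return 100.0 * n_common / n_total


for total in (176, 167):
    print(f"112 of {total}: {overlap_percent(112, total):.1f}%")
# prints:
# 112 of 176: 63.6%
# 112 of 167: 67.1%
```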

This process could probably be automated without much trouble. Using embedded queries or the Perl module, I imagine an able scripter could write a short program that finds the list of common citations for two or more papers. What I don't know is how useful this would generally be. A more interesting application might be to rank papers based on how much they are co-cited with a selected reference, but that could be a big calculation because it would require a second step in the citation web. Even so, if such a calculation were made only at regular intervals, it might still be useful.
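To make the ranking idea a little more concrete, here is a rough Python sketch of the counting step only. It assumes the expensive part has already been done, that is, that the reference list of every article citing the selected paper has somehow been fetched into a dictionary. The function name, the dictionary layout and the identifiers in the comments are all hypothetical, and nothing here talks to ADS.

```python
from collections import Counter


def rank_cocited(reference, reference_lists):
    """Rank papers by how often they are cited alongside `reference`.

    `reference_lists` maps each citing article's identifier to the
    collection of identifiers that article cites (an assumed layout).
    Returns (paper, count) pairs, highest count first, ties by name.
    """
    counts = Counter()
    for cited in reference_lists.values():
        cited = set(cited)
        if reference in cited:
            # Every other paper in this reference list is co-cited once.
            counts.update(cited - {reference})
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Called as rank_cocited("REF", {"citer1": {"REF", "P1", "P2"}, "citer2": {"REF", "P1"}}), it would put "P1" first with a count of 2. The second step in the citation web is hidden in building that dictionary, which is where the cost would be.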

This is one of a few ADS-based ideas I have. Let me know if you think (or have found) it's useful or interesting, and watch out for more ideas further down the line.