[personal profile] da
I've given up on using LJ for my RSS feeds. I've got 88 of them, which means I sometimes don't see real people posts for pages and pages. I'm jumping to Google Reader. Google Reader will import "OPML" data, so that's how I wanted to do the transfer. This is a how-to.

(If you are hoping to read your LJ friends list directly from an RSS reader, you might instead try this exporter, which grabs the necessary data for friends. It doesn't include feeds, and it's beyond the scope of this how-to.)

The short version:

1) If you're handy with Perl, grab the code at the bottom of this entry, change the user name in the profile URL, and run the code.
1b) If you're not handy with Perl, give me a shout and I'll run it for you on my server. :)

2) Output is a list of RSS URLs. To translate these to OPML, feed them to this page. Copy and paste them into the big text box, then hit the "Create OPML" link. Seconds later, you will have an output file, which you should save to disk (the file name doesn't matter).

3) Optionally, open the file in a text editor and change the "title" parts from the URL into a sensible title for each feed. Yeah, I was too lazy to write my own OPML and fix that.

4) In Google Reader, in the lower left-hand corner, choose "Manage Subscriptions", then "Import/Export". Browse to your OPML file and upload it.

Done!

So far, I like the Google Reader interface, and now I can actually pay attention to the real people on my list who do still post. (I appreciate y'all! I did this for yoooou!)

---
Don't bother reading the rest unless you want technical details; it's mostly here for Google searching. Let me know if this helped anybody!

The code:

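#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize;

# Change "da-lj" to your own LJ user name.
my $base_url = "http://da-lj.livejournal.com/profile";

# autocheck => 1 makes any failed request die with an error message.
my $m = WWW::Mechanize->new( autocheck => 1 );

$m->get( $base_url );

# The watched feeds are all linked from the "watchingfeeds_body" line
# of the profile HTML; grab that line, then every href on it.
my $profile_html = $m->content;
my @feed_lines = ($profile_html =~ /watchingfeeds_body.*/g);
die "No watched feeds found on $base_url\n" unless @feed_lines;
my @feed_urls = ($feed_lines[0] =~ /href='(.*?)'/g);

# Visit each feed's LJ page and print the target of its "XML" link,
# which is the feed's RSS URL.
foreach my $lj_url (@feed_urls) {
    $m->get( $lj_url );
    my $xml_link = $m->find_link( text => 'XML' );
    if ($xml_link) {
        print $xml_link->url() . "\n";
    }
    else {
        warn "No XML link found on $lj_url\n";
    }
}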


And that's it. I'm using WWW::Mechanize, which is the bee's knees if you have to do screen-scraping in Perl.
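
To see what the two regexes in the script are doing without hitting LJ at all, here is a minimal sketch that runs them over a made-up chunk of HTML. The sample markup, the example.com URLs, and the extract_feed_urls helper are all assumptions for illustration; the real profile page may well look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the profile HTML; the real markup may differ.
my $profile_html = join "\n",
    "<div id='header'><a href='http://example.com/not-a-feed'>home</a></div>",
    "<div id='watchingfeeds_body'><a href='http://example.com/feed_one/profile'>feed_one</a>, <a href='http://example.com/feed_two/profile'>feed_two</a></div>",
    "<div id='footer'>bye</div>";

# Pull the linked URLs out of the "watchingfeeds_body" line.
sub extract_feed_urls {
    my ($html) = @_;

    # "." does not match a newline by default, so each match runs from
    # "watchingfeeds_body" to the end of that one line only.
    my @feed_lines = ( $html =~ /watchingfeeds_body.*/g );
    return () unless @feed_lines;

    # Non-greedy capture: everything between href=' and the next '.
    return ( $feed_lines[0] =~ /href='(.*?)'/g );
}

print "$_\n" for extract_feed_urls($profile_html);
```

The header link sits on a different line, so it never makes it into the list. That is also why this approach is fragile: if the page layout changes, the regexes need changing too.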

I started off with a manual grab of my "watching" page and a word-processor search-and-replace, and was about to run a batch of 'wget's to grab the lj-feed pages when I realized it would be quicker in Perl.

The biggest drawback to this method is the cost of installing WWW::Mechanize in the first place. CPAN makes it easy(ish) but it has a tonne of dependencies. ...I guess it's just one step if you're on a reasonably recent Debian/Ubuntu.
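
On the subject of dependencies: step 2 of the short version (URLs in, OPML out) needs nothing beyond core Perl. The following is only a rough sketch of what a hand-rolled converter could look like, not the code behind the web page mentioned above. The urls_to_opml and xml_escape helpers, the head title, and the example feeds are hypothetical names chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special inside an XML attribute value.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build an OPML document from a list of [ feed URL, optional title ] pairs.
sub urls_to_opml {
    my (@feeds) = @_;
    my @lines = (
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<opml version="1.0">',
        '  <head><title>Feeds exported from LJ</title></head>',
        '  <body>',
    );
    for my $feed (@feeds) {
        my ( $url, $title ) = @$feed;

        # With no title given, fall back to the URL itself.
        $title = $url unless defined $title && length $title;

        push @lines, sprintf(
            '    <outline type="rss" text="%s" title="%s" xmlUrl="%s"/>',
            xml_escape($title), xml_escape($title), xml_escape($url)
        );
    }
    push @lines, '  </body>', '</opml>';
    return join( "\n", @lines ) . "\n";
}

# Hypothetical example feeds; the second one has no title.
print urls_to_opml(
    [ 'http://example.com/rss.xml', 'Example Blog' ],
    [ 'http://example.org/feed?a=1&b=2' ],
);
```

Passing a title alongside each URL would also take care of step 3, since the "title" attribute no longer has to be fixed by hand afterwards.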

Anyhow.