da: (bit)
[personal profile] da
I've given up on using LJ for my RSS feeds. I've got 88 of them, which means I sometimes don't see posts from real people for pages and pages. I'm jumping to Google Reader. Google Reader will import "OPML" data, so that's how I wanted to do the transfer. This is a how-to.

(If you are hoping to read your LJ friends list directly from an RSS reader, you might instead try this exporter, which grabs the necessary data for friends; it doesn't include feeds and is beyond the scope of this how-to.)

The short version:

1) If you're handy with Perl, grab the code at the bottom of this entry, change the user name, and run the code.
1b) If you're not handy with Perl, give me a shout and I'll run it for you on my server. :)

2) Output is a list of RSS URLs. To translate these to OPML, feed them to this page. Copy and paste them into the big text box, then hit the "Create OPML" link. Seconds later, you will have an output file, which you should save to disk (the file name doesn't matter).

3) Optionally, open the file in a text editor and change the "title" parts from the URL into a sensible title for each feed. Yeah, I was too lazy to write my own OPML and fix that.

4) In Google Reader, in the lower left-hand corner, choose "Manage Subscriptions". Choose "Import/Export". Browse and upload your OPML file.

Done!

So far, I like the Google Reader interface, and now I can actually pay attention to the real people on my list who do still post. (I appreciate y'all! I did this for yoooou!)

---
Don't bother reading the rest unless you want technical details; it's mostly here for Google searching. Let me know if this helped anybody!

The code:

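#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize;

# Change "da-lj" to your own LJ user name.
my $base_url = "http://da-lj.livejournal.com/profile";

# autocheck => 1 makes a failed request die instead of failing silently.
my $m = WWW::Mechanize->new ( autocheck => 1 );

$m->get( $base_url );

# Grab the line of the profile page that lists the watched feeds,
# then pull every single-quoted href out of that line.
my $profile_html = $m->content;
my @feed_lines = ($profile_html =~ /watchingfeeds_body.*/g);
die "No watched feeds found on $base_url\n" unless @feed_lines;
my @feed_urls = ($feed_lines[0] =~ /href='(.*?)'/g);

# Visit each feed's LJ page and print the target of its "XML" link.
foreach my $lj_url (@feed_urls) {
    $m->get( $lj_url );
    my $xml_link = $m->find_link( text => 'XML' );
    if ($xml_link) {
        print $xml_link->url() . "\n";
    } else {
        warn "No XML link found on $lj_url\n";
    }
}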


And that's it. I'm using WWW::Mechanize, which is the bee's knees if you have to do screen-scraping in Perl.
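
If you want to see what the two regexes in the script do without hitting LJ at all, here's a small sketch that runs them against a made-up fragment. The HTML below is my invention for illustration (the real profile markup may well differ); the only things carried over from the script are the `watchingfeeds_body` marker and the single-quoted `href` pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up stand-in for the profile page; real LJ markup may differ.
my $profile_html = <<'HTML';
<div id='friends_body'><a href='http://example.livejournal.com/profile'>example</a></div>
<div id='watchingfeeds_body'><a href='http://feed-one.livejournal.com/profile'>feed_one</a>, <a href='http://feed-two.livejournal.com/profile'>feed_two</a></div>
<div id='footer'><a href='http://www.livejournal.com/'>home</a></div>
HTML

# "." does not match a newline, so this captures from the marker
# to the end of that one line only.
my @feed_lines = ($profile_html =~ /watchingfeeds_body.*/g);

# The non-greedy (.*?) stops at the first closing quote,
# so each match yields exactly one URL.
my @feed_urls = ($feed_lines[0] =~ /href='(.*?)'/g);

print "$_\n" for @feed_urls;
```

Only the two feed links come out; the friends and footer links sit on other lines, so the first regex never hands them to the second. That also shows the weak spot: if LJ ever splits the feed list across several lines, this approach would miss everything after the first line break.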

I started off with a manual grab of my "watching" page and a word-processor search-and-replace, and was about to run a batch of 'wget's to grab the lj-feed pages when I realized it would be quicker in Perl.

The biggest drawback to this method is the cost of installing WWW::Mechanize in the first place. CPAN makes it easy(ish) but it has a tonne of dependencies. ...I guess it's just one step if you're on a reasonably recent Debian/Ubuntu.
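
For what it's worth, the OPML half (step 2) doesn't need any modules at all. Here's a rough sketch of writing the file directly in plain Perl instead of going through the web converter. I haven't checked what that converter actually emits, so treat this as the minimal shape OPML readers generally accept, not a copy of its output. The sub names `xml_escape` and `urls_to_opml`, the "LJ feeds" title, and the sample URLs are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special inside an XML attribute value.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build a minimal OPML document from a list of feed URLs.
# The URL doubles as the title, so step 3 (fixing titles by hand) still applies.
sub urls_to_opml {
    my (@urls) = @_;
    my $opml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
             . qq{<opml version="1.0">\n}
             . qq{  <head><title>LJ feeds</title></head>\n}
             . qq{  <body>\n};
    foreach my $url (@urls) {
        my $safe = xml_escape($url);
        $opml .= qq{    <outline type="rss" text="$safe" title="$safe" xmlUrl="$safe"/>\n};
    }
    $opml .= qq{  </body>\n</opml>\n};
    return $opml;
}

# Sample input (hypothetical URLs); in practice these would be
# the lines printed by the scraping script.
my @sample_urls = (
    'http://example.com/feed.xml',
    'http://example.org/rss?a=1&b=2',
);
print urls_to_opml(@sample_urls);
```

To use it for real, you'd swap the sample list for the scraper's output, for instance by reading one URL per line from STDIN and chomping each line before passing the list in. The escaping matters more than it looks: any feed URL with an "&" in its query string would otherwise produce a file that isn't valid XML.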

Anyhow.

Date: Friday, 27 May 2011 02:44 am (UTC)
From: [identity profile] da-lj.livejournal.com
I guess I could've used the source from the Friend of a Friend exporter as a template to do it entirely in Python.

Dang, that would have been fun, too. *shrug*
Edited Date: Friday, 27 May 2011 02:45 am (UTC)

Date: Friday, 27 May 2011 03:24 am (UTC)
From: [identity profile] thingo.livejournal.com
Wait. How does this deal with friends-locked posts that would otherwise require a password in LJ? (I can't tell from your description whether it all Just Works.)

Date: Sunday, 29 May 2011 07:02 pm (UTC)
From: [identity profile] da-lj.livejournal.com
eep, it seems this comment never made it to my mail-reader. Sorry for lack of followup...

For my situation (moving RSS feeds off of LJ to another reader) it doesn't come up.

The FOAF to OPML link says:

If you check the auth checkbox, the OPML file will contain links that will let you read private (friends only) messages. Not all feed aggregators support this feature — check with your product documentation.

After experimentation, I can say that Google Reader does NOT pick up the locked posts generated by that OPML file.
(format: "http://whomever.livejournal.com/data/atom?auth=digest")

However, if I visit the locked-posts URL myself, Firefox prompts me for basic authentication, and then shows me the feed. So I can see how, in principle, a fancier feed-reader could authenticate to LJ and then download the feeds.

Sadly I don't have data on other feed-readers that would work. Though, the one built into Firefox MIGHT be sufficient...

Date: Friday, 27 May 2011 04:58 am (UTC)
From: [identity profile] http://users.livejournal.com/merle_/
You da(_lj) man!

I have the same question as [livejournal.com profile] thingo. I'd been looking into Mechanize for work purposes but did not see a trivial way to add in authentication. My best guess was to start the script by logging in with hardcoded password and then start making requests using the cookie it returned, but.. eeeew.

Date: Sunday, 29 May 2011 07:08 pm (UTC)
From: [identity profile] da-lj.livejournal.com
See my comment to [livejournal.com profile] thingo above...

WWW::Mechanize *could* do that authentication as you suggest. Yes, sort of ew.

http://www.livejournal.com/support/faqbrowse.bml?faqid=306

gave me another idea, which is to feed your browser's LJ cookie directly to WWW::Mechanize. Let me know if you get that to work!

Date: Sunday, 29 May 2011 09:59 pm (UTC)
From: [identity profile] http://users.livejournal.com/merle_/
I think you could feed a cookie in (and that was how I was thinking of doing it), but the problems are:
- the cookies are tied to an IP and change in a serial manner
- every now and then LJ just logs you out
- if you have a slow connection you can see in the status bar that it goes through a bunch of redirects

None of those is insurmountable but it makes for a fragile system that will need constant maintenance of that cookie.
