Plugin: FeedGrab

scottb

343 posts

17 years ago

I am researching the possibility of building a portal for an organization. The site would grab about 200 different RSS feeds daily. Some of them will need to be updated hourly. Drupal has extensive aggregator functions, but I would prefer to build this site in EE.

I can’t use Magpie because the pages strain under pulling multiple feeds, but FeedGrab may be able to do what I want. One problem I need to overcome is the automation of grabbing feeds. I have seen in another post a reference to using the EE cron plugin with FeedGrab. Based on what I read, my code might look something like this:

{exp:cron minute="30" hour="*" day="*" month="*" plugin="feedgrab:FeedGrab"}

{exp:feedgrab url="path_to_rss_feed" weblog="id_number" title="title" date="pubDate" use="link|description" fields="rss_url|rss_body"}

{/exp:cron}

The cron runs once an hour every day and every month. I have the impression from the cron documentation that I need to define both the name of the plugin and the function that needs to be called.

Unfortunately, this doesn’t work, and other variations don’t work, either. Can anyone offer suggestions?

Mike Essl

8 posts

17 years ago

Hello, I’m getting duplicate entries, but only on the 3 newest entries:

Here is my code:

{exp:feedgrab url="http://api.flickr.com/services/feeds/photos_public.gne?id=96729633@N00&tags=unexpectedlyquitcom&lang=en-us&format=rss_200" 
weblog="1" 
title="title"
date="pubDate"
use="media:content@url|media:content@height|media:content@width|link" 
fields="flickrimageurl|flickrimageheight|flickrimagewidth|flickrurl"
unique="date,flickrurl"}

I thought adding the unique setting would fix it but no dice. Even weirder is that it doesn’t always do it.

You can see the site here: http://www.unexpectedlyquit.com

J. Hull

132 posts

17 years ago

Great news!

At least for anyone interested in grabbing lower-level information from RSS feeds (like klick and me above).

I figured out a way for FeedGrab to be able to grab this information by using two Yahoo Pipes. For my purposes I was able to grab out my FriendFeed comments and feed them into my database.

The whole story, with links to the Pipes so you can mash your own, can be found here:

Integrating My FriendFeed Comments Into My Personal Blog

Hope this saves someone time in the future.

SSM

33 posts

17 years ago

What if you don’t want all those RSS feeds stored in a weblog? I want to import my message board feed, but I have no desire to store thousands of message board thread titles. I just want the last 5 threads on my front page.

Is there a solution for this?

J. Hull

132 posts

17 years ago

I think you would use Magpie for that (built into EE if I’m not mistaken).

I think the difference is:

FeedGrab is for taking the information in RSS feeds and putting it into your weblog. Magpie is for simply displaying RSS feeds within your template (no information copied over).

I could be wrong, but that is my understanding.
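
To illustrate the display-only approach described above (read the feed, show the newest few items, store nothing), here is a minimal sketch in Python rather than EE template code. It is not how Magpie or FeedGrab is implemented. The sample feed, the `latest_items` name, and the assumption that the feed lists its newest items first are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed with 7 threads, newest first.
# A real page would fetch this text from the message board's RSS URL.
SAMPLE_RSS = (
    '<rss version="2.0"><channel><title>Board</title>'
    + "".join(
        f"<item><title>Thread {n}</title>"
        f"<link>http://example.com/t/{n}</link></item>"
        for n in range(7, 0, -1)
    )
    + "</channel></rss>"
)


def latest_items(rss_text, limit=5):
    """Return (title, link) pairs for the first `limit` items; nothing is stored."""
    root = ET.fromstring(rss_text)
    items = root.findall("./channel/item")
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in items[:limit]
    ]


if __name__ == "__main__":
    for title, link in latest_items(SAMPLE_RSS):
        print(title, link)
```

Because the items are read and discarded on each request, this kind of display is usually paired with caching so the remote feed is not fetched on every page view.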

Hop Studios

459 posts

17 years ago

I’m getting duplicate entries sometimes, like Mike Essl above. Any suggestions on how to debug?

TTFN Travis

Andrew Weaver

206 posts

17 years ago

I’m not sure why you are getting duplicates. This is how I import flickr feeds:

{exp:feedgrab
    url="http://api.flickr.com/services/feeds/photos_public.gne?id=25509357@N00&format=rss_200" 
    weblog="1" 
    title="title"
    date="dc:date.Taken"
    use="link|media:content@url|media:thumbnail@url|description|guid" 
    fields="flickr_link|flickr_image|flickr_thumbnail|flickr_description|flickr_guid" 
    unique="flickr_guid"
    category_field="media:category"
    category_group="2"
    category_delimiter="SPACE" 
}

I use the guid field as the unique value. Let me know if this helps.
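
The idea behind using the guid as the unique value can be sketched outside EE as follows. This is only an illustration of guid-based de-duplication, not FeedGrab's actual code. The function name, the dict-shaped items, and the choice to skip items that have no guid are assumptions made for the example.

```python
def import_new_items(items, stored_guids):
    """Return items whose guid has not been seen, recording each accepted guid.

    `items` is a list of dicts with a "guid" key (hypothetical shape);
    `stored_guids` is a set standing in for the values already in the weblog.
    """
    imported = []
    for item in items:
        guid = item.get("guid")
        if not guid or guid in stored_guids:
            # Assumption: items without a guid are skipped rather than imported blindly.
            continue
        stored_guids.add(guid)
        imported.append(item)
    return imported


if __name__ == "__main__":
    feed = [
        {"guid": "tag:example.com,2008:/photo/1", "title": "First"},
        {"guid": "tag:example.com,2008:/photo/2", "title": "Second"},
    ]
    seen = set()
    print(len(import_new_items(feed, seen)))  # first run imports both items
    print(len(import_new_items(feed, seen)))  # second run finds nothing new
```

A guid is meant to stay the same for a given item, whereas a date or URL can be re-serialized differently between fetches, which is one reason it tends to be the safer key.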

J. Hull

132 posts

17 years ago

Just so everyone knows, now that the excitement over my Yahoo Pipes solution in combination with FeedGrab has died down: it’s not really working out as well as I thought. There is some caching issue with Yahoo Pipes that prevents the feeds from being updated, so it doesn’t work very well.

I’m hoping XMLGrab will be the solution. If not, I’ll have to write my own plugin (which I’ve never done!).

Hoosteeno

109 posts

17 years ago

Regarding Travis’s duplicate issue above (http://ellislab.com/forums/viewthread/37598/P144/#433002):

It appears that the text of values inserted into the database is urlencoded. If you’re using a URL value as your unique identifier, and if the URL includes certain characters, is_entry_unique() will always return true, even if that URL is already present in the database (“%2b” does not equal “+”, for example).

We added a urldecode to the is_entry_unique() function (at approximately lines 586-588):

// MODIFIED BY JUSTIN CRAWFORD
//$sql .= " AND " . $name . "=\"" . $DB->escape_str( $post[ $value ] ) . "\"";
$sql .= " AND " . $name . "=\"" . $DB->escape_str( urldecode( $post[ $value ] ) ) . "\"";

-Justin
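
The mismatch described above can be reproduced in isolation. The sketch below is in Python, not the plugin's PHP. The function name, the sample URLs, and the set standing in for the database column are all hypothetical. It shows only why a percent-encoded value fails to match its decoded twin until it is decoded before the comparison.

```python
from urllib.parse import unquote


def is_unique_after_decoding(candidate, stored_values):
    """True when `candidate`, once percent-decoded, matches none of the stored values."""
    return unquote(candidate) not in stored_values


# Hypothetical data: the stored value holds a literal "+",
# while the incoming value carries it as "%2b".
stored = {"http://example.com/photo?tags=a+b"}
incoming = "http://example.com/photo?tags=a%2bb"

if __name__ == "__main__":
    print(incoming not in stored)                      # raw comparison: looks unique
    print(is_unique_after_decoding(incoming, stored))  # decoded comparison: duplicate found
```

One difference to keep in mind: PHP's urldecode() also turns a literal "+" into a space, while Python's unquote() leaves it alone (unquote_plus() is the closer equivalent), so the two are not drop-in replacements for each other.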

SSM

33 posts

17 years ago

I have 36,000 RSS entries I would like to import. Any advice on how to approach this?

scottb

343 posts

17 years ago

Update: You can ignore the following post. I found what I needed to fix it.

I built out test pages on our existing site (version 1.6.3), and everything worked great. Then I copied the pages over to a new site (v1.6.4) along with the plugin. Suddenly on the new site I’m getting url output errors.

This code:

<a href="{rss_url}">{title}</a>

Works correctly on the first site. But on the second site, {rss_url} outputs a complete link, including the <a> tag, so I end up with two nested <a> tags.

I have looked over the admin options for setting url output but don’t see anything that would indicate why I’m getting the errors. Any suggestions?

klick

49 posts

about 17 years ago

And I’m with klick above - need to be able to access that lower level data in FriendFeed AND be able to extract the two different titles from Google Reader shared items which currently look like this:

still haven’t figured that out in detail. 😉

plus having another problem: my delicious feed keeps sticking empty entries (all with timestamp 0100) between the other weblog entries when updating the “plugin call” template.

it also echoes this error code:

Notice: strtotime() [function.strtotime]: Called with empty time parameter in …/xx/plugins/pi.xmlgrab.php on line 489

this causes my feeds to break. if anyone has got a clue what’s happening let me know 😊
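
The notice above suggests that the date parser was handed an empty string, which would fit feed items that have no usable date. Assuming that is the cause (the thread does not confirm it), one defensive approach is to check the date before parsing and drop items that lack one. The sketch below is in Python, not the plugin's PHP, and the function names and dict-shaped items are hypothetical.

```python
from email.utils import parsedate_to_datetime


def parse_pub_date(raw):
    """Return a datetime for an RFC 822 date string, or None if it is empty or unparseable."""
    if raw is None or not raw.strip():
        return None
    try:
        return parsedate_to_datetime(raw.strip())
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on bad input, newer ones ValueError.
        return None


def items_with_dates(items):
    """Keep only the items whose "pubDate" value parses to a real date."""
    return [item for item in items if parse_pub_date(item.get("pubDate")) is not None]


if __name__ == "__main__":
    feed = [
        {"title": "Dated", "pubDate": "Mon, 06 Oct 2008 10:00:00 +0000"},
        {"title": "Undated", "pubDate": ""},
    ]
    print([item["title"] for item in items_with_dates(feed)])
```

Skipping undated items is only one option; falling back to the time of import is another, depending on whether the empty entries are wanted at all.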

ah and the twitter stream has some character encoding problems. i’m on that.

besides that: wonderful plugin! thanks!

best klick

chabbs

12 posts

about 17 years ago

Does this look right?

{exp:cron minute="30" hour="*" day="*" month="*" plugin="feedgrab:FeedGrab"}
{exp:feedgrab url="my-rss" 
                          weblog="70" 
                          title="title"
                          date="pubDate"
                          use="link|description" 
                          fields="music-url|music-body" }
{/exp:cron}

I’ve been trying to figure out how the cron plugin works with FeedGrab for the last 3 hours and I haven’t been able to find an answer. Can someone please tell me if this is right because it doesn’t seem to work?

Thanks for any help.

ms

274 posts

about 17 years ago

While I can’t tell you if the parameters are correct, I can tell you that ee:cron only works if the page is visited frequently. It’s not a true cron; it just checks the time elapsed between page views. So if you have a test page nobody is visiting, ee:cron will never fire, regardless of how much time has passed. I use a real cron job to make sure the page is visited regularly.

Markus
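
The behaviour described above, a job that can only fire when someone requests the page, can be modelled in a few lines. This Python sketch is a simplified illustration of that description, not the EE Cron plugin's actual implementation. The class name and the fixed-interval check are assumptions.

```python
class PageViewCron:
    """Runs a job only when a page view arrives and the interval has elapsed."""

    def __init__(self, interval_seconds, job):
        self.interval = interval_seconds
        self.job = job
        self.last_run = None  # no run has happened yet

    def on_page_view(self, now):
        """Call on every page view; `now` is a Unix timestamp. Returns True if the job ran."""
        if self.last_run is not None and now - self.last_run < self.interval:
            return False
        self.job()
        self.last_run = now
        return True


if __name__ == "__main__":
    runs = []
    cron = PageViewCron(3600, lambda: runs.append("grabbed feeds"))
    print(cron.on_page_view(0))      # first view triggers the job
    print(cron.on_page_view(1800))   # too soon, nothing happens
    print(cron.on_page_view(7200))   # next view after the interval triggers it again
    print(len(runs))
```

Nothing in this model runs between views, which is why an unvisited test page never triggers the import and why a system cron job that requests the page on a schedule fills the gap.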

chabbs

12 posts

about 17 years ago

Thanks Markus. I guess I’ll have to look into using a real cron job.
