Plugin: FeedGrab

Development and Programming

scottb
343 posts
16 years ago

I am researching the possibility of building a portal for an organization. The site would grab about 200 different RSS feeds daily. Some of them will need to be updated hourly. Drupal has extensive aggregator functions, but I would prefer to build this site in EE.

I can’t use Magpie because the pages strain under pulling multiple feeds, but FeedGrab may be able to do what I want. One problem I need to overcome is the automation of grabbing feeds. I have seen in another post a reference to using the EE cron plugin with FeedGrab. Based on what I read, my code might look something like this:

{exp:cron minute="30" hour="*" day="*" month="*" plugin="feedgrab:FeedGrab"}

{exp:feedgrab url="path_to_rss_feed" weblog="id_number" title="title" date="pubDate" use="link|description" fields="rss_url|rss_body" }

{/exp:cron}

The cron runs once an hour every day and every month. I have the impression from the cron documentation that I need to define both the name of the plugin and the function that needs to be called.

Unfortunately, this doesn’t work, and other variations don’t work, either. Can anyone offer suggestions?
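For reference, here is a minimal Python sketch of how a conventional cron schedule such as minute="30" with "*" everywhere else is usually interpreted. It assumes standard cron semantics; whether the EE cron plugin evaluates its parameters exactly this way is an assumption, and `field_matches` and `is_due` are hypothetical names used only for illustration.

```python
from datetime import datetime


def field_matches(spec, value):
    """Return True if a single cron field ("*" or a number) matches value."""
    return spec == "*" or int(spec) == value


def is_due(moment, minute="*", hour="*", day="*", month="*"):
    """Return True if every field of the schedule matches the given moment."""
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
    )


# minute="30" with wildcards elsewhere matches once per hour, at half past.
half_past = datetime(2008, 9, 6, 14, 30)
print(is_due(half_past, minute="30"))
```

Only the simple cases ("*" or a single number) are handled here; ranges and lists are left out to keep the sketch short.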

       
Mike Essl
8 posts
16 years ago

Hello, I’m getting duplicate entries, but only on the 3 newest entries:

Here is my code:

{exp:feedgrab url="http://api.flickr.com/services/feeds/photos_public.gne?id=96729633@N00&tags=unexpectedlyquitcom&lang=en-us&format=rss_200" 
weblog="1" 
title="title"
date="pubDate"
use="media:content@url|media:content@height|media:content@width|link" 
fields="flickrimageurl|flickrimageheight|flickrimagewidth|flickrurl"
unique="date,flickrurl"}

I thought adding the unique setting would fix it but no dice. Even weirder is that it doesn’t always do it.

You can see the site here: http://www.unexpectedlyquit.com

       
J. Hull
132 posts
16 years ago

Great news!

At least for anyone interested in grabbing lower-level information from RSS feeds (like klick and me above).

I figured out a way for FeedGrab to be able to grab this information by using two Yahoo Pipes. For my purposes I was able to grab out my FriendFeed comments and feed them into my database.

The whole story, with links to the Pipes so you can mash your own, can be found here:

Integrating My FriendFeed Comments Into My Personal Blog

Hope this saves someone time in the future.

       
SSM
33 posts
16 years ago

What if you don’t want all those RSS feeds stored in a weblog? I want to import my message board feed, but I have no desire to store thousands of message board thread titles. I just want the last 5 threads on my front page.

Is there a solution for this?

       
J. Hull
132 posts
16 years ago

I think you would use Magpie for that (built into EE if I’m not mistaken).

I think the difference is:

FeedGrab is for taking the information in RSS feeds and putting them in your weblog. Magpie is for simply displaying RSS feeds within your template (no information copied over).

I could be wrong, but that is my understanding.

       
Hop Studios
459 posts
16 years ago

I’m getting duplicate entries sometimes, like Mike Essl above. Any suggestions on how to debug?

TTFN Travis

       
Andrew Weaver
206 posts
16 years ago

I’m not sure why you are getting duplicates. This is how I import flickr feeds:

{exp:feedgrab
    url="http://api.flickr.com/services/feeds/photos_public.gne?id=25509357@N00&format=rss_200" 
    weblog="1" 
    title="title"
    date="dc:date.Taken"
    use="link|media:content@url|media:thumbnail@url|description|guid" 
    fields="flickr_link|flickr_image|flickr_thumbnail|flickr_description|flickr_guid" 
    unique="flickr_guid"
    category_field="media:category"
    category_group="2"
    category_delimiter="SPACE" 
}

I use the guid field as the unique value. Let me know if this helps.
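The idea behind using the guid as the unique value can be sketched in Python as follows. This is only an illustration of guid-based de-duplication, not FeedGrab's actual code: `SAMPLE_FEED`, `new_items`, and the example guids are all made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Made-up feed: the third item repeats the guid of the first.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>One</title><guid>tag:example.com,2008:1</guid></item>
  <item><title>Two</title><guid>tag:example.com,2008:2</guid></item>
  <item><title>One again</title><guid>tag:example.com,2008:1</guid></item>
</channel></rss>"""


def new_items(feed_xml, seen_guids):
    """Return titles of items whose guid has not been seen; update seen_guids."""
    fresh = []
    for item in ET.fromstring(feed_xml).iter("item"):
        guid = (item.findtext("guid") or "").strip()
        # Items without a guid cannot be compared, so this sketch skips them.
        if not guid or guid in seen_guids:
            continue
        seen_guids.add(guid)
        fresh.append(item.findtext("title"))
    return fresh


seen = set()
print(new_items(SAMPLE_FEED, seen))
```

Because a guid is meant to identify an item permanently, it tends to be a steadier key than a date or a link that may be re-encoded between fetches.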

       
J. Hull
132 posts
16 years ago

Just so everyone knows, now that the excitement has died down with my whole Yahoo Pipes solution in combination with FeedGrab - it’s not really working out as well as I thought. There is some caching issue with Yahoo Pipes that isn’t allowing the feeds to be updated. Consequently it doesn’t work so well.

I’m hoping XMLGrab will be the solution; if not, I’ll have to write my own plugin (which I’ve never done!).

       
Hoosteeno
109 posts
16 years ago

Regarding Travis’s duplicate issue above (http://ellislab.com/forums/viewthread/37598/P144/#433002):

It appears that the text of values inserted into the database is urlencoded. If you’re using a URL value as your unique identifier, and if the URL includes certain characters, is_entry_unique() will always return true, even if that URL is already present in the database (“%2b” does not equal “+”, for example).

We added a urldecode() call to the is_entry_unique() function (at approx. lines 586-588):

// MODIFIED BY JUSTIN CRAWFORD
//$sql .= " AND " . $name . "=\"" . $DB->escape_str( $post[ $value ] ) . "\"";
$sql .= " AND " . $name . "=\"" . $DB->escape_str( urldecode( $post[ $value ] ) ) . "\"";

-Justin
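The mismatch described above can be reproduced in a few lines of Python. This is a sketch of the comparison problem only, assuming the stored value is decoded and the incoming value is still URL-encoded; `looks_unique` and the example URLs are hypothetical stand-ins, not the plugin's code.

```python
from urllib.parse import unquote


def looks_unique(incoming_value, stored_values, decode=True):
    """Return True if incoming_value is not among the stored values.

    With decode=False the raw string is compared, so "%2b" never matches "+".
    Note: PHP's urldecode() also turns "+" into a space; unquote() does not.
    """
    candidate = unquote(incoming_value) if decode else incoming_value
    return candidate not in stored_values


stored = {"http://example.com/photo?tags=a+b"}
incoming = "http://example.com/photo?tags=a%2bb"

# Without decoding, the entry is wrongly reported as unique (a duplicate is made).
print(looks_unique(incoming, stored, decode=False))
# With decoding, the existing entry is recognised.
print(looks_unique(incoming, stored, decode=True))
```

The point is that both sides of the comparison need the same encoding before they are tested for equality.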

       
SSM
33 posts
16 years ago

I have 36,000 RSS entries I would like to import. Any advice on how to approach this?

       
scottb
343 posts
16 years ago

Update: You can ignore the following post. I found what I needed to fix it.

I built out test pages on our existing site (version 1.6.3), and everything worked great. Then I copied the pages over to a new site (v1.6.4) along with the plugin. Suddenly on the new site I’m getting url output errors.

This code:

{title}

<div class="rss_description">{rss_body}</div>

This works correctly on the first site. On the second site, however, {rss_url} outputs a complete link, including the <a> tag, so I end up with two <a> tags.

I have looked over the admin options for setting url output but don’t see anything that would indicate why I’m getting the errors. Any suggestions?

       
klick
49 posts
16 years ago

And I’m with klick above - need to be able to access that lower level data in FriendFeed AND be able to extract the two different titles from Google Reader shared items which currently look like this:

still haven’t figured that out in detail. 😉

plus having another problem: my delicious feed keeps sticking empty entries (all with timestamp 0100) between the other weblog entries when updating the “plugin call” template.

it also echoes this error code:

Notice: strtotime() [function.strtotime]: Called with empty time parameter in …/xx/plugins/pi.xmlgrab.php on line 489

this causes my feeds to break. if anyone has got a clue what’s happening let me know 😊

ah and the twitter stream has some character encoding problems. i’m on that.

besides that: wonderful plugin! thanks!

best klick
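The strtotime() notice quoted above says the date value passed in was empty, which would be consistent with the empty entries. As a hedged illustration (in Python rather than the plugin's PHP, and not the plugin's actual logic), a feed importer can check for a missing date before parsing it; `parse_pub_date` is a hypothetical helper.

```python
from email.utils import parsedate_to_datetime


def parse_pub_date(text):
    """Parse an RFC 822 style pubDate; return None if it is empty or invalid."""
    if text is None or not text.strip():
        return None
    try:
        return parsedate_to_datetime(text.strip())
    except (TypeError, ValueError):
        # Unparseable date strings are treated the same as missing ones.
        return None


print(parse_pub_date("Sat, 06 Sep 2008 10:00:00 +0000"))
print(parse_pub_date(""))
```

An importer could then skip an item whose date comes back as None, or fall back to the current time, instead of storing it with a zero timestamp.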

       
chabbs
12 posts
16 years ago

Does this look right?

{exp:cron minute="30" hour="*" day="*" month="*" plugin="feedgrab:FeedGrab"}
{exp:feedgrab url="my-rss" 
                          weblog="70" 
                          title="title"
                          date="pubDate"
                          use="link|description" 
                          fields="music-url|music-body" }
{/exp:cron}

I’ve been trying to figure out how the cron plugin works with FeedGrab for the last 3 hours and I haven’t been able to find an answer. Can someone please tell me if this is right because it doesn’t seem to work?

Thanks for any help.

       
ms
274 posts
16 years ago

While I can’t tell you if the parameters are correct, I can tell you that ee:cron only works if the page is visited frequently. It’s not a true cron; it just checks the time span between page views. So if you have a test page nobody is visiting, ee:cron will never fire, regardless of how much time has passed. I use a real cron job to make sure the page is visited regularly.

Markus
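One way to have a real cron job "visit" the page is a small script that requests the template URL. The following Python sketch assumes the template containing the feedgrab tag is reachable over HTTP; the script name, the URL, and `visit_template` are placeholders, not part of EE or FeedGrab.

```python
import sys
import urllib.request


def visit_template(url, opener=urllib.request.urlopen, timeout=30):
    """Request the page once so the server parses the template and its tags."""
    with opener(url, timeout=timeout) as response:
        return response.status


# Runs only when a URL is given on the command line, for example:
#   python3 visit_template.py http://example.com/index.php/feeds/grab
if __name__ == "__main__" and len(sys.argv) == 2:
    print(visit_template(sys.argv[1]))
```

A system crontab line along the lines of `30 * * * * python3 /path/to/visit_template.py http://example.com/index.php/feeds/grab` would then request the page at half past every hour; both paths are made-up examples.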

       
chabbs
12 posts
16 years ago

Thanks Markus. I guess I’ll have to look into using a real cron job.

       