I’ve got to import several thousand entries into EE and thought I’d give CSVGrab a try. Initially I broke the data into 1,000-line chunks and have gone right down to a single line of data. Despite that I’m having the very same problem as David: the page loads for a good few minutes before timing out. I’ve tried adding some debugging messages to the plugin and flushing the output buffer, but it appears to have no effect within plugins (or perhaps that’s a symptom of whatever the problem is).
If anyone has had this problem and found a solution, I’d love to hear it.
Thanks, Dom
Hey Dom.
My problem was in how the external file was being referenced. My host didn’t allow PHP to load the file over HTTP, the way CSVGrab asks for it. I just replaced the HTTP file call with a server path and it all worked.
{exp:csvgrab url="/home/directory/directory/bands.csv"
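If you’re not sure whether your host blocks remote file loads, the usual culprit is the `allow_url_fopen` setting. Here’s a quick check you could run (just a sketch; the function name is my own, and this only tests the common case, not every host restriction):

```php
<?php
// Report whether this PHP install permits opening remote URLs with
// the file functions (fopen/file_get_contents), which an http://
// url parameter relies on.
function remote_loading_enabled(): bool
{
    return (bool) ini_get('allow_url_fopen');
}

if (!remote_loading_enabled()) {
    echo "allow_url_fopen is off; use a server path in the url parameter instead.\n";
}
```

If it reports off, the server-path workaround above is the way to go.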
Hope this helps.
Hi Dom,
Thanks for the post. It is always appreciated when people tell me of a problem (even if my first reaction is less positive).
Out of interest, from your debugging could you tell whether the CSV was getting loaded at all? I think some web hosts have tighter PHP security than others, which can prevent the file from loading.
Andrew
Thanks for the quick responses, guys. Unfortunately, in an all-time first, I really wish you hadn’t: as soon as I submitted the post I scanned through my code again and realised that a closing bracket was missing from the plugin tag. I could’ve sworn I put one in there after the 50-item field list, but apparently not! Anyway, there’s no way to delete my post, so I’ll have to own up to it. 😛
I’m now in the process of importing nearly 22,000 entries, each with around 50 fields. So far the plugin has been faultless, and I can’t thank you enough, Andrew, for providing such a time-saving tool. One change I did have to make, which may be useful to others: the duplicate title checking was a problem for me because I’m importing a huge list of businesses, some of which share the same name. The plugin’s standard behaviour was to skip 6 of the 7 businesses named ‘7 Eleven’, for instance.
To disable this, just change the following at around line 148:
if ($query->num_rows == 0) {
To:
if ($query->num_rows == 0 or $query->num_rows > 0) {
(Yes you could also remove the if statement entirely, but this is a little easier than digging around for the closing bracket as well, and we all know what trouble I have with them.)
Also if anyone else has so much data they need to split it across several files, here’s a semi-batch import solution which may be of use:
<?php
print "<h1>IMPORTING DATA</h1>";
ob_flush();
flush();
$nextpage = "{segment_3}" + 1;
?>
{exp:csvgrab
url="http://yoursite.com/data/{segment_3}.txt"
otherparams
}
<a href="http://{path=import_debug/csvgrab/<?=$nextpage?>}">Next File</a>
Finally, again if you need to split a large file, I used GSplit, which is free and lets you specify exactly how many lines you want per file.
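If you’d rather script the split than use a GUI tool, here’s a minimal PHP sketch. The function name, the 1.txt/2.txt chunk naming (chosen to match the {segment_3}.txt pattern above), and the assumption of a single header row are all my own:

```php
<?php
// Split a CSV into chunks of $linesPerFile data lines each, repeating the
// header row at the top of every chunk. Chunks are written as 1.txt,
// 2.txt, ... in $destDir. Returns the number of chunks written.
function split_csv(string $source, int $linesPerFile, string $destDir): int
{
    $in = fopen($source, 'r');
    $header = fgets($in);          // keep the header for every chunk
    $chunk = 0;
    $count = 0;
    $out = null;

    while (($line = fgets($in)) !== false) {
        if ($count % $linesPerFile === 0) {
            if ($out !== null) {
                fclose($out);
            }
            $chunk++;
            $out = fopen($destDir . '/' . $chunk . '.txt', 'w');
            fwrite($out, $header);
        }
        fwrite($out, $line);
        $count++;
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $chunk;
}
```

For example, `split_csv('data.csv', 1000, 'data')` would turn a 22,000-row file into 22 chunks of 1,000 rows each, ready for the batch template above.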
Hopefully that’s all of use to others if they need to do something similar in the future.
P.S. Small world, David. I’m in Portsmouth and have been to a fair chunk of the venues on your photos site. Nice site it is too btw!
Thanks guys. The URL title is indeed a concern but I’m not sure at this stage if we’ll be using them at all. If we are then it should be relatively simple to come up with a solution that runs through the DB and appends an auto_increment value to any duplicate titles. If we’re not I don’t think there should be any negative effects, but I will double and triple check that.
Oh absolutely, I’m aware that I will have duplicate url titles currently, it’s just whether or not that’s actually a concern at this point. If we’re not going to use them I wouldn’t imagine it will matter.
Great to hear you’re adding an auto increment feature in the future though.
Thanks again, Dom
EDIT –
Apparently, while my “Reply to Post” window was open overnight, there were a whole bunch of new posts addressing this exact issue.
So, please feel free to pretend that I’m not an idiot who doesn’t read before posting.
I will, however, leave the post here to make my specific thoughts available.
Thanks, Philip Jones
END EDIT –
Andrew,
Do you have any plans to update this plugin?
Peeking at the code, I noticed that it checks whether an entry’s title is unique before inserting it. It also doesn’t restrict that query to any particular weblog.
It is my understanding that duplicate titles are acceptable, just not duplicate URL Titles within the same weblog.
Could this be updated to check “url_title” for duplicates instead of “title”, as well as adding the “weblog_id” to the comparison?
And as a feature request, it would be nice if it could append a number to create a unique URL Title when a duplicate is found, instead of skipping duplicate URL Titles. As some may not desire this behavior, this should probably be optional.
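For what it’s worth, the append-a-number behaviour could be sketched like this. The function name and the array-based lookup are purely illustrative; the real plugin would instead query exp_weblog_titles for url_title matches within the given weblog_id:

```php
<?php
// Given a candidate url_title and the url_titles already used in the
// target weblog, append an incrementing suffix until the title is
// unique. Sketch only: $existing stands in for a database lookup.
function unique_url_title(string $candidate, array $existing): string
{
    if (!in_array($candidate, $existing)) {
        return $candidate;
    }
    $n = 1;
    while (in_array($candidate . '_' . $n, $existing)) {
        $n++;
    }
    return $candidate . '_' . $n;
}
```

So a second ‘7 Eleven’ entry would get the url_title `7_eleven_1` rather than being skipped.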
Thanks, Philip Jones
I noticed that I can add categories based on
{exp:csvgrab url="http://URL.csv"
weblog="3"
title="1"
delimiter=","
encloser="QUOTE"
category_field="9"
category_group="1"
use="2|3|4|5|6|7|8|9|10"
fields="last|first|jobtitle|webaddress|emailaddress|office|doffice|zip|dist" }
But after adding the entries they are not actually attached to the categories that were added.
I could have sworn that this worked the first time I tested this but not since…
Adding the categories isn’t much use if they can’t also be assigned to the entries automatically, given the large number of categories and entries I have.