Hi Roberto,

Good to hear you solved it. And indeed the right way ;)

Without further knowledge of the sizes and timing involved, I can only guess that the BATs are large and may have to be removed from the directories (i.e. system calls), which would explain the cost.

The general mechanism is to use BBPunfix(), provided the reference counts are properly set during the operations (to be checked for your case).

On 25/01/2018 22:08, Roberto Cornacchia wrote:
Thanks Martin,
Actually I was doing that in my original code, and grouped all the BBPreclaim calls at the end only to make them easier to measure. It makes no difference in my case.
The issue seems simply to be that BBPreclaim is a more expensive operation than I thought, so calling it many times kills performance. I was hoping that there could be a similar but cheaper function that would be sufficient in this case. I always get confused between BBPreclaim, BBPrelease and BBPunfix.
I actually solved my problem the right way, rewriting the whole thing to work in larger batches. This brings the total time to 15% of the original one.
Roberto
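
[Editorial sketch] The batching idea above can be illustrated with a toy model. This is not GDK code: tmpbuf, tmp_new and tmp_reclaim are hypothetical stand-ins for a temporary BAT and its creation and reclaim, and the doubling step is a placeholder for the real per-item work. The sketch only counts how many temporaries are created and reclaimed; it makes no claim about what BBPreclaim actually costs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a temporary BAT: a heap buffer of ints. */
typedef struct {
    int *vals;
    size_t count;
} tmpbuf;

static size_t n_created = 0;   /* temporaries created so far */
static size_t n_reclaimed = 0; /* temporaries reclaimed so far */

static tmpbuf *tmp_new(size_t count)
{
    tmpbuf *t = malloc(sizeof(*t));
    if (t == NULL)
        return NULL;
    t->vals = malloc(count * sizeof(*t->vals));
    if (t->vals == NULL) {
        free(t);
        return NULL;
    }
    t->count = count;
    n_created++;
    return t;
}

static void tmp_reclaim(tmpbuf *t)
{
    if (t == NULL)
        return;
    free(t->vals);
    free(t);
    n_reclaimed++;
}

/* One temporary per input item: n creations and n reclaims. */
static int process_per_item(const int *items, size_t n, long *sum)
{
    *sum = 0;
    for (size_t i = 0; i < n; i++) {
        tmpbuf *t = tmp_new(1);
        if (t == NULL)
            return -1;
        t->vals[0] = items[i] * 2;   /* placeholder "transform" step */
        *sum += t->vals[0];
        tmp_reclaim(t);
    }
    return 0;
}

/* One temporary per batch: ceil(n / batch) creations and reclaims. */
static int process_batched(const int *items, size_t n, size_t batch, long *sum)
{
    assert(batch > 0);
    *sum = 0;
    for (size_t start = 0; start < n; start += batch) {
        size_t len = (n - start < batch) ? n - start : batch;
        tmpbuf *t = tmp_new(len);
        if (t == NULL)
            return -1;
        for (size_t j = 0; j < len; j++)
            t->vals[j] = items[start + j] * 2;
        for (size_t j = 0; j < len; j++)
            *sum += t->vals[j];
        tmp_reclaim(t);
    }
    return 0;
}
```

Both functions compute the same sum; only the number of create/reclaim pairs differs. If each reclaim carries a roughly fixed overhead, as the thread suggests for BBPreclaim, fewer and larger temporaries spread that overhead over more items.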
On Thu, 25 Jan 2018 at 20:53 Martin Kersten <martin.kersten@cwi.nl> wrote:
Try to reclaim a.s.a.p.; this reduces resource competition.
martin
Sent from my iPad
On 25 Jan 2018, at 19:22, Roberto Cornacchia <roberto.cornacchia@gmail.com> wrote:
One obvious optimization was to avoid the explicit BATproject and use the candidate input of BATappend instead, when possible. This improved things, and now the cleanup takes 3/4 of the total time.
Still...
loop(...) {
    BAT *gn, *en, *hn;
    BAT *idxn, *pn;

    // tokenize, group, transform, append
    // this takes 1/4 of the total loop time
    if (BATutf8_tokenize(&tokens, s, delims, min_tok_len) != GDK_SUCCEED)
        goto fail;
    if (BATgroup(&gn, &en, &hn, tokens, NULL, NULL, NULL, NULL) != GDK_SUCCEED)
        goto fail;
    idxn = BATconstant(0, TYPE_int, &idx, BATcount(en), TRANSIENT);
    pn = BATconvert(hn, NULL, TYPE_dbl, TRUE);
    if (BATappend(br1,idxn,NULL,FALSE) != GDK_SUCCEED)
        goto fail;
    if (BATappend(br2,tokens,en,FALSE) != GDK_SUCCEED)
        goto fail;
    if (BATappend(br3,pn,NULL,FALSE) != GDK_SUCCEED)
        goto fail;

    // this takes 3/4 of the total loop time
    BBPreclaim(idxn);
    BBPreclaim(pn);
    BBPreclaim(hn);
    BBPreclaim(gn);
    BBPreclaim(en);
    BBPreclaim(tokens);
}
On Thu, 25 Jan 2018 at 18:26 Roberto Cornacchia <roberto.cornacchia@gmail.com> wrote:
Hi there,
Would someone like to pick up this wonderful opportunity to tell me that I'm doing something silly? Please do.
In this loop, the cleanup takes 4/5 of the total time. Is there a better way of doing this?
loop(...) {
    BAT *gn, *en, *hn;
    BAT *idxn, *tn, *pn;

    // tokenize, group, transform, append
    // this takes 1/5 of the total loop time
    if (BATutf8_tokenize(&tokens, s, delims, min_tok_len) != GDK_SUCCEED)
        goto fail;
    if (BATgroup(&gn, &en, &hn, tokens, NULL, NULL, NULL, NULL) != GDK_SUCCEED)
        goto fail;
    idxn = BATconstant(0, TYPE_int, &idx, BATcount(en), TRANSIENT);
    tn = BATproject(en,tokens);
    pn = BATconvert(hn, NULL, TYPE_dbl, TRUE);
    if (BATappend(br1,idxn,NULL,FALSE) != GDK_SUCCEED)
        goto fail;
    if (BATappend(br2,tn,NULL,FALSE) != GDK_SUCCEED)
        goto fail;
    if (BATappend(br3,pn,NULL,FALSE) != GDK_SUCCEED)
        goto fail;

    // this takes 4/5 of the total loop time
    BBPreclaim(idxn);
    BBPreclaim(tn);
    BBPreclaim(pn);
    BBPreclaim(hn);
    BBPreclaim(gn);
    BBPreclaim(en);
    BBPreclaim(tokens);
}
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list