Thank you Stefan,

So as I understand that the algebra.(left)fetchjoin() is the most optimized way in monetdb.

I was hoping in my case to gain some precious seconds by trying to instruct monetdb to use other algorithm,

I have a small table around 50k rows (that should fit easily in memory) that need to be joined with a big table of 12M rows, the join results is around 150M rows.

The join key is a char(21) column which is the first column of a primary key defined on both tables (I think this why its reflected in the algebra.(left)fetchjoin()).

On the join I am doing an aggregate on a combination of columns from both tables.

The execution time is around 80s, returning 250 rows, similar queries on different columns needs to be executed around 50 times.

I am trying to avoid to materialize the join results in a table and then execute the aggregation on the materialized table, the reason is that it will consume a 35GB of disk space.

Is it hard to achieve a better performance by relying only on the join?

Another performance issue i am facing is with range join on date columns, does monetdb has a specific method/index for this kind of join?

Regards.

On Mon, Nov 2, 2015 at 7:48 PM, Stefan Manegold <Stefan.Manegold@cwi.nl> wrote:

Hi,

whenever algebra.(left)fetchjoin() is used, you surely do not want to use hash-join or merge-join, instead.
fetch-join is MonetDB lingo for the binary match on OIDs that can exploit positional lookups because (at least)
one of the arguments is a dense (ascending) (non-materialized) sequence of OIDs (aka., "Virtual OIDs" or "VOID").
This implicitly performs a (very efficient) merge whenever also the second argument is sorted,
and otherwise implicitly a very efficient hash using identity as a perfect hash function (no collisions, no false positives).
This operation does not come from / represent a table-join on SQL level, but rather the projection (aka.
column reconstruction) required in (late-materialization) columnar query execution.

Having said that, in "real" join cases, i.e., when algebra.join() is used on value (rather than OID) columns,
the actual join algorithm is chosen on-the-fly in side the algebra.join() MAL operation, and thus does not
show in the MAL plan.
You can see (some of) the internal algorithm decisions by starting mserver5 with the --algorithm command line switch
(see the mserver5 man page for details).

There is no way to "overrule" the runtime algorithm decisions made inside MonetDB's kernel (GDK).

Best,
Stefan

----- On Nov 2, 2015, at 5:37 PM, imad hajj chahine imad.hajj.chahine@gmail.com wrote:

> Hi,
>
> How can we control the algorithm used for the join operation? In the trace i am
> always getting algebra.leftfetchjoin, what could possibly be done to use merge
> join or hash join that will be reflected in the trace by algebra.join.
>
> Thanks.
>

> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list

--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list