I personally have a love-hate relationship with mitosis.

My experience is that, overall, it provides nice performance improvements across a reasonably wide spectrum of query / data combinations. Take that as a very rough indication only.

Do not expect it to always be beneficial, though, because it won't be. As you have found yourself, it slows certain query / data combinations down considerably.

Notice that I talk about query / data combinations, not just queries. Data splitting for parallel processing is a bet. The query plan becomes more complicated and it can happen that you end up doing much more work than you would do without splitting (not to mention that mitosis considers only very simple strategies for splitting).

An example of how things can go very wrong (8 minutes vs. 14 seconds) - just because of data distribution: https://www.monetdb.org/bugzilla/show_bug.cgi?id=3437
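
To make the "bet" a bit more concrete, here is a toy cost model in Python. It is only a sketch and not how mitosis or the MonetDB kernel actually computes anything: the per-row costs, the fixed per-piece overhead, and the function names are all made up for illustration. It shows just one effect, namely that equal-sized row ranges can carry very unequal amounts of work, so the slowest piece plus the splitting overhead can end up costing more than the unsplit plan. It does not model the extra work a more complicated plan may do.

```python
def split_even(costs, pieces):
    """Split per-row costs into `pieces` contiguous, near-equal row ranges."""
    size, extra = divmod(len(costs), pieces)
    chunks, start = [], 0
    for i in range(pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(costs[start:end])
        start = end
    return chunks


def serial_time(costs):
    """Unsplit plan: one worker processes every row."""
    return sum(costs)


def parallel_time(costs, pieces, per_piece_overhead):
    """Split plan: wall time is the slowest piece plus a fixed
    (hypothetical) overhead per piece for the extra plan steps."""
    slowest = max(sum(chunk) for chunk in split_even(costs, pieces))
    return slowest + per_piece_overhead * pieces


# Two made-up data sets with the same total work (64000 units).
uniform = [8] * 8000                 # every row costs the same
skewed = [1] * 7000 + [57] * 1000    # expensive rows all land in the last range

for name, costs in (("uniform", uniform), ("skewed", skewed)):
    print(name, serial_time(costs), parallel_time(costs, 8, 1000))
```

With these invented numbers, the uniform data finishes in a quarter of the serial time when split into 8 pieces, while the skewed data comes out slightly slower than not splitting at all. The query is the same in both cases; only the data distribution differs.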

The current version of mitosis, together with the underlying data statistics available at optimization time, cannot do much better, I'm afraid.

If your data / query pool doesn't vary very much, I suggest you take a close look at what gets slow and why, and then decide whether to keep mitosis or not. Or, better: work out which data / query patterns benefit from mitosis and which do not.
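
One way to do that comparison systematically is a small timing harness like the Python sketch below. It assumes that the pipeline can be switched per session with a `SET optimizer = '...'` statement and that the pipelines are named `default_pipe` and `no_mitosis_pipe`; please check both against your own server version. The `execute` argument is a placeholder for whatever runs a statement on your connection (for example the `execute` method of a database cursor), and the function names and the 1.2 margin are made up for illustration.

```python
import time


def _timed(execute, sql):
    """Run one statement and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    execute(sql)
    return time.perf_counter() - start


def time_pipelines(execute, queries,
                   pipelines=("default_pipe", "no_mitosis_pipe"),
                   repeats=3):
    """Return {query: {pipeline: best_seconds}}.

    `execute` is any callable that runs a single SQL statement.
    The SET statement and the pipeline names are assumptions.
    """
    results = {}
    for pipe in pipelines:
        execute(f"SET optimizer = '{pipe}';")
        for query in queries:
            # Keep the best of several runs to reduce cache and noise effects.
            best = min(_timed(execute, query) for _ in range(repeats))
            results.setdefault(query, {})[pipe] = best
    return results


def prefers_no_mitosis(results, margin=1.2):
    """List the queries that were clearly faster without mitosis."""
    return [
        query
        for query, times in results.items()
        if times["default_pipe"] > margin * times["no_mitosis_pipe"]
    ]
```

Run it against representative data, not a small sample, since the outcome depends on the query / data combination and not on the query text alone. Note that timing only `execute` may leave out the cost of fetching the result set, depending on the client library.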

Regards,

Roberto

On 5 January 2015 at 11:36, Masood Mortazavi <masoodmortazavi@gmail.com> wrote:

Your question is somewhat confusing but here I share some of our experience related to the same topic.

Our experiments with MonetDB show that the default mitosis optimization MonetDB provides (splitting the longest column in the execution plan to initiate plan graph mitosis) works quite well on multicore systems for typical queries.

We did some further work on adding an optimization module where the table(s) selected for "mitosis"-splitting, and the number of splits (always kept, in total, below the number of available cores), can vary according to some "optimization strategy" related to our estimate of total processing cost. We published this work in brief form last year, along with some simplifications and findings.

Regards,
Masood Mortazavi

On Sunday, January 4, 2015, Vijay Krishna <vijayakrishna55@gmail.com> wrote:
Hi,

I have been working on join performance with MonetDB. I tried using various optimizer pipelines.

A few costly queries that took 15 seconds with the default pipeline returned results in as little as 5 seconds with the 'no_mitosis_pipe' pipeline.

I would like to learn more about the mitosis optimizer. Is there any reference on what it does?
From the mserver5 man page, I got this - "forcefully activate mitosis even on small tables, i.e., split small tables in as many (tiny) pieces as there are cores (threads) available;"
So, does this mean that with the mitosis optimizer the tables are split and the pieces processed in parallel? If so, then why are queries slower with the mitosis optimizer in the pipeline?

Also, the monetdb man page warns that "Changing this setting is discouraged at all times."
What is the disadvantage of changing the optimizer pipeline to something other than 'default_pipe'? Even though 'no_mitosis_pipe' is stable, is it suitable for production use?

Any help is much appreciated.

Thanks & Regards,

Vijayakrishna.P.
Mobile : (+91) 9500402305.
