This error occurs because the "2-Age" column has more than 2 distinct values:
there is a space after some values. Can you run a trim on "2-Age" and
then try again?
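A minimal sketch of that trim step, assuming the table and column names used in the queries in this thread, that the stray characters are ordinary spaces, and that the built-in trim() string function is available. The first statement only makes the whitespace visible; the second rewrites the column in place, so it may be worth trying on a copy of the table first.

```sql
-- Assumption: the stray characters are plain spaces around the value.
-- Show each distinct value between markers so whitespace becomes visible.
SELECT DISTINCT '<' || "2-Age" || '>' AS marked_value
FROM "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea";

-- Strip leading and trailing spaces in place.
UPDATE "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea"
SET "2-Age" = trim("2-Age");
```

Re-running the first statement afterwards should list one marked value per category.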
As for the vector thing, I didn't get the point. Shouldn't the DB engine
split the data into vectors and feed each one to the aggregate function, which
returns one data point?
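On the vector point, here is a sketch of what a group-aware aggregate could look like. It assumes that MonetDB's embedded R hands an aggregate the whole column at once, together with a vector of group ids named aggr_group, rather than calling the function once per group; please check this against the MonetDB R documentation for your version. The name ttest1_grouped is hypothetical, and since the body of ttest2 is not shown in this thread, this is not a copy of it.

```sql
-- Hypothetical aggregate; "aggr_group" is assumed to be supplied by
-- MonetDB's R integration when the query has a GROUP BY.
CREATE AGGREGATE "tone"."ttest1_grouped"(arg1 double, mu double)
RETURNS double LANGUAGE R {
    # Split the scores by group id and run a one-sample t-test per group.
    # The result is a vector with one p-value per group, ordered by group id.
    as.numeric(tapply(arg1, aggr_group,
                      function(x) t.test(x, mu = mu[1])$p.value))
};

SELECT "2-Age", "tone"."ttest1_grouped"("9-Score", 6.08963) AS pval
FROM "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea"
WHERE "2-Age" IS NOT NULL
GROUP BY "2-Age";
```

Under these assumptions the aggregate returns a vector rather than a single number, and each group needs at least two scores for t.test to succeed.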
I am running the attached script, which I pieced together from your emails.
It does not run and produces the following error:
Error running R expression. Error message: Error in t.test.formula(arg1 ~
arg2) :
grouping factor must have exactly 2 levels
Calls: as.data.frame -> <Anonymous> -> t.test -> t.test.formula
In general, for aggregations, the R code needs to return a vector with a
Thanks, if you need more information from my side let me know.
Also, can you answer the following for the case where I use 2 functions, one to
retrieve the p-value and one for the stat value:
Am I retrieving the stat and p-value the correct way? Does the
t-test get executed twice in this case? Is there a more optimal way to
retrieve both values?
Hannes.Muehleisen@cwi.nl> wrote:
Looking into this at the moment.
dimitar.nedev@monetdbsolutions.com> wrote:
Hannes, do you know if the STD_DEV is the issue there or if the
Dimitar
imad.hajj.chahine@gmail.com> wrote:
Hi,
Shouldn't we get the same result from running the first query and
the second query?
After all, it's an aggregate function and should return the test for
each group. I checked and saw that MonetDB had problems running GROUP BY with
statistics functions like STD_DEV. Are these 2 issues related?
Thank you.
On Mon, Jun 15, 2015 at 1:52 PM, Dimitar Nedev <
dimitar.nedev@monetdbsolutions.com> wrote:
Hi Imad,
Here is the output I got from running the queries. Note that this might
not be entirely representative, since I cleaned up some data before
loading it.
sql>select "2-Age", "tone"."ttest2"("9-Score",6.08963) from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" is not null group by "2-Age";
| 2-Age | L1 |
| [10-20] | 0.89311103869734754 |
3 tuples (3.218ms)
sql>select '[10-20]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '[10-20]'
more>union all
more>select '[21-30]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '[21-30]'
more>union all
more>select '[> 30]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '> 30';
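+---------+---------------------+
| 2-Age   | L1                  |
+=========+=====================+
| [10-20] | 0.14150742355349472 |
| [21-30] | 0.32872830865003566 |
| [> 30]  | 0.37127075363219841 |
+---------+---------------------+
3 tuples (8.825ms)
sql>select "tone"."ttest2samplesStatistic"("9-Score", "4-Gender") as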
"stat", "tone"."ttest2samples"("9-Score", "4-Gender") as "pvalue" from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" group by
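+---------------------+----------------------+
| stat                | pvalue               |
+=====================+======================+
| -2.1588506639013736 | 0.032555033970368859 |
+---------------------+----------------------+
1 tuple (13.287ms)
I dropped the aggregate with: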
sql>DROP AGGREGATE "tone"."ttest2samplesStatistic";
operation successful (2.049ms)
Best regards,
Dimitar
Thank you Dimitar,
I was using Squirrel to connect from my host machine to the VM. Now, when I
connect using an SSH client and run the short GROUP BY query,
it works, but the result does not match the union queries (PS: I also ran
it in the VM directly and got the same result).
+---------+--------------------------+
| 2-Age | pval |
| [21-30] | 0.99996611809271041 |
3 tuples (2.638ms)
What I am trying to achieve is to run a one-sample t-test for all the
age categories, and it seems that the GROUP BY is ignored.
I have also created 2 functions for the two-sample t-test:
CREATE AGGREGATE "tone"."ttest2samples"(arg1 double, arg2
varchar(250)) RETURNS double LANGUAGE R { t.test(arg1 ~ arg2)$p.value };
CREATE AGGREGATE "tone"."ttest2samplesStatistic"(arg1 double, arg2
varchar(250)) RETURNS double LANGUAGE R { t.test(arg1 ~ arg2)$statistic };
select "tone"."ttest2samplesStatistic"("9-Score", "4-Gender") as
"stat", "tone"."ttest2samples"("9-Score", "4-Gender") as "pvalue" from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" group by
"2-Age"
Am I retrieving the stat and p-value the correct way? Does the
t-test get executed twice in this case? Is there a more optimal way to
retrieve both values? Please note that I also have the same
grouping issue as before.
Do you see the same behavior, or am I missing something in the
creation of these functions?
A final question: how do you drop an R function? I am trying to execute
the following:
Thank you.
dimitar.nedev@monetdbsolutions.com> wrote:
Hi Imad,
Hannes asked me to look at your issue with the MonetDB-R VM.
I loaded the data and ran the queries you mentioned, but I did not
encounter any issues. This holds for both the short and the union queries.
I am also not familiar with the issue you report, nor its cause. At this
time I think it is local to your instance. If you can export your db, or
re-load the data, I would recommend you to create a new instance with a
fresh image and try again. If you import the image as a new VM instance,
you can keep your old one around as well.
Please note that during the test:
- I ran as the admin (monetdb) user, as the one created by default on
the VM image cannot create new schemas.
- I had to clean up the data a bit to load it, removing a dozen lines
or so.
- I renamed ttest1 to ttest2 to match the queries.
Best regards,
Dimitar
>>>>> Begin forwarded message:
>>>>> From: imad hajj chahine
