Hi Hannes,
This error is due that the "2-age" column has more than 2 distinct values,
there is a space after some values. can you run a trim on the "2-Age" and
then try it again.
As for the vector thing, I didnt get the point, shouldnt the db engine
split the data into vectors and feed the aggregate function which return
one data point?
Thanks
On Fri, Jun 19, 2015 at 11:12 AM, Hannes Mühleisen wrote: I am running attached script, which I pieced together from your emails.
This does not run, it produces the following error: Error running R expression. Error message: Error in t.test.formula(arg1 ~
arg2) :
grouping factor must have exactly 2 levels
Calls: as.data.frame -> <Anonymous> -> t.test -> t.test.formula In general, for aggregations, the R code needs to return a vector with a
single entry for each group. Best, Hannes On 19 Jun 2015, at 09:44, imad hajj chahine Thanks, if you need more information from my side let me know. Also can you answer me on the following when i use 2 functions one to
retrieve Pvalue and one for the stat value:
Am I doing the correct way to retrieve the stat and p-value, is the
ttest get executed twice in this case? is there a more optimal way where i
can retrieve both information. On Fri, Jun 19, 2015 at 10:39 AM, Hannes Mühleisen <
Hannes.Muehleisen@cwi.nl> wrote:
Looking into this at the moment. On 16 Jun 2015, at 09:56, Dimitar Nedev <
dimitar.nedev@monetdbsolutions.com> wrote: Hannes, do you know if the STD_DEV is the issue there or if the
queries should actually yield different results? Dimitar On 2015-Jun-15, at 13:28 , imad hajj chahine <
imad.hajj.chahine@gmail.com> wrote: Hi, Shouldn't we have the same result from running the first query and
the second query?
after all its an aggregate function and should return the test for
each group. I checked that monetdb had problem with running group by with
statistics functions like STD_DEV, are these 2 issues related? Thank you. On Mon, Jun 15, 2015 at 1:52 PM, Dimitar Nedev <
dimitar.nedev@monetdbsolutions.com> wrote:
Hi Imad, Here is the output I got from running the queries. Note, this might
not be entirely representative since I did clean-up some data before
loading it. sql>select "2-Age", "tone"."ttest2"("9-Score",6.08963) from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" is not null group by "2-Age";
+---------+--------------------------+
| 2-Age | L1 |
+=========+==========================+
| [10-20] | 0.89311103869734754 |
+---------+--------------------------+
3 tuples (3.218ms) sql>select '[10-20]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '[10-20]'
more>union all
more>select '[21-30]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '[21-30]'
more>union all
more>select '[> 30]' as "2-Age","tone"."ttest2"("9-Score",6.08963)
from "tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" where
"2-Age" = '> 30';
+---------+--------------------------+
| 2-Age | L1 |
+=========+==========================+
| [10-20] | 0.14150742355349472 |
| [21-30] | 0.32872830865003566 |
| [> 30] | 0.37127075363219841 |
+---------+--------------------------+
3 tuples (8.825ms) sql>select "tone"."ttest2samplesStatistic"("9-Score", "4-Gender") as
"stat", "tone"."ttest2samples"("9-Score", "4-Gender") as "pvalue" from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" group by
"2-Age";
+--------------------------+--------------------------+
| stat | pvalue |
+==========================+==========================+
| -2.1588506639013736 | 0.032555033970368859 |
+--------------------------+--------------------------+
1 tuple (13.287ms) I dropped the aggregate with:
sql>DROP AGGREGATE "tone"."ttest2samplesStatistic";
operation successful (2.049ms) Best regards,
Dimitar On 2015-Jun-12, at 00:46 , imad hajj chahine <
imad.hajj.chahine@gmail.com> wrote: Thank you Dimitar, I was using Squirrel to connect from my host machine to the VM. Now when I connect using SSH-client and run the short group by query
it works but the result is not similar to the union queries (PS: i also run
it in the VM direclty and had the same result) +---------+--------------------------+
| 2-Age | pval |
+=========+==========================+
| [21-30] | 0.99996611809271041 |
+---------+--------------------------+
3 tuples (2.638ms) What i am trying to achieve is to run a one sample ttest for all the
age categories, and it seems that the group by is ignored. I have also created 2 functions for the two sample ttest: CREATE AGGREGATE "tone"."ttest2samples"(arg1 double, arg2
varchar(250)) RETURNS double LANGUAGE R { t.test(arg1 ~ arg2)$p.value };
CREATE AGGREGATE "tone"."ttest2samplesStatistic"(arg1 double, arg2
varchar(250)) RETURNS double LANGUAGE R { t.test(arg1 ~ arg2)$statistic }; select "tone"."ttest2samplesStatistic"("9-Score", "4-Gender") as
"stat", "tone"."ttest2samples"("9-Score", "4-Gender") as "pvalue" from
"tone"."Marketing_Loyalty_4700298d-9862-40b3-9028-b0f15dab9dea" group by
"2-Age" Am I doing the correct way to retrieve the stat and p-value, is the
ttest get executed twice in this case? is there a more optimal way where i
can retrieve both information. Please note that I also have the same
grouping issue as before. Do you have the same behavior, or I am missing something in the
creation of these functions. A final question, how do u drop a R function? i am trying to execute
the following:
drop all function "tone"."ttest2samplesStat" Thank you. On Thu, Jun 11, 2015 at 11:28 PM, Dimitar Nedev <
dimitar.nedev@monetdbsolutions.com> wrote:
Hi Imad, Hannes asked me to look at your issue with the MonetDB-R VM. I loaded the data and ran the queries you mentioned, but I did not
encounter any issues. This holds for both the short and the union queries.
I am also not familiar with the issue you report, nor its cause. At this
time I think it is local to your instance. If you can export your db, or
re-load the data, I would recommend you to create a new instance with a
fresh image and try again. If you import the image as a new VM instance,
you can keep your old one around as well. Please note that during the test:
- I ran as the admin (monetdb) user, as the one created default on
the VM image cannot create new schemas.
- I have to clean the data a bit up to load it, removing dozen lines
or so.
- Renamed ttest1 to ttest2 to match the queries. Best regards,
Dimitar >>>>> Begin forwarded message:
>>>>>
>>>>> From: imad hajj chahine