Thanks a lot for the answer. I have a question. Even if I create the column as
a BIGINT, how will MonetDB make sure that imported data such as
"1010000000000110100101110101010" (let's say a 50-character string of 1s and 0s) is treated as a bit
string, not a number? What I mean is that BIGINT has a maximum value of
"9223372036854775807", and the imported
data "1010000000000110100101110101010" is larger than that if we
read it as a number rather than a bit string. I want to split each 10,000-character string of 1s
and 0s into 200 chunks of size 50 and import them into MonetDB. Let's say
the table is (Doc, Sentenceid, Chunkid, Chunk).
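
A minimal sketch of the chunking described above, in Python. The function name `chunk_bits`, the sample input, and the Doc and Sentenceid values are illustrative assumptions. It also checks that a 50-character chunk, when parsed as base 2 rather than base 10, stays below the BIGINT maximum.

```python
BIGINT_MAX = 9223372036854775807  # 2**63 - 1

def chunk_bits(bits, size=50):
    """Split a string of '0'/'1' characters into consecutive chunks of `size`."""
    if set(bits) - {"0", "1"}:
        raise ValueError("expected only '0' and '1' characters")
    return [bits[i:i + size] for i in range(0, len(bits), size)]

# Hypothetical 10,000-bit input, built by repeating a short pattern.
sentence_bits = "10" * 5000
chunks = chunk_bits(sentence_bits)

# Rows shaped like (Doc, Sentenceid, Chunkid, Chunk); each chunk is parsed
# as a base-2 integer, not as a decimal number.
rows = [(1, 1, chunk_id, int(chunk, 2)) for chunk_id, chunk in enumerate(chunks)]

print(len(chunks))                                      # 200
print(all(value <= BIGINT_MAX for *_, value in rows))   # True
```

Parsing with base 2 is the point here: 50 bits can encode at most 2**50 - 1, which is far below 2**63 - 1.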
On Fri, Mar 11, 2016 at 8:26 AM, Stefan Manegold wrote:
Hi Shmagi,
now that we understand a bit better what you want --- storing and bitwise-anding bit-strings of 10,000 bit each --- the only (existing) types in MonetDB that support that directly would (indeed) be either
(1) TEXT | STRING | CLOB | CHARACTER LARGE OBJECT string with unbounded length
or
(2) BLOB | BINARY LARGE OBJECT bytes with unbounded length
cf., https://www.monetdb.org/Documentation/Manuals/SQLreference/BuiltinTypes
In the first case, each bit (w|c)ould be represented by (i) a character '0' or '1', i.e., taking a whole byte of storage --- unless you have (or create) a textual (CSV) representation that (ii) packs bits into valid UTF-8 characters.
In the second case, each bit (w|c|s)ould use only one bit of storage.
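
To make the storage difference between the two cases concrete, here is a hedged Python sketch that packs a '0'/'1' string into bytes, one bit of storage per bit. The helper names `pack_bits` and `unpack_bits` are hypothetical. The hex output is printed on the assumption that a BLOB value would be loaded from a hexadecimal text form, which should be checked against the MonetDB documentation.

```python
def pack_bits(bits):
    """Pack a '0'/'1' string into bytes, most significant bit first.

    The string is right-padded with '0' to a multiple of 8 bits.
    """
    if not bits:
        return b""
    padded = bits.ljust((len(bits) + 7) // 8 * 8, "0")
    return int(padded, 2).to_bytes(len(padded) // 8, "big")

def unpack_bits(data, nbits):
    """Inverse of pack_bits: recover the first `nbits` bits as a string."""
    as_text = format(int.from_bytes(data, "big"), "0{}b".format(len(data) * 8))
    return as_text[:nbits]

bits = "1010000000000110100101110101010"   # the example string from this thread
packed = pack_bits(bits)
print(len(bits), "characters as text,", len(packed), "bytes packed")
print(packed.hex())
```

Because of the padding, the original bit count has to be stored or known separately, which is why `unpack_bits` takes `nbits`.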
In either case, you'd have to implement your own bit-wise bit-string and() operation as well as a count_set_bits() operation as UDFs in C or Python, as Roberto suggested.
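
The core logic of these two operations could look like the following Python sketch, which works on bit-strings stored as '0'/'1' text. The function name `bitstring_and` and the sample inputs are illustrative assumptions, and the step of registering the functions as MonetDB UDFs is not shown.

```python
def bitstring_and(a, b):
    """Bit-wise AND of two equal-length '0'/'1' strings."""
    if len(a) != len(b):
        raise ValueError("bit-strings must have the same length")
    width = len(a)
    if width == 0:
        return ""
    # Python integers are arbitrary precision, so 10,000 bits pose no problem.
    return format(int(a, 2) & int(b, 2), "0{}b".format(width))

def count_set_bits(bits):
    """Number of '1' characters in a '0'/'1' string."""
    return bits.count("1")

a = "1100" * 2500   # hypothetical 10,000-bit inputs
b = "1010" * 2500
both = bitstring_and(a, b)
print(both[:8])              # 10001000
print(count_set_bits(both))  # 2500
```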
Alternatively, you could also consider implementing your own variable- or fixed-length bit-string type, but then you're largely on your own and the main (if not only) documentation for this is the code itself; cf., https://www.monetdb.org/Documentation/Manuals/SQLreference/Userdefinedtypes https://www.monetdb.org/Documentation/Manuals/MonetDB/MAL/Types
Finally, you might consider representing your 10,000 bit bit-string as 159 64-bit BIGINT columns or 79 128-bit HUGEINT columns (if supported on your system; cf. https://www.monetdb.org/Documentation/Manuals/SQLreference/BuiltinTypes); note though that each column could effectively only hold 63 (127) bits instead of 64 (128) as MonetDB (or SQL in general) only supports signed types and MonetDB uses one value of the range (the smallest representable value) as NULL value; thus, bitwise and() (bit_and()) would work "unexpectedly" if the highest bit is set. While this would give you bit-wise and() (called bit_and()) "for free" (at least per column), you'd need to design your queries to combine all columns, and still implement a UDF to count the bits set.
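
As a rough illustration of this multi-column layout, the Python sketch below splits a bit-string into pieces of at most 63 bits, so each piece is a non-negative value that fits a signed 64-bit column, and then combines the pieces column by column. The function names and sample inputs are assumptions made for the example; it models the arithmetic only, not the SQL.

```python
import math

BITS_PER_BIGINT = 63  # keep the sign bit clear so every value is non-negative

def to_bigint_columns(bits, width=BITS_PER_BIGINT):
    """Split a '0'/'1' string into integers of at most `width` bits each."""
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]

def and_columns(cols_a, cols_b):
    """Column-wise AND, mirroring one bit_and() per column pair in SQL."""
    return [x & y for x, y in zip(cols_a, cols_b)]

def count_set_bits(cols):
    """Total number of set bits across all columns."""
    return sum(bin(value).count("1") for value in cols)

a = to_bigint_columns("1100" * 2500)   # hypothetical 10,000-bit inputs
b = to_bigint_columns("1010" * 2500)
print(len(a))                               # 159
print(math.ceil(10000 / 127))               # 79
print(max(a) < 2**63)                       # True
print(count_set_bits(and_columns(a, b)))    # 2500
```

The column counts follow from ceil(10000 / 63) = 159 and ceil(10000 / 127) = 79.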
Hope this helps you further.
Best, Stefan
----- On Mar 10, 2016, at 11:31 PM, Roberto Cornacchia roberto.cornacchia@gmail.com wrote:
It can't indeed. You can forget about conversion to integers. Your only option is to keep them as strings and develop special bitwise UDFs that work directly on strings. The starting points for developing your UDFs are still the same.
On 10 Mar 2016 22:53, "Shmagi Kavtaradze" < kavtaradze.s@gmail.com > wrote:
Thanks for the advice, I will try to do it. Sorry in advance, maybe I am missing something simple, but if I have an int made of 10,000 1s and 0s, how can it fit into MonetDB, if BIGINT is " BIGINT 64 bit signed integer between -9223372036854775807 and 9223372036854775807 "?
On Thu, Mar 10, 2016 at 10:40 PM, Roberto Cornacchia < roberto.cornacchia@gmail.com > wrote:
If you really want to load strings into MonetDB and convert them to integers on the fly in your query before doing your bitwise operations, you can have fun with implementing your own function for this (both ways, from string2int and int2string).
One way is to implement it in C, which is trivial, and then make it become a UDF in MonetDB (less trivial but instructive):
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/UserDefinedFuncti...
Another way is to enable the Python integration ( https://www.monetdb.org/blog/embedded-pythonnumpy-monetdb ) and then write your UDF in Python without any need to recompile. The UDF in Python would be a one-liner: int('1001001',2) Good luck.
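
A small sketch of the two conversions in plain Python, using the names string2int and int2string from the message above. The `width` parameter is an assumption added for the example, since an integer does not remember leading zeros. How these would be wrapped as MonetDB UDFs is not shown.

```python
def string2int(bits):
    """Parse a '0'/'1' string as a base-2 integer."""
    return int(bits, 2)

def int2string(value, width):
    """Format a non-negative integer as a zero-padded '0'/'1' string."""
    return format(value, "0{}b".format(width))

print(string2int("1001001"))        # 73
print(int2string(73, 7))            # 1001001
print(int2string(73 & 9, 7))        # 0001001
```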
On 10 March 2016 at 22:29, Sjoerd Mullender < sjoerd@monetdb.org > wrote:
Text is about the worst type you could have chosen. Far better is to choose one of the integer types: tinyint, smallint, integer, bigint, depending on the maximum number of bits that you have. If you have a CSV file and the column contains just sequences of 0 and 1, you will need to convert those numbers to e.g. hexadecimal notation (i.e. something like 0x42ab). Once you have loaded the data as integers, you can use the bitwise operators. & is bitwise AND, | is bitwise OR.
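
For the conversion step, a hedged Python sketch of rewriting a sequence of 0s and 1s in hexadecimal notation before loading. The helper name `bits_to_hex` is made up for the example, and whether the loader accepts this notation is taken from the message above rather than verified here.

```python
def bits_to_hex(bits):
    """Rewrite a '0'/'1' string in 0x... hexadecimal notation."""
    return hex(int(bits, 2))

print(bits_to_hex("0100001010101011"))  # 0x42ab
```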
On 03/10/2016 10:01 PM, Shmagi Kavtaradze wrote:
I am new to MonetDB. I mainly use PostgreSQL, but want to check MonetDB performance. In Postgres I have a column of type bit() filled with 0s and 1s. I then compare each row to all other rows with a bitwise AND on that column. MonetDB does not have a bit() type, so I used text. Any ideas how to do a bitwise AND in MonetDB, and what type of column should I use for this? The query I tried:
select a.sentenecid, b.sentenecid, a.sentence AND b.sentence from test a, test b;
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |