Dear MonetDB support,

My name is Jiri Nadvornik and I work for the Czech astronomical institute (http://www.asu.cas.cz/en/about/about-the-institute).

For more than a year I have been trying to generate identifiers for astronomical objects, used when creating light curves for these objects. In general, I'm working with spherical coordinates, and based on Euclidean parameters (and possibly other parameters, like magnitude) I am clustering these objects to assign multiple observations to one real object.

The bottleneck when doing this in a relational database (like PostgreSQL) is not finding the neighbors as such (~ ms in a loop), but the inefficiency of the disk I/O involved. So after a little analysis of what could be done, I am quite interested in storing the data (mainly spherical coordinates) in an array DB, physically partitioning regions of the sky across multiple disks, and then running the clustering on these partitions in parallel.

My question no. 1 is whether MonetDB has some kind of support for spherical indexing (PgSphere, Q3C, HEALPix, …) to ingest and correctly distribute the astronomical data. I need to control how the data are distributed to physical partitions.

If question no. 1 can be answered as true, then question no. 2 is: what is the easiest way to install MonetDB, ingest my data, and test it?

P.S.: Our processed dataset now contains approximately 3e8 observations (entries for clustering), which resolve to about 6e6 real objects. We expect that number to double when we process all of our data.

Thank you for your answer.

Kind regards,
Jiri Nadvornik
Astronomical Institute of the Academy of Sciences of the Czech Republic
nadvornik.ji@gmail.com
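The partition-then-cluster idea in the message above can be sketched in a few lines. This is only an illustration and not something MonetDB provides: plain declination stripes stand in for a real spherical index such as HEALPix or Q3C, and the function names, the 1-degree zone height, and the tuple layout `(ra, dec)` are all assumptions made for the example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two (RA, Dec) points
    given in degrees, computed with the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def zone_id(dec, zone_height_deg=1.0):
    """Map a declination in [-90, 90] to an integer stripe index.
    Hypothetical partition key; dec = +90 falls into the last stripe."""
    n_zones = math.ceil(180.0 / zone_height_deg)
    return min(int((dec + 90.0) // zone_height_deg), n_zones - 1)


def partition_by_zone(observations, zone_height_deg=1.0):
    """Group (ra, dec) tuples by declination stripe."""
    zones = {}
    for obs in observations:
        zones.setdefault(zone_id(obs[1], zone_height_deg), []).append(obs)
    return zones
```

Each stripe could then be clustered independently. Observations closer to a stripe edge than the matching radius would also have to be compared against the neighboring stripe, which this sketch does not handle.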
Dear Jiri,

thank you very much for your interest in MonetDB!

Concerning your question 1, the simple and short answer is no. Nevertheless, with a "mere" 6 million objects (though I have no idea how large each object is), you are welcome to try out MonetDB without any partitioning on a single, reasonably sized machine.

As for installing MonetDB and (bulk) loading data, please see the documentation on the MonetDB website at http://www.monetdb.org/, in particular
https://www.monetdb.org/Documentation/Guide/Installation
and
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData

Please don't hesitate to ask in case you have further questions.

Best,
Stefan
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA)   |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam (NL)  |
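Whether a single machine is enough depends on how large each entry is, which the thread leaves open. A rough back-of-the-envelope estimate for the coordinate columns alone might look like the sketch below. It assumes two coordinates per observation stored as uncompressed 8-byte doubles, and it assumes the expected doubling applies to the observation count; it ignores every other column and any index, so it is a lower bound rather than a real sizing.

```python
def column_bytes(n_rows, bytes_per_value=8, n_columns=2):
    """Raw size in bytes of n_columns fixed-width columns with n_rows rows.
    The 8-byte width and two columns (RA, Dec) are assumptions."""
    return n_rows * bytes_per_value * n_columns


observations = 3 * 10**8  # "3e8 observations" from the original question

coords_now = column_bytes(observations)
coords_doubled = column_bytes(2 * observations)

print(f"{coords_now / 2**30:.2f} GiB now, "
      f"{coords_doubled / 2**30:.2f} GiB after doubling")
```

Under these assumptions the coordinates take roughly 4.5 GiB today and about 9 GiB after doubling.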
Hello!

I'm running Oct2014-SP1 on a 64-bit Fedora 20 system and encountering another segmentation violation in mserver5. This time it happens quite often, making it nearly impossible to use the database. Before that started happening I also encountered these errors:

    GDK reported error.
    BATproject: does not match always

when attempting a COPY INTO operation.

One slightly suspicious factor is that the table being added to currently contains 3,995,238,028 records, just under the maximum for a 32-bit unsigned integer. Is there, in fact, a limit on how many records can be stored in a table? Or in any case, is there anything I can do about the errors?

At this point any significant query on this table causes mserver5 to crash. The segmentation violation gets reported in the merovingian log, but all it says is

    database 'logs' (18717) was killed by signal SIGSEGV

Before that there is nothing suspicious.

Oh, one other strange thing is that, even when it is running, I can no longer query the status of the database with

    monetdb status

The command generates an error:

    monetdb: cannot find a control socket, use -h and/or -p

but this is a local database and the Unix domain sockets are there:

    /tmp/.s.monetdb.50000
    /tmp/.s.merovingian.50000
    /opt/farm/logs/.mapi.sock

It seems like things have gone very wrong but I am not sure what to check. Thanks for any pointers you can give!

Tim
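The arithmetic behind the "just under the maximum" observation can be checked directly. The sketch below only compares the reported row count with the two 32-bit integer limits; it does not establish that MonetDB enforces either limit, and the variable names are chosen for illustration.

```python
rows = 3_995_238_028  # record count reported in the message above

int32_max = 2**31 - 1   # largest signed 32-bit value
uint32_max = 2**32 - 1  # largest unsigned 32-bit value

exceeds_signed = rows > int32_max    # already past the signed limit?
headroom_unsigned = uint32_max - rows  # rows left before the unsigned limit
fraction_used = rows / uint32_max

print(f"exceeds signed 32-bit max: {exceeds_signed}")
print(f"rows left below unsigned 32-bit max: {headroom_unsigned:,}")
print(f"fraction of unsigned range used: {fraction_used:.1%}")
```

The count is already well beyond the signed 32-bit maximum, so if a 32-bit counter were involved at all it would have to be an unsigned one, with roughly 300 million rows of headroom remaining.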
Participants (3): Jiří Nádvorník, Stefan Manegold, Tim Burress