Re: [Monetdb-developers] RDF data management in MonetDB/SQL
Dear Lefteris,
Thank you for your reply. Your explanation helps me a lot. In addition, could you give me some guidelines on how to make the SQL/SPARQL query optimizer understand the six different indices of the triple table and use them accordingly? If I want to implement this feature myself, in which files should the code be added or modified?
BTW, will you implement MonetDB/SPARQL as a new supported front-end language in the future? Would you consider translating SPARQL to MAL directly, instead of going through SQL to MAL (although that way you would have to develop a SPARQL-specific optimizer instead of making use of the SQL optimizer)?
Thank you so much.
Best regards,
Xin Wang
From: Lefteris
Reply-To:
To: wangx
Subject: Re: [Monetdb-developers] RDF data management in MonetDB/SQL
Date: Fri, 16 Jul 2010 10:29:53 +0200
Hi,
Your understanding is in general correct. The availability of the six different indices of the triple table should be understood by the SQL/SPARQL query optimizer and used accordingly. However, the MonetDB/RDF module has not been announced yet, because it is not finished yet. You are using a piece of code which is experimental and unfinished. As such, the only way to test Monet and its capabilities on RDF data is to write the SQL query manually, in such a way that each join uses the triple table with the correct ordering.
In the future, of course, this will be done by the optimizer, and the user will only have to write simple SPARQL queries referring to the name of the graph instead of the underlying storage schema. Until then, I am afraid you will have to do it by hand. The good news is that if you are using RDF data and testing queries that have previously been published in papers as benchmarks, most likely someone else will already have done the translation to a correct SQL query (since most experiments on newly built engines that do not support SPARQL use this method).
Hope this helps you a bit,
lefteris
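As an illustration of writing the query by hand against the permuted tables, here is a minimal sketch. It uses Python's built-in sqlite3 only as a stand-in for MonetDB/SQL so that the SQL can be run anywhere. The graph prefix `g1`, the column names `s`, `p`, `o`, and the integer ids are all assumptions made for this example; the actual schema produced by sql.rdfshred may differ.

```python
import sqlite3

# Hypothetical toy data: (subject, predicate, object), already mapped to
# integer ids (the real mapping table is not modelled here).
triples = [
    (1, 10, 100),  # alice  type   Person
    (2, 10, 100),  # bob    type   Person
    (3, 10, 101),  # carol  type   Robot
    (1, 11, 2),    # alice  knows  bob
    (2, 11, 3),    # bob    knows  carol
]

conn = sqlite3.connect(":memory:")  # stand-in engine, not MonetDB

# Two of the six permutations, each stored in its own sort order.
# Table and column names are assumed for illustration.
conn.execute("CREATE TABLE g1_pos (p INTEGER, o INTEGER, s INTEGER)")
conn.execute("CREATE TABLE g1_pso (p INTEGER, s INTEGER, o INTEGER)")
conn.executemany("INSERT INTO g1_pos VALUES (?, ?, ?)",
                 sorted((p, o, s) for s, p, o in triples))
conn.executemany("INSERT INTO g1_pso VALUES (?, ?, ?)",
                 sorted((p, s, o) for s, p, o in triples))

# Pattern: ?x type Person . ?x knows ?y
# The first pattern fixes p and o, so it reads the POS ordering.
# The second fixes p and joins on s, so it reads the PSO ordering.
query = """
SELECT t1.s AS x, t2.o AS y
FROM g1_pos AS t1
JOIN g1_pso AS t2 ON t2.s = t1.s
WHERE t1.p = 10 AND t1.o = 100
  AND t2.p = 11
ORDER BY x, y
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The point of the sketch is only the choice of table per triple pattern: the person writing the query picks the ordering whose leading columns match the constants and join keys, which is the work an optimizer would otherwise do.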
2010/7/15 wangx :
Hi MonetDB developers,
I have a question about RDF data management in MonetDB/SQL. The comment of sql.rdfshred says "shredding an RDF data file from location results in 7 new tables (6 permutations of SPO and a mapping) ... We can then query with SQL queries the RDF triple store by quering tables gid_spo, gid_pso etc., ...". In my opinion, if the spo table is considered the triples table, the other five tables (sop, pso, pos, osp, ops), not counting the mapping table, can be viewed as indexes of the triples table spo.
When I write SQL to query the shredded RDF data in the triples table, I have two options. The first is to use only the spo table and make self-joins. The second is to use all six tables to make joins. I noticed that the "MonetDB/SQL Reference Manual" says that "The heart is the MonetDB server, which comes with the following innovative features. ... Index selection, creation and maintenance is automatic". If I use the six tables (as indexes) explicitly to make joins, it seems that I am writing the query plan myself. However, I think this work should be done by the SQL optimizer using statistics from the system catalog.
I wonder whether these tables have already been specified as indexes in the internal code, or whether there is a way to specify this so that the optimizer can use them as indexes when generating query plans. I am not sure if my understanding is correct. I would appreciate any help from the developers. Thank you in advance.
Best regards,
Xin Wang
Dear Xin,
Our plan for the future is to release a MonetDB/SPARQL front end. It
will understand SPARQL, translate it to a SPARQL algebra tree, apply
some SPARQL-specific optimizations, and then translate it to the
relational algebra tree that MonetDB/SQL understands, so that the
relational optimizations are applied before it is passed to MAL. So it
is a translation that jumps over SQL but does not land in MAL; it
lands just before it :)
Implementing these features yourself will be a hard job, but if you
want to, then the first place to check would be the directory of
the sql module: src/server/
There you will see some sql_* files. In the same way, some equivalent
rdf_* files (and functions) have to be written and then hooked up
with the relational algebra, which can be found in the rel_* files.
I think that maybe for your needs it is better to write an external
wrapper that produces the correct SQL statements given the SPARQL
query and the physical storage implemented in Monet.
regards,
lefteris
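One possible shape for such an external wrapper is sketched below. This is not MonetDB code and not a SPARQL parser: it assumes the query has already been reduced to a list of triple patterns, that constants have already been looked up as integer ids in the mapping table, and that the permuted tables are named `<graph>_spo`, `<graph>_pso`, and so on, with columns `s`, `p`, `o`. The helper names `is_var`, `choose_table`, and `bgp_to_sql` are hypothetical.

```python
def is_var(term):
    """A term is a variable if it is a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def choose_table(graph, pattern):
    """Pick the permutation whose leading columns are the constant positions."""
    bound = [c for c, term in zip("spo", pattern) if not is_var(term)]
    free = [c for c, term in zip("spo", pattern) if is_var(term)]
    return f"{graph}_{''.join(bound + free)}"


def bgp_to_sql(graph, patterns):
    """Translate a list of (s, p, o) triple patterns into one SQL join."""
    froms, conds, first_seen = [], [], {}
    for i, pattern in enumerate(patterns):
        alias = f"t{i}"
        froms.append(f"{choose_table(graph, pattern)} AS {alias}")
        for col, term in zip("spo", pattern):
            ref = f"{alias}.{col}"
            if not is_var(term):
                # Constant: assumed to be an integer id from the mapping table.
                conds.append(f"{ref} = {int(term)}")
            elif term in first_seen:
                # Variable seen before: emit an equi-join condition.
                conds.append(f"{ref} = {first_seen[term]}")
            else:
                first_seen[term] = ref
    # Project every variable; fall back to a literal if there are none.
    select = ", ".join(f"{ref} AS {var[1:]}"
                       for var, ref in first_seen.items()) or "1"
    sql = f"SELECT {select} FROM {', '.join(froms)}"
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return sql


# Pattern: ?x <10> <100> . ?x <11> ?y   (ids are made up for the example)
patterns = [("?x", 10, 100), ("?x", 11, "?y")]
sql = bgp_to_sql("g1", patterns)
print(sql)
```

The table choice here is deliberately naive: it only puts constant positions first and ignores join order and statistics, so it should be read as a starting point rather than as a substitute for an optimizer.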