[Monetdb-developers] RDF data management in MonetDB/SQL
Hi MonetDB developers,
I have a question about RDF data management in MonetDB/SQL. The comment of sql.rdfshred says "shredding an RDF data file from location results in 7 new tables (6 permutations of SPO and a mapping) ... We can then query with SQL queries the RDF triple store by quering tables gid_spo, gid_pso etc., ...". In my opinion, if the spo table is considered the triples table, the other 5 tables (sop, pso, pos, osp, ops), leaving aside the mapping table, can be viewed as indexes of the triples table spo.
When I write SQL to query the shredded RDF data in the triples table, I have two options. The first is to use only the spo table and make self-joins. The second is to use all 6 tables in the joins.
I noticed that the "MonetDB/SQL Reference Manual" says: "The heart is the MonetDB server, which comes with the following innovative features. ... Index selection, creation and maintenance is automatic". If I explicitly use the 6 tables (as indexes) in the joins, I am in effect writing the query plan myself. However, I think this work should be done by the SQL optimizer using statistics from the system catalog. I wonder whether these tables have already been specified as indexes in the internal code, or whether there is a way to specify this so that the optimizer can use them as indexes when generating query plans.
I am not sure if my understanding is correct. I would appreciate any help from the developers. Thank you in advance.
Best regards, Xin Wang
Hi,
Your understanding is, in general, correct. The availability of the
six different indices of the triple table should be understood by the
SQL/SPARQL query optimizer and used accordingly. However, the
MonetDB/RDF module has not been announced yet, because it is not
finished. You are using code that is experimental and incomplete. As
such, the only way to test MonetDB and its capabilities on RDF data is
to write the SQL query by hand, so that each join uses the ordering of
the triple table that suits it.
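As a rough illustration of what writing the query by hand means, here is a minimal sketch of the two query shapes from the original question. It uses Python's sqlite3 module as a stand-in engine, not MonetDB. The table names gid_spo, gid_pso and gid_pos follow the rdfshred comment; the column names s, p, o, the string values and the sample triples are assumptions made for this example, and the real shredder output (ids plus a mapping table) is not reproduced.

```python
import sqlite3

# Hypothetical layout: every permutation table holds the same triples with
# columns s, p, o; only the sort order differs. Plain strings replace the
# ids and mapping table that real shredded data would use.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]


def build_store(conn):
    """Create gid_spo plus sorted copies standing in for two permutations."""
    conn.execute("CREATE TABLE gid_spo (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO gid_spo VALUES (?, ?, ?)", sorted(TRIPLES))
    conn.execute(
        "CREATE TABLE gid_pso AS SELECT s, p, o FROM gid_spo ORDER BY p, s, o"
    )
    conn.execute(
        "CREATE TABLE gid_pos AS SELECT s, p, o FROM gid_spo ORDER BY p, o, s"
    )


# Way 1: self-join on the spo table only.
SELF_JOIN = """
SELECT a.s, b.o
FROM gid_spo AS a JOIN gid_spo AS b ON a.o = b.s
WHERE a.p = 'knows' AND b.p = 'name'
ORDER BY a.s, b.o
"""

# Way 2: hand-picked permutations. Pattern a has p bound and is joined on its
# object, so it reads the (p, o, s) ordering; pattern b has p bound and is
# joined on its subject, so it reads the (p, s, o) ordering.
PERMUTED = """
SELECT a.s, b.o
FROM gid_pos AS a JOIN gid_pso AS b ON a.o = b.s
WHERE a.p = 'knows' AND b.p = 'name'
ORDER BY a.s, b.o
"""

conn = sqlite3.connect(":memory:")
build_store(conn)
rows_self = conn.execute(SELF_JOIN).fetchall()
rows_perm = conn.execute(PERMUTED).fetchall()
print(rows_self, rows_perm)
```

Both queries return the same rows; only the tables they read differ. SQLite gains nothing from the sorted copies, so the sketch shows the shape of the hand-written query rather than any performance effect.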
In the future, of course, this will be done by the optimizer, and the
user will only have to write simple SPARQL queries that refer to the
name of the graph instead of the underlying storage schema. Until
then, I am afraid you will have to do it by hand. The good news is that
if you are using RDF data and testing queries that have previously been
published in papers as benchmarks, most likely someone else has
already translated them into correct SQL queries (most experiments
on newly built engines that do not support SPARQL use this method).
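For simple queries, the hand translation from SPARQL triple patterns to SQL is fairly mechanical. The sketch below is one possible way to do it, not the method used by any particular benchmark or by MonetDB: the helper bgp_to_sql is hypothetical, the columns s, p, o are assumed, and it always reads a single table (gid_spo by default) without choosing among the permutations. It is run against sqlite3 purely to show that the generated SQL is valid.

```python
import sqlite3


def bgp_to_sql(patterns, table="gid_spo"):
    """Translate triple patterns into one SQL self-join query.

    Each pattern is an (s, p, o) tuple of strings. Terms starting with '?'
    are variables; anything else is a constant bound via a '?' placeholder.
    Returns (sql, params). Hypothetical helper, for illustration only.
    """
    first_seen = {}  # variable name -> "tN.col" of its first occurrence
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{col}"
            if not term.startswith("?"):
                where.append(f"{ref} = ?")
                params.append(term)
            elif term in first_seen:
                # A repeated variable becomes a join condition.
                where.append(f"{ref} = {first_seen[term]}")
            else:
                first_seen[term] = ref
    select = ", ".join(f"{ref} AS {var[1:]}" for var, ref in first_seen.items())
    tables = ", ".join(f"{table} AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select or '1'} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# Sample data (assumed): same shape as a plain triples table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gid_spo (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO gid_spo VALUES (?, ?, ?)",
    [
        ("alice", "knows", "bob"),
        ("bob", "knows", "carol"),
        ("alice", "name", "Alice"),
        ("bob", "name", "Bob"),
    ],
)

# Roughly: SELECT ?x ?y ?n WHERE { ?x knows ?y . ?y name ?n }
sql, params = bgp_to_sql([("?x", "knows", "?y"), ("?y", "name", "?n")])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Picking a suitable permutation table per pattern, as described above, would still be a manual step on top of this translation.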
Hope this helps you a bit,
lefteris
2010/7/15 wangx