Question with wildcarded expressions for LIKE clause
I am trying to do some constraints off the system tables to select a unique pattern of tables. We have tables like the following (I can't give the actual table names): * Config (a single record exists here, no merge tables use it) * Data_0000 (a small number of records exist in each of these) * Data_0001 * Data_0002 * Data_0003 * Data (this is a merge table over the last 4) * Test_table (a small number of records exist here, no merge tables use it) * Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks' worth of data) * Main__yyyy_mm_dd__hh__0001 * Main__yyyy_mm_dd__hh__0002 * Main__yyyy_mm_dd__hh__0003 * Main__yyyy_mm_dd__hh (this is a merge table over the last 4) If I do a select * from sys.tables where schema_id in (select id from schemas where name = 'frodo') and type in (select table_type_id from table_types where table_type_name = 'TABLE') I get the right thing - all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don't (two separate queries for two separate purposes). Consider frodo the name of our schema. Adding and name like 'data\_\_____' appears to work. This should give me the 4 data_* tables. Adding and name like '%\_\_____' gives me all non-merge tables, including config and test_table. I'm assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character. Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB? Thanks, Dave
On 25 Jul 2019, at 02:32, Gotwisner, Dave
wrote: I am trying to do some constraints off the system tables to select a unique pattern of tables.
We have tables like the following (I can’t give the actual table names): • Config (a single record exists here, no merge tables use it) • Data_0000 (a small number of records exist in each of these) • Data_0001 • Data_0002 • Data_0003 • Data (this is a merge table over the last 4) • Test_table (a small number of records exist here, no merge tables use it) • Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks’ worth of data) • Main__yyyy_mm_dd__hh__0001 • Main__yyyy_mm_dd__hh__0002 • Main__yyyy_mm_dd__hh__0003 • Main__yyyy_mm_dd__hh (this is a merge table over the last 4)
If I do a select * from sys.tables where schema_id in (select id from schemas where name = ‘frodo’) and type in (select table_type_id from table_types where table_type_name = ‘TABLE’) I get the right thing – all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don’t (two separate queries for two separate purposes). Consider frodo the name of our schema.
Adding and name like ‘data\_\_____’ appears to work. This should give me the 4 data_* tables.
Adding and name like ‘%\_\_____’ gives me all non-merge tables, including config and test_table. I’m assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character.
Hai Dave, your understanding of the definitions of the wild card characters is correct. Jennie
Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB?
Thanks, Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Jenny,
Now that I have confirmed the syntax, I believe there might be a problem with like and not like as used below.
Here is the test schema:
set schema foof;
create table config (a integer);
create table data_0000 (a integer);
create table data_0001 (a integer);
create table data_0002 (a integer);
create table data_0003 (a integer);
create merge table data (a integer);
create table test_table (a integer);
create table main__yyyy_mm_dd__hh__0000 (a integer);
create table main__yyyy_mm_dd__hh__0001 (a integer);
create table main__yyyy_mm_dd__hh__0002 (a integer);
create table main__yyyy_mm_dd__hh__0003 (a integer);
create merge table main__yyyy_mm_dd__hh (a integer);
set schema sys;
select * from schemas where name = 'foof';
+--------------+------+----------------------+----------+-----------+
| id | name | authorization | owner | system |
+========+======+===========+=======+=======+
| 1050825 | foof | 3 | 3 | false |
+-------------+----------+------------------+-----------+-----------+
Select * from tables where schema_id = 1050825;
Shows all the 12 tables I created above
Select * from tables where schema_id = 1050825 and name like '%\_\_____';
Lists all 11 tables. My belief is this should only list the 4 main__ tables and not the main__ merge table. The only table not listed here is 'data'.
Doing the same query without the back-slashes gives the same result.
Doing the complement (not like, instead of like), gives the data table.
For the like expression, I would have expected only 4 tables. For the not like, I would have expected 8 tables.
Is this a bug? Confirming from your response, \ escapes _ to change it from any single character to only the character _?
Dave
-----Original Message-----
From: users-list
On 25 Jul 2019, at 02:32, Gotwisner, Dave
wrote: I am trying to do some constraints off the system tables to select a unique pattern of tables.
We have tables like the following (I can’t give the actual table names): • Config (a single record exists here, no merge tables use it) • Data_0000 (a small number of records exist in each of these) • Data_0001 • Data_0002 • Data_0003 • Data (this is a merge table over the last 4) • Test_table (a small number of records exist here, no merge tables use it) • Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks’ worth of data) • Main__yyyy_mm_dd__hh__0001 • Main__yyyy_mm_dd__hh__0002 • Main__yyyy_mm_dd__hh__0003 • Main__yyyy_mm_dd__hh (this is a merge table over the last 4)
If I do a select * from sys.tables where schema_id in (select id from schemas where name = ‘frodo’) and type in (select table_type_id from table_types where table_type_name = ‘TABLE’) I get the right thing – all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don’t (two separate queries for two separate purposes). Consider frodo the name of our schema.
Adding and name like ‘data\_\_____’ appears to work. This should give me the 4 data_* tables.
Adding and name like ‘%\_\_____’ gives me all non-merge tables, including config and test_table. I’m assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character.
Hai Dave, your understanding of the definitions of the wild card characters is correct. Jennie
Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB?
Thanks, Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e=
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e=
The correct syntax is: select ... from ... where ... and name like '%\\_\\___' escape '\\'; Note two things here: the backslashes have been doubled and there is an escape clause. The backslashes need to be doubled because the parser, when it sees a backslash in a string, handles the next character specially. In this way you can get e.g. newlines and tabs into your strings (use '\n' and '\t' respectively). But because the first backslash does something, you need a second one to get one through. The escape clause is needed to tell the like operator that the backslash is special to it. This also suggests an alternative (that actually also works): select ... from ... where ... and name like '%+_+___' escape '+'; By not using a backslash here, we prevent the two levels of "specialness" for the backslash. Note that according to the SQL standard, the behavior described above with respect to parsing of backslashes is incorrect. A backslash is nothing special in SQL. I would therefore like to move away from that syntax so that your original query would work. But we're not there. As a transitional step I have introduced (based on prior work in PostgreSQL) a E prefix to strings. With the E prefix, strings are subject to backslash interpretation and this will continue to be the case, even when non-prefixed strings are no longer subject to this. So what works now and will continue to work is: select ... from ... where ... and name like E'%\\_\\___' escape E'\\'; On 30/07/2019 22.41, Gotwisner, Dave wrote:
Jenny,
Now that I have confirmed the syntax, I believe there might be a problem with like and not like as used below.
Here is the test schema:
set schema foof; create table config (a integer); create table data_0000 (a integer); create table data_0001 (a integer); create table data_0002 (a integer); create table data_0003 (a integer); create merge table data (a integer); create table test_table (a integer); create table main__yyyy_mm_dd__hh__0000 (a integer); create table main__yyyy_mm_dd__hh__0001 (a integer); create table main__yyyy_mm_dd__hh__0002 (a integer); create table main__yyyy_mm_dd__hh__0003 (a integer); create merge table main__yyyy_mm_dd__hh (a integer); set schema sys; select * from schemas where name = 'foof';
+--------------+------+----------------------+----------+-----------+ | id | name | authorization | owner | system | +========+======+===========+=======+=======+ | 1050825 | foof | 3 | 3 | false | +-------------+----------+------------------+-----------+-----------+
Select * from tables where schema_id = 1050825;
Shows all the 12 tables I created above
Select * from tables where schema_id = 1050825 and name like '%\_\_____';
Lists all 11 tables. My belief is this should only list the 4 main__ tables and not the main__ merge table. The only table not listed here is 'data'.
Doing the same query without the back-slashes gives the same result.
Doing the complement (not like, instead of like), gives the data table.
For the like expression, I would have expected only 4 tables. For the not like, I would have expected 8 tables.
Is this a bug? Confirming from your response, \ escapes _ to change it from any single character to only the character _?
Dave
-----Original Message----- From: users-list
On Behalf Of Ying Zhang Sent: Tuesday, July 30, 2019 4:55 AM To: Communication channel for MonetDB users Subject: Re: Question with wildcarded expressions for LIKE clause This message originated outside of NETSCOUT. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On 25 Jul 2019, at 02:32, Gotwisner, Dave
wrote: I am trying to do some constraints off the system tables to select a unique pattern of tables.
We have tables like the following (I can’t give the actual table names): • Config (a single record exists here, no merge tables use it) • Data_0000 (a small number of records exist in each of these) • Data_0001 • Data_0002 • Data_0003 • Data (this is a merge table over the last 4) • Test_table (a small number of records exist here, no merge tables use it) • Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks’ worth of data) • Main__yyyy_mm_dd__hh__0001 • Main__yyyy_mm_dd__hh__0002 • Main__yyyy_mm_dd__hh__0003 • Main__yyyy_mm_dd__hh (this is a merge table over the last 4)
If I do a select * from sys.tables where schema_id in (select id from schemas where name = ‘frodo’) and type in (select table_type_id from table_types where table_type_name = ‘TABLE’) I get the right thing – all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don’t (two separate queries for two separate purposes). Consider frodo the name of our schema.
Adding and name like ‘data\_\_____’ appears to work. This should give me the 4 data_* tables.
Adding and name like ‘%\_\_____’ gives me all non-merge tables, including config and test_table. I’m assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character.
Hai Dave, your understanding of the definitions of the wild card characters is correct.
Jennie
Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB?
Thanks, Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e=
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e= _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
Again, select * from tables where schema_id = 1050862; yields:
+---------+----------------------------+-----------+-------+------+--------+---------------+--------+-----------+
| id | name | schema_id | query | type | system | commit_action | access | temporary |
+=========+============================+===========+=======+======+========+===============+========+===========+
| 1050865 | config | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050868 | data__0000 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050871 | data__0001 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050874 | data__0002 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050877 | data__0003 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050880 | data | 1050862 | null | 3 | false | 0 | 0 | 0 |
| 1050883 | test_table | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050886 | main__yyyy_mm_dd__hh__0000 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050889 | main__yyyy_mm_dd__hh__0001 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050892 | main__yyyy_mm_dd__hh__0002 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050895 | main__yyyy_mm_dd__hh__0003 | 1050862 | null | 0 | false | 0 | 0 | 0 |
| 1050898 | main__yyyy_mm_dd__hh | 1050862 | null | 3 | false | 0 | 0 | 0 |
+---------+----------------------------+-----------+-------+------+--------+---------------+--------+-----------+
12 tuples
(I needed to recreate the schema and tablespace as below, which is why schema_id has a different value).
Doing the following give me no records in the response:
sql>select * from tables where schema_id = 1050862 and name like '%+_+____' escape '+';
+----+------+-----------+-------+------+--------+---------------+--------+-----------+
| id | name | schema_id | query | type | system | commit_action | access | temporary |
+====+======+===========+=======+======+========+===============+========+===========+
+----+------+-----------+-------+------+--------+---------------+--------+-----------+
0 tuples
sql>select * from tables where schema_id = 1050862 and name like '%\\_\\_____';
+----+------+-----------+-------+------+--------+---------------+--------+-----------+
| id | name | schema_id | query | type | system | commit_action | access | temporary |
+====+======+===========+=======+======+========+===============+========+===========+
+----+------+-----------+-------+------+--------+---------------+--------+-----------+
0 tuples
Note, I would expect the data__#### and main__yyyy_mm_dd__hh__#### queries to match these patterns.
Using the E escape form gives me the 8 expected values. Thanks.
Dave
-----Original Message-----
From: users-list
Jenny,
Now that I have confirmed the syntax, I believe there might be a problem with like and not like as used below.
Here is the test schema:
set schema foof; create table config (a integer); create table data_0000 (a integer); create table data_0001 (a integer); create table data_0002 (a integer); create table data_0003 (a integer); create merge table data (a integer); create table test_table (a integer); create table main__yyyy_mm_dd__hh__0000 (a integer); create table main__yyyy_mm_dd__hh__0001 (a integer); create table main__yyyy_mm_dd__hh__0002 (a integer); create table main__yyyy_mm_dd__hh__0003 (a integer); create merge table main__yyyy_mm_dd__hh (a integer); set schema sys; select * from schemas where name = 'foof';
+--------------+------+----------------------+----------+-----------+ | id | name | authorization | owner | system | +========+======+===========+=======+=======+ | 1050825 | foof | 3 | 3 | false | +-------------+----------+------------------+-----------+-----------+
Select * from tables where schema_id = 1050825;
Shows all the 12 tables I created above
Select * from tables where schema_id = 1050825 and name like '%\_\_____';
Lists all 11 tables. My belief is this should only list the 4 main__ tables and not the main__ merge table. The only table not listed here is 'data'.
Doing the same query without the back-slashes gives the same result.
Doing the complement (not like, instead of like), gives the data table.
For the like expression, I would have expected only 4 tables. For the not like, I would have expected 8 tables.
Is this a bug? Confirming from your response, \ escapes _ to change it from any single character to only the character _?
Dave
-----Original Message----- From: users-list
On Behalf Of Ying Zhang Sent: Tuesday, July 30, 2019 4:55 AM To: Communication channel for MonetDB users Subject: Re: Question with wildcarded expressions for LIKE clause This message originated outside of NETSCOUT. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On 25 Jul 2019, at 02:32, Gotwisner, Dave
wrote: I am trying to do some constraints off the system tables to select a unique pattern of tables.
We have tables like the following (I can’t give the actual table names): • Config (a single record exists here, no merge tables use it) • Data_0000 (a small number of records exist in each of these) • Data_0001 • Data_0002 • Data_0003 • Data (this is a merge table over the last 4) • Test_table (a small number of records exist here, no merge tables use it) • Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks’ worth of data) • Main__yyyy_mm_dd__hh__0001 • Main__yyyy_mm_dd__hh__0002 • Main__yyyy_mm_dd__hh__0003 • Main__yyyy_mm_dd__hh (this is a merge table over the last 4)
If I do a select * from sys.tables where schema_id in (select id from schemas where name = ‘frodo’) and type in (select table_type_id from table_types where table_type_name = ‘TABLE’) I get the right thing – all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don’t (two separate queries for two separate purposes). Consider frodo the name of our schema.
Adding and name like ‘data\_\_____’ appears to work. This should give me the 4 data_* tables.
Adding and name like ‘%\_\_____’ gives me all non-merge tables, including config and test_table. I’m assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character.
Hai Dave, your understanding of the definitions of the wild card characters is correct.
Jennie
Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB?
Thanks, Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_ mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41 iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe 40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e=
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_m ailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iO HYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40j oYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e= _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
On 01/08/2019 00.14, Gotwisner, Dave wrote:
Again, select * from tables where schema_id = 1050862; yields: +---------+----------------------------+-----------+-------+------+--------+---------------+--------+-----------+ | id | name | schema_id | query | type | system | commit_action | access | temporary | +=========+============================+===========+=======+======+========+===============+========+===========+ | 1050865 | config | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050868 | data__0000 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050871 | data__0001 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050874 | data__0002 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050877 | data__0003 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050880 | data | 1050862 | null | 3 | false | 0 | 0 | 0 | | 1050883 | test_table | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050886 | main__yyyy_mm_dd__hh__0000 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050889 | main__yyyy_mm_dd__hh__0001 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050892 | main__yyyy_mm_dd__hh__0002 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050895 | main__yyyy_mm_dd__hh__0003 | 1050862 | null | 0 | false | 0 | 0 | 0 | | 1050898 | main__yyyy_mm_dd__hh | 1050862 | null | 3 | false | 0 | 0 | 0 | +---------+----------------------------+-----------+-------+------+--------+---------------+--------+-----------+ 12 tuples
(I needed to recreate the schema and tablespace as below, which is why schema_id has a different value).
Doing the following give me no records in the response:
sql>select * from tables where schema_id = 1050862 and name like '%+_+____' escape '+'; +----+------+-----------+-------+------+--------+---------------+--------+-----------+ | id | name | schema_id | query | type | system | commit_action | access | temporary | +====+======+===========+=======+======+========+===============+========+===========+ +----+------+-----------+-------+------+--------+---------------+--------+-----------+ 0 tuples
Count the underscores! After the + you have 4, the first of which is quoted by the +, so it represents an underscore in the data. Then there are 3 more, so it'll match three characters after the two (escaped) underscores. You don't have that in your data. You only have strings with 4 characters after a row of two underscores.
sql>select * from tables where schema_id = 1050862 and name like '%\\_\\_____'; +----+------+-----------+-------+------+--------+---------------+--------+-----------+ | id | name | schema_id | query | type | system | commit_action | access | temporary | +====+======+===========+=======+======+========+===============+========+===========+ +----+------+-----------+-------+------+--------+---------------+--------+-----------+ 0 tuples
You need the escape clause.
Note, I would expect the data__#### and main__yyyy_mm_dd__hh__#### queries to match these patterns.
Using the E escape form gives me the 8 expected values. Thanks.
Dave
-----Original Message----- From: users-list
On Behalf Of Sjoerd Mullender Sent: Wednesday, July 31, 2019 1:02 AM To: users-list@monetdb.org Subject: Re: Question with wildcarded expressions for LIKE clause This message originated outside of NETSCOUT. Do not click links or open attachments unless you recognize the sender and know the content is safe.
The correct syntax is:
select ... from ... where ... and name like '%\\_\\___' escape '\\';
Note two things here: the backslashes have been doubled and there is an escape clause.
The backslashes need to be doubled because the parser, when it sees a backslash in a string, handles the next character specially. In this way you can get e.g. newlines and tabs into your strings (use '\n' and '\t' respectively). But because the first backslash does something, you need a second one to get one through.
The escape clause is needed to tell the like operator that the backslash is special to it.
This also suggests an alternative (that actually also works):
select ... from ... where ... and name like '%+_+___' escape '+';
By not using a backslash here, we prevent the two levels of "specialness" for the backslash.
Note that according to the SQL standard, the behavior described above with respect to parsing of backslashes is incorrect. A backslash is nothing special in SQL. I would therefore like to move away from that syntax so that your original query would work. But we're not there.
As a transitional step I have introduced (based on prior work in PostgreSQL) a E prefix to strings. With the E prefix, strings are subject to backslash interpretation and this will continue to be the case, even when non-prefixed strings are no longer subject to this.
So what works now and will continue to work is:
select ... from ... where ... and name like E'%\\_\\___' escape E'\\';
On 30/07/2019 22.41, Gotwisner, Dave wrote:
Jenny,
Now that I have confirmed the syntax, I believe there might be a problem with like and not like as used below.
Here is the test schema:
set schema foof; create table config (a integer); create table data_0000 (a integer); create table data_0001 (a integer); create table data_0002 (a integer); create table data_0003 (a integer); create merge table data (a integer); create table test_table (a integer); create table main__yyyy_mm_dd__hh__0000 (a integer); create table main__yyyy_mm_dd__hh__0001 (a integer); create table main__yyyy_mm_dd__hh__0002 (a integer); create table main__yyyy_mm_dd__hh__0003 (a integer); create merge table main__yyyy_mm_dd__hh (a integer); set schema sys; select * from schemas where name = 'foof';
+--------------+------+----------------------+----------+-----------+ | id | name | authorization | owner | system | +========+======+===========+=======+=======+ | 1050825 | foof | 3 | 3 | false | +-------------+----------+------------------+-----------+-----------+
Select * from tables where schema_id = 1050825;
Shows all the 12 tables I created above
Select * from tables where schema_id = 1050825 and name like '%\_\_____';
Lists all 11 tables. My belief is this should only list the 4 main__ tables and not the main__ merge table. The only table not listed here is 'data'.
Doing the same query without the back-slashes gives the same result.
Doing the complement (not like, instead of like), gives the data table.
For the like expression, I would have expected only 4 tables. For the not like, I would have expected 8 tables.
Is this a bug? Confirming from your response, \ escapes _ to change it from any single character to only the character _?
Dave
-----Original Message----- From: users-list
On Behalf Of Ying Zhang Sent: Tuesday, July 30, 2019 4:55 AM To: Communication channel for MonetDB users Subject: Re: Question with wildcarded expressions for LIKE clause This message originated outside of NETSCOUT. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On 25 Jul 2019, at 02:32, Gotwisner, Dave
wrote: I am trying to do some constraints off the system tables to select a unique pattern of tables.
We have tables like the following (I can’t give the actual table names): • Config (a single record exists here, no merge tables use it) • Data_0000 (a small number of records exist in each of these) • Data_0001 • Data_0002 • Data_0003 • Data (this is a merge table over the last 4) • Test_table (a small number of records exist here, no merge tables use it) • Main__yyyy_mm_dd__hh__0000 (millions of records, yyyy, mm, dd, hh are all numbers, we have one set of tables per hour, and keep significantly over 1 weeks’ worth of data) • Main__yyyy_mm_dd__hh__0001 • Main__yyyy_mm_dd__hh__0002 • Main__yyyy_mm_dd__hh__0003 • Main__yyyy_mm_dd__hh (this is a merge table over the last 4)
If I do a select * from sys.tables where schema_id in (select id from schemas where name = ‘frodo’) and type in (select table_type_id from table_types where table_type_name = ‘TABLE’) I get the right thing – all non merge tables. I then want to further break it down into the tables that end in __[0-9][0-9][0-9][0-9] and those that don’t (two separate queries for two separate purposes). Consider frodo the name of our schema.
Adding and name like ‘data\_\_____’ appears to work. This should give me the 4 data_* tables.
Adding and name like ‘%\_\_____’ gives me all non-merge tables, including config and test_table. I’m assuming % is the wild card for 0 or more, and _ is the wild card for 1. This matches some documentation I have found. The \ are required because I need to explicitly look for 2 of the underscore character.
Hai Dave, your understanding of the definitions of the wild card characters is correct.
Jennie
Am I doing this correctly? Are these the right wildcarding characters? MS SQLServer gives a # as a match for a single digit, is there an equivalent? Or, is this just a bug in MonetDB?
Thanks, Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_ mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41 iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe 40joYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e=
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_m ailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iO HYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=qKnnHkmN-t1tWFOLvPwsqL3NDe40j oYp2MqEu57Lbho&s=jEq0840pRXkiOuawO9EM6YB00D2d3iUplyunk5iQRjY&e= _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
participants (3)
-
Gotwisner, Dave
-
Sjoerd Mullender
-
Ying Zhang