> Wouldn't it essentially need to do a full table scan to look for this substring?

Not a full table scan but a full column values scan. As the json values (internally stored as strings) are stored in a dictionary, duplicate column string values are stored only once in the dictionary.

The JSON data type is internally implemented a subclass of the STRING data type, so it will have the same performance as if you would store the JSON values as STRING (or VARCHAR or CLOB) column. However using JSON data type does provide you extras, such as check on JSON validity during insert/update and JSON specific functions.

For matching multiple tags, you potentially need to include multiple conditions such as

  tags LIKE '%foo%'   OR   tags LIKE '%bar%'

However if  bar  must occur after  foo  and both must occur then you can combine them into 1 condition:   tags LIKE '%foo%bar%'

The ILIKE operator is faster than the alternative: UPPER(tags) LIKE '%FOO%', so if you want to match case insensitive use ILIKE. This also applies to VARCHARs, CLOBs and TEXT columns.


If you encounter a performance problem and need to match some known keywords very often, you might do this work once and store its derived result in a dedicated column, for example:
ALTER TABLE my_table ADD COLUMN has_foo BOOLEAN NOT NULL DEFAULT FALSE;
UPDATE my_table SET has_foo = TRUE WHERE tags LIKE '%foo%';

ALTER TABLE my_table ADD COLUMN has_bar BOOLEAN NOT NULL DEFAULT FALSE;
UPDATE my_table SET has_bar = TRUE WHERE tags LIKE '%bar%';

SELECT count(*) FROM my_table WHERE has_foo OR has_bar;
-- many more queries which would need to do a match on foo or bar

This will run very fast.

Regards,
Martin

On 09-11-17 16:31, Brian Ploetz wrote:
Thanks Martin. What are the performance ramifications of using LIKE queries though? Wouldn't it essentially need to do a full table scan to look for this substring? And if I wanted to search for N tags at a time, it would be N full table scans?

On Thu, Nov 9, 2017 at 7:02 AM, martin van dinther <martin.van.dinther@monetdbsolutions.com> wrote:

Yes this is possible. You will need to use:  tags LIKE '%foo%' .  So include SQL % wildcards.

Note that LIKE matches case sensitive. To match case insensitive use ILIKE.


bash-4.4$ mclient
Welcome to mclient, the MonetDB/SQL interactive terminal (unreleased)
Database: MonetDB v11.27.10 (unreleased), 'demo'
Type \q to quit, \? for a list of available commands
auto commit mode: on

sql>CREATE TABLE my_table (tags json NOT NULL);
operation successful (11.805ms)

sql>INSERT INTO my_table VALUES ('["foo","bar","baz"]');
1 affected row (2.805ms)

sql>SELECT tags FROM my_table;
+---------------------+
| tags                |
+=====================+
| ["foo","bar","baz"] |
+---------------------+
1 tuple (1.944ms)

sql>SELECT tags FROM my_table WHERE tags LIKE '%foo%';
+---------------------+
| tags                |
+=====================+
| ["foo","bar","baz"] |
+---------------------+
1 tuple (2.271ms)

sql>SELECT count(*) FROM my_table WHERE tags LIKE '%foo%';
+------+
| L3   |
+======+
|    1 |
+------+
1 tuple (2.459ms)

sql>SELECT count(*) FROM my_table WHERE tags LIKE '%FOO%';
+------+
| L3   |
+======+
|    0 |
+------+
1 tuple (2.265ms)

sql>SELECT count(*) FROM my_table WHERE tags ILIKE '%FOO%';
+------+
| L3   |
+======+
|    1 |
+------+
1 tuple (2.437ms)

sql>\q


Regards,
Martin



On 06-11-17 17:19, Brian Ploetz wrote:
Hi,

Suppose I have a table with a "tags" column that is of the JSON datatype (https://www.monetdb.org/Documentation/Manuals/SQLreference/Types/JSON). The column will contain a JSON array of Strings. For example:

["foo","bar","baz"]

Is it possible to query for rows that contain a specific value in the tags array? For example, if I wanted to find all of the rows where tags contains "foo", my query would look something like:

select count(*) from my_table where tags ???? 'foo';


But it's not clear to me what ???? should be (or if it's even possible) based on the JSON documentation above.

Thanks in advance.
BP



_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list