
Adam, A solution in Python could use the following SQL snippets: -- example data drop table if exists d; create table d as select 'key1=value1, key2=value2, key3=value3' as s union all select 'key1=value4, key2=value5, key3=value6'; -- create utility function create or replace function pcre_replace(origin string, pat string, repl string, flags string) returns string external name pcre.replace; -- split - if you know the key and the number of columns select pcre_replace(s,'.*key1[=]\([a-zA-Z0-9-]*\),.*','\\1','') as value1, pcre_replace(s,'.*key2[=]\([a-zA-Z0-9-]*\),.*','\\1','') as value2, pcre_replace(s,'.*key3[=]\([a-zA-Z0-9-]*\).*','\\1','') as value3 from d; A more general solution in Python could be similar to the following R solution: ## use DBI and MonetDB.R packages library(DBI) library(MonetDB.R) ## create a connection con <- dbConnect(MonetDB.R::MonetDB.R(), dbname="testt") ## example data dbWriteTable(con,"d", data.frame(s=c('key1=value1, key2=value2, key3=value3', 'key1=value4, key2=value5, key3=value6'), stringsAsFactors=FALSE), overwrite=TRUE) ## utility function is.na(dbExecute(con, "create or replace function pcre_replace(origin string, pat string, repl string, flags string) returns string external name pcre.replace")) ## function to split a column and create a new table splitColumn <- function(con, tableName, columnName, newTableName) { ## read the first row d1 <- dbGetQuery(con, sprintf("select %s from %s limit 1", columnName, tableName)) ## get the number of splits n <- length(strsplit(d1$s[1],", ")[[1]]) ## get the key key <- gsub("^(.*)1=.*","\\1",d1$s[1]) ## create the regexs regex <- paste0(".*",key,1:n,"=\\([a-zA-Z0-9_-]*\\).*") ## create the new table dbExecute(con, sprintf("drop table if exists %s", newTableName)) stmt <- sprintf("create table %s as select %s from %s", newTableName, paste0(sprintf("pcre_replace(%s, '%s','\\\\1','') as value%i", columnName, regex, 1:n), collapse=", "), tableName) is.na(dbExecute(con, stmt)) } splitColumn(con, "d", "s", "d2") ## check the new table print(dbGetQuery(con, "select * from d2")) ## clean up dbDisconnect(con) I hope that this is helpful. -- Mark On 06/14/2018 04:41 PM, Doherty, Adam wrote:
Hi Stefan:
Thanks for the reply. I created a UDF which returns a table, however when I try to run the function I get the following error -
Error: Embedded Python is enabled but an error was thrown during initialization. SQLState: PY000 ErrorCode: 0
I have Python 2.7.5 installed on CentOS 7.4
Adam
-----Original Message----- From: users-list [mailto:users-list-bounces+adam.doherty=esso.ca@monetdb.org] On Behalf Of Stefan Manegold Sent: Thursday, June 14, 2018 02:39 To: Communication channel for MonetDB users
Subject: Re: pymonetdb or Python UDF Hi Adam,
if you know how to do it (efficiently) in Python, it's merely a matter of wrapping that Python code into a suitable Python UDF; cf., https://www.monetdb.org/blog/embedded-pythonnumpy-monetdb
Alternatively, you could consider using a (JIT-compiled) C(/C++) UDF; cf., https://www.monetdb.org/node/437
Best, Stefan
----- On Jun 13, 2018, at 6:31 PM, Doherty, Adam adam.doherty@esso.ca wrote:
Hello all:
This is more of curiosity but I have data in my events table currently trapped in a nvarchar field.
I am wondering if I can build a table and do execute a select into with a python UDF pulling data from the existing fields plus the values of my nvarchar field.
My string data has the following format –
“key1=value, key2=val, key3=value”…
I want to break out the data first by splitting on the , and then the =
Keys would be the columns in the table.
I can do this with a python script and pymonetdb but a UDF would be faster.
Adam Doherty
Process Control Systems Administrator
Kearl Technical, Imperial
505 Quarry Park Blvd SE
W3B.224
Calgary, AB T2C 5N1
adam.doherty@esso.ca P 587.476.4640 M 403.869.1636
imperialoil.ca | Twitter | YouTube
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list