Add note warning about NULL merge keys #196

mjalkio · 2017-08-18T04:11:32Z

As the description says, I just added a warning. I had a situation where one of my merge keys (I was using three) was allowed to be NULL, and the merge didn't work because in Redshift NULL = NULL evaluates to NULL rather than TRUE, so rows with a NULL key never match.

Unfortunately, I don't think there's a way to support NULL merge keys.
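
The comparison behaviour described above can be reproduced in a few lines. This is a minimal sketch that assumes SQLite (via Python's `sqlite3` module) as a stand-in for Redshift; both apply SQL three-valued logic to comparisons involving NULL. The table and column names are illustrative only.

```python
import sqlite3

# SQLite is only a stand-in for Redshift here (assumption); the NULL
# comparison semantics shown are standard SQL three-valued logic.
conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown), which Python sees as None.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]

conn.executescript("""
    CREATE TABLE tableA (colA1 INTEGER, colA2 INTEGER);
    CREATE TABLE tableB (colB1 INTEGER, colB2 INTEGER);
    INSERT INTO tableA VALUES (1, NULL);
    INSERT INTO tableB VALUES (1, NULL);
""")

# Matching on plain equality finds no row, even though both keys look identical.
matches = conn.execute("""
    SELECT COUNT(*)
    FROM tableA JOIN tableB
      ON tableA.colA1 = tableB.colB1 AND tableA.colA2 = tableB.colB2
""").fetchone()[0]

print(null_eq, matches)
```

Because the second key column is NULL on both sides, the equality predicate is never TRUE and the two rows are treated as different keys.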

hito4t · 2017-08-23T00:38:00Z

Thank you for the PR!

As you wrote, merge mode doesn't work for NULL keys.

> Unfortunately, I don't think there's a way to support NULL merge keys.

The following SQL condition will work for NULL keys.

  (tableA.colA1 = tableB.colB1 OR (tableA.colA1 IS NULL AND tableB.colB1 IS NULL))
  AND (tableA.colA2 = tableB.colB2 OR (tableA.colA2 IS NULL AND tableB.colB2 IS NULL))
  ...

Do you think it is useful to support NULL keys?
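
As a rough check of the NULL-safe condition above, the sketch below runs it as a join predicate. It assumes SQLite (through Python's `sqlite3`) in place of Redshift, and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for Redshift (assumption); sample data is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (colA1 INTEGER, colA2 INTEGER);
    CREATE TABLE tableB (colB1 INTEGER, colB2 INTEGER);
    INSERT INTO tableA VALUES (1, NULL), (2, 20), (3, NULL);
    INSERT INTO tableB VALUES (1, NULL), (2, 20), (4, NULL);
""")

# Each key pair matches if the values are equal OR both sides are NULL.
NULL_SAFE_JOIN = """
    SELECT tableA.colA1, tableA.colA2
    FROM tableA JOIN tableB
      ON (tableA.colA1 = tableB.colB1
          OR (tableA.colA1 IS NULL AND tableB.colB1 IS NULL))
     AND (tableA.colA2 = tableB.colB2
          OR (tableA.colA2 IS NULL AND tableB.colB2 IS NULL))
    ORDER BY tableA.colA1
"""
matched = conn.execute(NULL_SAFE_JOIN).fetchall()
print(matched)
```

The row `(1, NULL)` now matches its counterpart, while `(3, NULL)` and `(4, NULL)` still differ because their first key columns are not equal.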

mjalkio · 2017-08-27T19:33:13Z

Hmm the query currently winds up looking something like this:

INSERT INTO tableB (colb1, colb2)
UNION ALL (
  SELECT colA1, colA2
  FROM tableA
  WHERE (colA1, colA2) NOT IN (
    SELECT (colb1, colb2)
    FROM tableB
  )
)

So I'm not sure whether your technique could be applied to this style of query.

hito4t · 2017-09-05T01:41:50Z

Thank you for your reply.

The above SQL can be modified to support NULL keys as follows.

INSERT INTO tableB (colB1, colB2)
UNION ALL (
  SELECT colA1, colA2
  FROM tableA
  WHERE NOT EXISTS (
    SELECT * FROM tableB
    WHERE
      (colB1 = colA1 OR (colB1 IS NULL AND colA1 IS NULL)) AND
      (colB2 = colA2 OR (colB2 IS NULL AND colA2 IS NULL))
  )
)

However, this SQL may be slower than the original SQL.
So even if embulk-output-redshift supports NULL merge keys, the behavior should be optional
(that is, the plugin would generate different SQL depending on a config property such as null_merge_keys).
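
One way to picture the optional behaviour is a predicate builder that switches on a flag. The sketch below is hypothetical: `merge_condition`, `insert_missing_sql`, and `run` are names invented for illustration, `null_merge_keys` is only the property name floated above (not an existing option), and SQLite stands in for Redshift. It covers only the "insert rows that are missing from the target" step, not the plugin's full merge.

```python
import sqlite3


def merge_condition(key_pairs, null_merge_keys=False):
    """Build the key-matching predicate for (target_col, source_col) pairs.

    Hypothetical helper: with null_merge_keys=True, two NULLs count as equal.
    """
    parts = []
    for target_col, source_col in key_pairs:
        if null_merge_keys:
            parts.append(
                f"({target_col} = {source_col} "
                f"OR ({target_col} IS NULL AND {source_col} IS NULL))"
            )
        else:
            parts.append(f"{target_col} = {source_col}")
    return " AND ".join(parts)


def insert_missing_sql(null_merge_keys):
    """Insert source rows whose key is not yet present in the target."""
    cond = merge_condition(
        [("tableB.colB1", "tableA.colA1"), ("tableB.colB2", "tableA.colA2")],
        null_merge_keys,
    )
    return (
        "INSERT INTO tableB (colB1, colB2) "
        "SELECT colA1, colA2 FROM tableA "
        f"WHERE NOT EXISTS (SELECT 1 FROM tableB WHERE {cond})"
    )


def run(null_merge_keys):
    """Run the insert against fresh sample tables; return the target row count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tableA (colA1 INTEGER, colA2 INTEGER);
        CREATE TABLE tableB (colB1 INTEGER, colB2 INTEGER);
        INSERT INTO tableA VALUES (1, NULL), (2, 20), (3, 30);
        INSERT INTO tableB VALUES (1, NULL), (2, 20);
    """)
    conn.execute(insert_missing_sql(null_merge_keys))
    return conn.execute("SELECT COUNT(*) FROM tableB").fetchone()[0]


rows_plain = run(null_merge_keys=False)      # (1, NULL) is inserted again
rows_null_safe = run(null_merge_keys=True)   # (1, NULL) is recognised as present
print(rows_plain, rows_null_safe)
```

With plain equality the `(1, NULL)` row is duplicated in the target; with the NULL-safe predicate only the genuinely new `(3, 30)` row is added. Whether the extra `OR ... IS NULL` terms hurt performance on Redshift is not measured here.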

mjalkio · 2017-09-05T01:49:47Z

Wouldn't using WHERE NOT EXISTS return us to the same problem I fixed in #194?

hito4t · 2017-09-05T02:15:21Z

The old, incorrect SQL was as follows.

  ...
  SELECT col1, col2, ...
  FROM temp
  WHERE NOT EXISTS
  (SELECT 1 FROM
    temp, target
    WHERE
      temp.col1 = target.col1 AND
      temp.col2 = target.col2 AND
      ...

The mistake was selecting from the temporary table inside the subquery, which made the subquery independent of the outer row.
The following SQL will work.

  ...
  SELECT col1, col2, ...
  FROM temp
  WHERE NOT EXISTS
  (SELECT 1 FROM
    target
    WHERE
      temp.col1 = target.col1 AND
      temp.col2 = target.col2 AND
      ...

mjalkio · 2017-09-05T02:17:57Z

I think the problem will still exist: WHERE NOT EXISTS will evaluate to either TRUE or FALSE for all rows in temp at once.

hito4t · 2017-09-07T01:55:45Z

@mjalkio
In the former SQL, the subquery matches all rows in temp against all rows in target, regardless of the outer row.
In the latter SQL, each temp row from the outer SELECT is matched against all rows in target.

CREATE TABLE temp (
  col1 int,
  col2 int
);
INSERT INTO temp VALUES(11, 21);
INSERT INTO temp VALUES(12, 22);

CREATE TABLE target (
  col1 int,
  col2 int
);
INSERT INTO target VALUES(11, 21);
INSERT INTO target VALUES(13, 23);

SELECT col1, col2
FROM temp
WHERE NOT EXISTS
(SELECT 1 FROM
  temp, target
  WHERE
    temp.col1 = target.col1 AND
    temp.col2 = target.col2);

col1         col2
--------------------------

SELECT col1, col2
FROM temp
WHERE NOT EXISTS
(SELECT 1 FROM
  target
  WHERE
    temp.col1 = target.col1 AND
    temp.col2 = target.col2);

col1         col2
--------------------------
12           22

mjalkio · 2017-09-07T05:18:17Z

Hm, you're right. That's definitely not how I imagined that query would work, but it does!

hito4t · 2017-09-12T00:56:34Z

We may support NULL merge keys in the future (#198).
Other embulk-output-jdbc plugins may not support NULL merge keys either (#199).
Anyway, I'll merge this PR.

Add note warning about NULL merge keys

9660548

This was referenced Sep 12, 2017

Support NULL merge keys #198

Open

embulk-output-mysql/postgresql/oracle/sqlserver may not support NULL merge keys #199

Open

hito4t merged commit e353d89 into embulk:master Sep 12, 2017

hito4t added this to the 0.7.12 milestone Nov 24, 2017
