Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Dataform Merge into table

i everyone,

I'm currently trying to leverage Dataform's "incremental" function to perform a merge (upsert) operation on a table (table1). The logic I need is as follows:

INSERT: If a record resulting from my SELECT statement doesn't have a matching record in table1 across all fields, it should be inserted.
UPDATE: If a matching record exists in table1 (based on all fields), it should be updated with the values from the SELECT statement.
I've written the SQL code for this merge logic

MERGE INTO `project-001-prj.Test_merge.Tab_final` AS target
USING (
  SELECT 
    Tab_a.campo1,
    Tab_a.campo2,
    Tab_a.campo3,
    Tab_a.campo4,
    Tab_a.campo5,
    CURRENT_DATE() AS dat_rif
  FROM `project-001-prj.Test_merge.Tab_a` AS Tab_a
  LEFT OUTER JOIN `project-001-prj.Test_merge.Tab_b` AS Tab_b
  ON Tab_a.campo1 = Tab_b.campojoin
) AS source
ON 
  target.campo1 = source.campo1 AND
  target.campo2 = source.campo2 AND
  target.campo3 = source.campo3 AND
  target.campo4 = source.campo4 AND
  target.campo5 = source.campo5
WHEN MATCHED THEN
  UPDATE SET
    target.dat_rif = source.dat_rif,
    target.campo1 = source.campo1,
    target.campo2 = source.campo2,
    target.campo3 = source.campo3,
    target.campo4 = source.campo4,
    target.campo5 = source.campo5
WHEN NOT MATCHED THEN
  INSERT (
    campo1,
    campo2,
    campo3,
    campo4,
    campo5,
    dat_rif
  ) VALUES (
    source.campo1,
    source.campo2,
    source.campo3,
    source.campo4,
    source.campo5,
    source.dat_rif
  )



and when I execute it directly within the Dataform development environment, it works as expected. However, when I try to include this merge operation within a Dataform workflow and run the workflow, it consistently fails with an error.

"
            FROM UNNEST(dataform_columns) AS column);

          EXECUTE IMMEDIATE
          """
            CREATE OR REPLACE PROCEDURE `project-001-prj.Test_merge.df_698f639767141fd09f0849cf6429dd9d5e5bb964719d60c8ff1be6711558757c`() OPTIONS(strict_mode=false)
            BEGIN
              

              INSERT INTO `project-001-prj.Test_merge.Tab_final` 
              (""" || dataform_columns_list || """)
              SELECT """ || dataform_columns_list || """
              FROM (
                
MERGE INTO `project-001-prj.Test_merge.Tab_final`  AS target
USING (
  SELECT 
    Tab_a.campo1,
    Tab_a.campo2,
    Tab_a.campo3,
    Tab_a.campo4,-.....
	"

 

Has anyone encountered a similar issue when using the "incremental" function for this kind of merge/upsert scenario? Could there be specific considerations or configurations within the workflow that I might be missing, which are causing the discrepancy between the development environment and the workflow execution?

 

regards

 

Solved Solved
0 1 87
1 ACCEPTED SOLUTION

I answered myself 🙂
I would leave it for others.

I don't need to write the merge since dataform itself will do it.

you need to write the select and define the uniqueKey in the cofig on which it will then do the logic.

thanks

View solution in original post

1 REPLY 1

I answered myself 🙂
I would leave it for others.

I don't need to write the merge since dataform itself will do it.

you need to write the select and define the uniqueKey in the cofig on which it will then do the logic.

thanks