Focal Point
[CLOSED] Cleanest Way To Handle Concatenating Nullable Columns

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/1381057331/m/6907022196

April 05, 2019, 09:06 AM
Scott Holmes
[CLOSED] Cleanest Way To Handle Concatenating Nullable Columns
Hello-

In a Data Migrator dataflow, I am struggling trying to figure out how best to concatenate a series of columns when any number of them can be null on any given record. In my target transformation editor, I simply use an expression like: first_name || middle_name || last_name. In the transformation calculator, the Sample Data viewer gracefully handles if middle_name is null. However, the Test Transforms viewer returns null for the whole expression, as does the actual running of the dataflow. I could not find any transform function or operator that gracefully handles the null. The only thing I found was back in my SQL object, using the COALESCE() function on every column that may contain a null value. Are there any other ways to gracefully handle null values (i.e., treat null as an empty string) that isn't so manually intensive? Maybe using a different operator than || or some system / flow setting? My real world situation is much more complicated, where I will have dozens of dataflows handling hundreds of columns that may have nulls on any given record. I will literally be writing the COALESCE() function 1000+ times.

Thanks in advance for any insight or advice.

This message has been edited. Last edited by: FP Mod Chuck,


WebFOCUS Release 82; DataMigrator Release 7707
April 09, 2019, 04:16 PM
FP Mod Chuck
Scott

I don't have data to try to replicate your issue but the one question I have is.. Does your master file have MISSING=ON on the fields in question.


Thank you for using Focal Point!

Chuck Wolff - Focal Point Moderator
WebFOCUS 7x and 8x, Windows, Linux All output Formats
April 10, 2019, 09:44 AM
Scott Holmes
Yes, MISSING=ON is present for both the source and target fields. In my (simplified) scenario, the source table has FIRST, MIDDLE, and LAST columns, any of which could be null. My target table has a FULL_NAME column. I actually know that FULL_NAME should never be null because I'll always have at least one source field populated. I just don't know which ones. The Oracle table backing my target has this column as nullable,and I'm guessing MISSING=ON is the default setting in this scenario.

Please let me know if any additional information would be helpful. Thanks.


WebFOCUS Release 82; DataMigrator Release 7707
April 10, 2019, 11:23 AM
jgelona
I'm not a DM person but have you looked at using DEFINEs in the .mas?

In your example, I would create 4 DEFINE fields in the .mas file like this:
DEFINE PFNM/A25=IF FIRST IS MISSING THEN ' ' ELSE FIRST;
DEFINE PMNM/A25=IF MIDDLE IS MISSING THEN ' ' ELSE MIDDLE;
DEFINE PLNM/A25=IF LAST IS MISSING THEN ' ' ELSE LAST;
DEFINE FULL_NAME/A75=PFNM || PMNM || PLNM;


This way you do it once and it is always available. Would something like that work for you?

Another option is:
TABLE FILE FILE1
PRINT FIRST MIDDLE LAST
   ON TABLE SET NODATA ' '
   ON TABLE HOLD MISSING OFF
END
DEFINE FILE HOLD
  FULL_NAME/A75=FIRST || MIDDLE || LAST;
END



In FOCUS since 1985. Prod WF 8.0.08 (z90/Suse Linux) DB (Oracle 11g), Self Serv, Report Caster, WebServer Intel/Linux.
April 10, 2019, 11:52 AM
FP Mod Chuck
Scott

jgelona's first idea was going to be my next suggestion as well. And it is possible to do that with DM.


Thank you for using Focal Point!

Chuck Wolff - Focal Point Moderator
WebFOCUS 7x and 8x, Windows, Linux All output Formats
April 11, 2019, 11:34 AM
Scott Holmes
Thanks for the replies. I'm very new to Data Migrator so I had to research these suggestions as much as I could. If I understand correctly, a DEFINE would be needed for each column that could be null. That would probably end up being an even swap between writing a COALESCE vs. a DEFINE for each column, in my case. I could get a certain level of reusability using a DEFINE in the synonym instead of COALESCE in a flow, but still very manually intensive.

The second example of jgelona's sounds like it might work but it's definitely above my understanding of DM right now. Is the snippet representative of a stored procedure in the FOCUS language? I searched the user guide for the keyword NODATA and couldn't find any occurrences. There are a couple of posts on these forums that give me some clues, but can anyone point me to a document / resource that I can read up on how to write my own version of the example?

Another option I came up with is to write out to a delimited file as my target and then read each row back in as a single column. In my real-world scenario, I am trying end of with a target table that has a single column that holds a pipe-delimited string of several columns from a source table (ex. if middle name missing, ...|somevalue|john||smith|anothervalue|...). Creating a pipe delimited target sidesteps the null issue (no longer concatenating values) and having to learn / write in a language I don't know yet, but of course comes at the cost of passing through a set of data two times instead of once. The performance cost may be acceptable if it minimizes the maintenance cost of supporting the flows.


Thanks again for any ideas, directions, and insight.


WebFOCUS Release 82; DataMigrator Release 7707
April 11, 2019, 02:24 PM
dbeagan
You could have:
 
 DEFINE FUNCTION concatnull(s1/A100,s2/A100,s3/A100)
 S1/A100V = IF s1 EQ MISSING THEN ' ' ELSE s1;
 S2/A100V = IF s2 EQ MISSING THEN ' ' ELSE s2;
 S3/A100V = IF s3 EQ MISSING THEN ' ' ELSE s3;
 concatnull/A300V = LTRIM(RTRIM(S1 || S2 || S3)) ;
 END
 
in your edasprof.prf or other commonly executed place.

Then you could use it, for example:
 
 TABLE FILE systable
 PRINT
 COMPUTE i/I9 = i + 1;
 -* Make up some null/missing values to test with.
 COMPUTE NAMEN/A128    MISSING ON = IF i EQ 1 THEN MISSING ELSE NAME;
 COMPUTE TYPEN/A128    MISSING ON = IF i EQ 2 THEN MISSING ELSE TYPE;
 COMPUTE CREATORN/A512 MISSING ON = IF i EQ 3 THEN MISSING ELSE CREATOR;
 COMPUTE TESTCONCAT/A50 = concatnull(NAMEN, TYPEN, CREATORN) ;
 WHERE READLIMIT IS 4
 END  



WebFOCUS 8.2.06
April 12, 2019, 08:29 AM
jgelona
Scott, actually there are 2 SET options, NODATA and HNODATA. By default, in an alpha format hold file, if the source field is null FOCUS use a period to represent a null/missing data value. NODATA and HNODATA allow you to override that. Here's a link to the 8.2.01 SET commands.

8.2.01 SET Parameter Syntax

Note: IBI is notorious for having undocumented SET commands, so this may not be a complete list.

This message has been edited. Last edited by: jgelona,


In FOCUS since 1985. Prod WF 8.0.08 (z90/Suse Linux) DB (Oracle 11g), Self Serv, Report Caster, WebServer Intel/Linux.