As of December 1, 2020, Focal Point is retired and repurposed as a reference repository. We value the wealth of knowledge that's been shared here over the years. You'll continue to have access to this treasure trove of knowledge, for search purposes only.
Join the TIBCO Community TIBCO Community is a collaborative space for users to share knowledge and support one another in making the best use of TIBCO products and services. There are several TIBCO WebFOCUS resources in the community.
From the Home page, select Predict: WebFOCUS to view articles, questions, and trending articles.
Select Products from the top navigation bar, scroll, and then select the TIBCO WebFOCUS product page to view product overview, articles, and discussions.
Request access to the private WebFOCUS User Group (login required) to network with fellow members.
Former myibi community members should have received an email on 8/3/22 to activate their user accounts to join the community. Check your Spam folder for the email. Please get in touch with us at community@tibco.com for further assistance. Reference the community FAQ to learn more about the community.
I have a request to search a db for misspelled names and the field I need to compare to is a single column that contains full name.
I have a PRSN_LST_NAME column with possible misspelled Last Name “Simth”
I need to compare it to a column on another table that has the INS_FULL_NAME ‘John Smith’.
I can't just say WHERE PRSN_LST_NAME NE INS_FULL_NAME because of the obvious. One has last name only and the other has full name so yes they are different.
What I need to do is be able to pull the last name out of the full name column. Is there a way to read this column and break it up into multiple columns using the ' 'space as a dilimiter?
If there was just First and Last name I think I could do it but we have no way of knowing how many different variations there could be.
Ex. John Smith; Mr John Smith; Mr John Smith III; Rev.John A Smith Jr III
If I could break it out I could then take the Last name column on the focus table and compare that with PRSR_LST_NAME to find only the records that are different.
Any help would be greatly appreciated.
Thanks
JVThis message has been edited. Last edited by: <Kathryn Henning>,
because "Smith" is in "Micheal W. Smith". However, as Waz pointed out, it wo'nt work for multi-word last names. But, You can parse out the fullname and search for each "piece" in the full name... Just some more fun with names...
Posts: 3132 | Location: Tennessee, Nashville area | Registered: February 23, 2005
If I use contains I will only get the names that match, which would only give the names that are correctly spelled. I need to find the misspelled names which could be in either column but not in both.
So there is no way to break the INS_FULL_NAME column out into individual columns? Ex. FNAME MNAME LNAME TNAME John A Smith JR
So there is no way to break the INS_FULL_NAME column out into individual columns?
Well, you could try and use GETTOK as Waz already suggested but this only works if you guarantee that your data follows a common and simple pattern (i.e. last name is always the second "token" and has no spaces in between) which is obviously not your case.
This requires a more advanced "parser" and you may be better off trying to have this separation implemented directly in your database but even doing it at that layer involves a similar complexity if your data was never captured separating the individual components of a name in the first place.
Not necessarily. CONTAINS PRSN_LST_NAME mean it contains all of it, including the trailing blanks. Even where the last name is truly last (no JR.), the truth value of that predicate depends on the sizes of the two fields.
If you can convert them to varchar (AnV), and get the length attribute to equal the trimmed length, it should work. Anyone know how to accomplish that?
- Jack Gross WF through 8.1.05
Posts: 1925 | Location: NYC | In FOCUS since 1983 | Registered: January 11, 2005
Here's a possible solution. Use function STRREP to replace the test value portion (last name) of your source string with a null. If a matching test value is found in the source string, it will be replaced with a null, otherwise no replacement will take place. Then compare the lengths of the source string before and after the replacement. If a null was inserted, a matching value was found and the string lengths will differ. If the string lengths are the same before and after the replacement, then no match was found. In the sample code below, I have upper-cased both source and search strings to ensure no-matches are not the result of differences in capitalization. I created DEFINE SEARCH_STR so some values will be found and some will not (i.e., JAPAN and ITALY). In your case, SEARCH_STR would be replaced by PRSN_LST_NAME. You can uncomment the WHERE clause to get a list of only those instances where no match was found.
DEFINE FILE CAR
SOURCE_STR/A15 = UPCASE(15,COUNTRY,'A15');
SOURCE_LEN/I2 = ARGLEN(15,SOURCE_STR,'I2');
SEARCH_STR/A15 = IF (SOURCE_STR EQ 'ITALY') THEN 'LAT' ELSE
IF (SOURCE_STR EQ 'JAPAN') THEN 'PAJ' ELSE UPCASE(15,EDIT(COUNTRY,'$9999$'),'A15');
STRING_LEN/I2 = ARGLEN(15,SEARCH_STR,'I2');
REPLACED/A15 = STRREP(15,SOURCE_STR,STRING_LEN,SEARCH_STR,0,'X',15,'A15');
REPLACE_LEN/I2 = ARGLEN(15,REPLACED,'I2');
END
-*
TABLE FILE CAR
PRINT SOURCE_STR SOURCE_LEN SEARCH_STR REPLACED REPLACE_LEN
-* WHERE (SOURCE_LEN EQ REPLACE_LEN);
END
Edit: Actually, here's a simpler solution using the POSIT function. Uncomment the WHERE clause to limit output to non-matching instances.
DEFINE FILE CAR
SOURCE_STR/A15 = UPCASE(15,COUNTRY,'A15');
SEARCH_STR/A15 = IF (SOURCE_STR EQ 'ITALY') THEN 'LAT' ELSE
IF (SOURCE_STR EQ 'JAPAN') THEN 'PAJ' ELSE UPCASE(15,EDIT(COUNTRY,'$9999$'),'A15');
STRING_LEN/I2 = ARGLEN(15,SEARCH_STR,'I2');
STRING_POS/I2 = POSIT(SOURCE_STR,15,SEARCH_STR,STRING_LEN,'I2');
END
-*
TABLE FILE CAR
PRINT SOURCE_STR SEARCH_STR STRING_POS
-* WHERE (STRING_POS EQ 0);
END
This message has been edited. Last edited by: Dan Satchell,
WebFOCUS 7.7.05
Posts: 1213 | Location: Seattle, Washington - USA | Registered: October 22, 2007
Why not just compare to see if Full Name Contains the Last Name?
-*-********* TO TEST
DEFINE FILE CENTHR
lname/A14=IF CENTHR.EMPSEG.ID_NUM EQ 1 THEN 'DAMKIEWICZ,' ELSE CENTHR.EMPSEG.LNAME;
END
-***** for Production uncomment
-*DEFINE FILE CENTHR
-*lname/A14=CENTHR.EMPSEG.LNAME;
-*END
TABLE FILE CENTHR
PRINT
CENTHR.EMPSEG.FULLNAME
CENTHR.EMPSEG.LNAME
CENTHR.EMPSEG.ID_NUM
COMPUTE Flag/I1 = IF CENTHR.EMPSEG.FULLNAME CONTAINS lname THEN 0 ELSE 1;
ON TABLE SET PAGE-NUM NOLEAD
ON TABLE NOTOTAL
ON TABLE HOLD AS HOLD_NAMES FORMAT ALPHA
ON TABLE SET STYLE *
$
ENDSTYLE
END
TABLE FILE HOLD_NAMES
PRINT
HOLD_NAMES.HOLD_NAM.FULLNAME
HOLD_NAMES.HOLD_NAM.LNAME
HOLD_NAMES.HOLD_NAM.ID_NUM
HOLD_NAMES.HOLD_NAM.lname
BY LOWEST HOLD_NAMES.HOLD_NAM.Flag
WHERE HOLD_NAMES.HOLD_NAM.Flag EQ 0;
ON TABLE SET PAGE-NUM NOLEAD
ON TABLE NOTOTAL
ON TABLE PCHOLD FORMAT HTML
ON TABLE SET HTMLCSS ON
ON TABLE SET STYLE *
INCLUDE = endeflt,
$
ENDSTYLE
END
The define might not be needed, you could compare right to the Last Name field, but I needed to misspell a last name to validate my logic.
~david
WebFOCUS 8.0.05 Windows- Excel, PDF
Posts: 21 | Location: Crystal City, VA & Kalamazoo, MI | Registered: October 11, 2010
Looks like all these examples are assuming I will know the name they are looking for but I won't. This is a report of misspelled names that will run bi weekly in report caster. Starting to look like this report is not freezable.
Other than trying the SOUNDEX function pointed out by Tony, I don't see how else you'd determine something as misspelt vs. a plain non-matching, unless of course you have access to some sort of dictionary table that gives you a list of possible misspellings for each "known" last name:
LAST_NAME ALTERNATIVE_LM
-----------------------------
Smith Simth
Smith Smiht
Smith Msith
Smith Misth
Jones Jnoes
Jones Jonse
Jones Ojnes
....
Having that should make it a breeze to get your records where PRSN_LST_NAME CONTAINS ALTERNATIVE_LM. Otherwise, well, interesting challenge.
Something like this will give you every possible first/last name combination from your list of full names:
TABLE FILE CAR
PRINT
COMPUTE NAME_ID/I11=LAST NAME_ID + 1;
MODEL AS NAME
COMPUTE WORD_CNT/I11 = 0;
COMPUTE WORD/A24V = '';
ON TABLE SET ASNAMES ON
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE HOLD FORMAT ALPHA
END
-RUN
-REPEAT LOOP &LINES TIMES;
-READ HOLD NOCLOSE &NAME_ID.I11. &NAME.A24.
-SET &WORD_CNT=0;
-IS_WORD
-SET &WORD_CNT=&WORD_CNT+1;
-SET &WORD=GETTOK(&NAME, 24, &WORD_CNT, ' ', 24, 'A24');
-IF &WORD EQ '' GOTO NO_WORD;
MODIFY FILE HOLD
FREEFORM NAME_ID NAME WORD_CNT WORD
MATCH NAME_ID WORD_CNT
ON NOMATCH INCLUDE
DATA
&NAME_ID,&NAME,&WORD_CNT,&WORD,$
END
-GOTO IS_WORD
-NO_WORD
-LOOP
-CLOSE HOLD
TABLE FILE HOLD
PRINT
NAME_ID
NAME
WORD_CNT
WORD
WHERE WORD_CNT GT 0;
ON TABLE HOLD FORMAT FOCUS
END
JOIN FILE HOLD AT NAME_ID TAG FN TO ALL
FILE HOLD AT NAME_ID TAG LN AS J1
WHERE FN.NAME_ID EQ LN.NAME_ID;
WHERE FN.WORD_CNT LT LN.WORD_CNT;
END
TABLE FILE HOLD
PRINT
FN.NAME_ID
FN.NAME
FN.WORD AS FIRST_NAME
LN.WORD AS LAST_NAME;
END
You can then compare that with your list of correct first/last names to find those without an exact match.
I kind of doubt WebFOCUS's SOUNDEX will work well for finding the correct spelling. Ideally you'd have something like a Levenshtein distance subroutine to find the closest match.
If you have multi-word first/last names this won't work. It would really just be a lot easier to do this in an environment with arrays and regular expressions.
If I understand what you are trying to achieve, is this on the right track ?
You will have to do more checks to see if all parts match or just some.
-* Write out a master to read the TMP_NAM1 list
EX -LINES 5 EDAPUT MASTER,TMP_NAM1,CV,FILE
FILENAME=TMP_NAM1, SUFFIX=FIX,$
SEGNAME=TMP_NAM1, $
FIELD=KEY ,ALIAS= ,I9 ,A10, $
FIELD=FULL_NAME ,ALIAS= ,A70 ,A70 ,$
-* Write out a master to read the TMP_NAM1 list
EX -LINES 6 EDAPUT MASTER,TMP_NAM2,CV,FILE
FILENAME=TMP_NAM2, SUFFIX=FIX,$
SEGNAME=TMP_NAM2, $
FIELD=KEY ,ALIAS= ,I9 ,A10, $
FIELD=FIRST_NAME ,ALIAS= ,A30 ,A30 ,$
FIELD=LAST_NAME ,ALIAS= ,A40 ,A40 ,$
-* Write out a master to read the TMP_NAM1 list
EX -LINES 3 EDAPUT FOCTEMP,TMP_NAM1,CV,FILE
1 Georg Johannes von Trapp
2 Mrs Jane Jones
-* Write out a master to read the TMP_NAM1 list
EX -LINES 3 EDAPUT FOCTEMP,TMP_NAM2,CV,FILE
1 Georg von Trapp
2 Jane Jonse
-RUN
FILEDEF TMP_NAM1 DISK tmp_nam1.ftm
FILEDEF TMP_NAM2 DISK tmp_nam2.ftm
-RUN
DEFINE FILE TMP_NAM1
NAME_IDX/I9 = 1 ;
END
TABLE FILE TMP_NAM1
BY KEY
BY FULL_NAME
FOR NAME_IDX
1 OVER
2 OVER
3 OVER
4 OVER
5 OVER
6
ON TABLE HOLD AS HLD_NAM1
END
DEFINE FILE HLD_NAM1
NAME_IDX/I9 = EDIT(E03) ;
END
TABLE FILE HLD_NAM1
PRINT FULL_NAME
COMPUTE PART_NAME/A40 = GETTOK(FULL_NAME,70,NAME_IDX,' ',40,'A40') ;
BY KEY
BY NAME_IDX
ON TABLE HOLD AS FOC_NAM1 FORMAT FOCUS INDEX KEY
END
JOIN KEY IN TMP_NAM2 TO ALL KEY IN FOC_NAM1 AS J1
DEFINE FILE TMP_NAM2
LAST_IDX/I9 = 1 ;
END
TABLE FILE TMP_NAM2
SUM MAX.FIRST_NAME
MAX.LAST_NAME
MAX.FULL_NAME
MAX.PART_NAME
BY KEY
BY NAME_IDX
PRINT COMPUTE DUMMY/A1 = ' ' ;
BY KEY
BY NAME_IDX
FOR LAST_IDX
1 OVER
2 OVER
3 OVER
4 OVER
5 OVER
6
ON TABLE HOLD AS HLD_NAM3
END
DEFINE FILE HLD_NAM3
LAST_IDX/I9 = EDIT(E07) ;
END
TABLE FILE HLD_NAM3
PRINT
FIRST_NAME
LAST_NAME
FULL_NAME
PART_NAME
COMPUTE PART_LAST/A40 = GETTOK(LAST_NAME,70,LAST_IDX,' ',40,'A40') ;
COMPUTE MATCH/A1 = IF PART_LAST EQ PART_NAME THEN 'Y' ELSE 'N' ;
BY KEY
BY NAME_IDX
BY LAST_IDX
END
Ok, thanks for all the help, but it looks like it is not going to happen this way. I was able to find another table that had a last name on it that I could compare to. Not exactly what the customer wanted but it will half to do.