Focal Point
testing for substring pattern

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/7641058331

April 27, 2005, 06:34 PM
<Lisa M>
testing for substring pattern
I currently have a 4000 character alpha field (Comments). I need to test to see if a substring matching the pattern ####-## is located in the field. Usually this number occurs last in the field (though the overall length of the field varies). Most of the time the numbers are preceded by '^'. At a minimum I need to display records containing these numbers. Ideally, I'd like to populate these numbers into a define field. Any suggestions?

Just FYI, I'm using Dev Studio 523.

Lisa Smiler
April 27, 2005, 07:19 PM
ET
The posit subroutine is probably what you need to use. I have pasted the the instructions for this function below:

POSIT: Finding the Beginning of a Substring
Available Operating Systems: All
Available Languages: reporting, Maintain
The POSIT function finds the starting position of a substring within a larger string. For
example, the beginning position of the substring DUCT in the string PRODUCTION is
position 4. If the substring is not in the parent string, the function returns the value 0.
Syntax How to Find the Beginning of a Substring
POSIT( parent, inlength, substring, sublength, outfield)
where:
parent
Alphanumeric
Is the field containing the parent character string.
inlength
Integer
Is the parent field length. If this argument is less than or equal to 0, the function
returns 0.
substring
Alphanumeric
Is the substring whose position you want to find. This can be the substring enclosed
in single quotation marks, or the field that contains the string.
sublength
Integer
Is the length of substring. If this argument is less than or equal to 0, or if it is greater
than the inlength argument, the function returns a 0.
outfield
Integer
Is the name of the field to which the position is returned, or the format of the output
value enclosed in single quotation marks.
Note: In Dialogue Manager, the format must be specified. In Maintain, the name of
the field must be specified.

Example Finding the Beginning of a String
In the following example, POSIT determines the position of the first capital letter I in
LAST_NAME, and saves the result in I_IN_NAME.
TABLE FILE EMPLOYEE
PRINT LAST_NAME AND COMPUTE
I_IN_NAME/I2 = POSIT(LAST_NAME, 15, 'I', 1, 'I2');
WHERE DEPARTMENT EQ 'PRODUCTION'
END
The output is:

LAST_NAME I_IN_NAME
--------- ---------
STEVENS 0
SMITH 3
BANNING 5
IRVING 1
ROMANS 0
MCKNIGHT 5
April 27, 2005, 07:55 PM
<Lisa M>
Thanks ET, I've tried using the '^' as the identifier, but it may appear many times in the string. Sorry, should have specified that in the previous post.

I did get the POSIT, as suggested, to count to the first '^' successfully, but when I tried using the $$$$-$$, it returns a 0 for all records.

I think that I really need to somehow do a CONTAINS test on $$$$-$$ or ^$$$$-$$, but I can't get it to work that way.

Lisa
April 27, 2005, 08:53 PM
Francis Mariani
It has to be a two-step process. One to extract the data that matches the pattern, the other to select based on the extracted data.

Here's a working example. The assumption is that there could be only one - in the data.

DEFINE FILE CAR
TESTFIELD/A4000 = IF COUNTRY IN ('JAPAN', 'W GERMANY')
THEN 'THIS IS A TEST FOR' || (' ' | COUNTRY) || ' - WITHOUT NUMERIC DATA'
ELSE 'THIS IS A TEST FOR' || (' ' | COUNTRY) || ' ' | EDIT(RPM,'$9999') | '-' | EDIT(SEATS,'$99') | ' WITH NUMERIC DATA';
-* POSIT(parent, inlength, substring, sublength, outfield)
POS/I4 = POSIT(TESTFIELD,4000,'-',1,POS);
-* SUBSTR(inlength, parent, start, end, sublength, outfield)
EXTRACTED/A7 = SUBSTR(4000, TESTFIELD, POS - 4, POS + 2, 7, 'A7');
END
TABLE FILE CAR
PRINT
TESTFIELD
POS
EXTRACTED
WHERE EXTRACTED FROM '0000-00' TO '9999-99';
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLESHEET *
TYPE=REPORT, GRID=OFF, FONT='VERDANA', SIZE=8, $
ENDSTYLE
END

TESTFIELD is sample test data.
POS determines the position of the - char.
EXTRACTED is the data that matches the string around the - char.
The WHERE statement selects only numeric data.
April 28, 2005, 02:51 PM
ET
I do not know why it does not work. Our version of focus only allows a max length for an alpha field of 256. Following is an example using the car file. All works here. You can then use the substr function already posted to extract what you need.
Is it an alpha field or a txt field? If a txt field then I think you need to define it as an alpha field first and then do the if tests on the alpha field.

DEFINE FILE CAR
RECNUM/I1 WITH COUNTRY=LAST RECNUM + 1;
TESTFIELD/A256 WITH COUNTRY =
IF RECNUM EQ 1 THEN 'BEGINS IN 14 $$$$-$$' ELSE
IF RECNUM EQ 2 THEN 'BEGINS IN 16 $$$$-$$' ELSE
'HAS NONE';
POS/I4 WITH COUNTRY =POSIT(TESTFIELD,256,'$$$$-$$',7,POS);
END
TABLEF FILE CAR
PRINT COUNTRY POS TESTFIELD
END

COUNTRY POS TESTFIELD
------- --- ---------
ENGLAND 14 BEGINS IN 14 $$$$-$$
FRANCE 16 BEGINS IN 16 $$$$-$$
ITALY 0 HAS NONE
JAPAN 0 HAS NONE
W GERMANY 0 HAS NONE

results of defines

--------------------------------------------------

TABLEF FILE CAR
PRINT COUNTRY POS TESTFIELD
IF TESTFIELD CONTAINS $$$$-$$
END

COUNTRY POS TESTFIELD
------- --- ---------
ENGLAND 14 BEGINS IN 14 $$$$-$$
FRANCE 16 BEGINS IN 16 $$$$-$$

If test with contains works

-------------------------------------------------

TABLEF FILE CAR
PRINT COUNTRY POS TESTFIELD
IF POS NE 0
END

COUNTRY POS TESTFIELD
------- --- ---------
ENGLAND 14 BEGINS IN 14 $$$$-$$
FRANCE 16 BEGINS IN 16 $$$$-$$

If test on pos field also works.

-------------------------------------------------
April 28, 2005, 08:46 PM
<Lisa M>
ET, thanks again for your suggestions. We decided to take the "90% is OK" route and have just created 2 DEFINE fields. One which determines the location of the last character. Then a second which pulls the last 7 digits out of the field. I've listed the 2 formulas below...

LENGTH/I11=ARGLEN(4000, COMMENTS, LENGTH);

SUBSTRING/A7=SUBSTR ( 4000 , COMMENTS , (LENGTH-6) ,LENGTH, 7 , SUBSTRING );

Thanks again for all the help!

Lisa Smiler