Focal Point
[CASE OPENED] Data Profiling on master file join using FOCUS Code.

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/4507033196

May 26, 2019, 06:07 AM
Neelima
Hi All,

I need to implement data profiling with FOCUS code in a fex. Please advise.

Thanks In Advance.

-Neelima



WebFocus 8104,8204
Excel/PDF/HTML/HTMLTABLE/XML/ALPHA/GIF file/GRAPH/Active technologies
May 27, 2019, 06:16 AM
Frans
PROFDATA CAR
HOLD FORMAT FOCUS AS TST

TABLE FILE TST PRINT *
END  



Test: WF 8.2
Prod: WF 8.2
DB: Progress, REST, IBM UniVerse/UniData, SQLServer, MySQL, PostgreSQL, Oracle, Greenplum, Athena.
June 02, 2019, 03:35 PM
FP Mod Chuck
Nice Thread

I had never seen that before, good to know!


Thank you for using Focal Point!

Chuck Wolff - Focal Point Moderator
WebFOCUS 7x and 8x, Windows, Linux All output Formats
June 03, 2019, 03:03 AM
Neelima
Thank you Frans!!

When I apply this code on a cluster, it takes a lot of time.
Can we improve performance by profiling only selected columns? If yes, please help me with the code.

Thank you so much!!!

-Neelima


June 03, 2019, 09:33 AM
Clif
You can profile an individual column, for example:
PROFDATA CAR.ORIGIN.COUNTRY

However, profiling multiple columns requires multiple calls, one per column.

If your signature is correct and you are still using 7.7.03, you would see improved performance by upgrading to the current release, 7.7.09. In that release, the full syntax of the command is:

PROFDATA synonym[.segment[.field]] [,,,BRIEF]  


where the new option, BRIEF, suppresses generation of the Patterns Count, Mode, and Median statistics, resulting in faster processing.
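
A minimal sketch of the "multiple calls" approach. It assumes that each PROFDATA call can be followed by its own HOLD (as in the PROFDATA CAR example earlier in this thread) and that BRIEF can be combined with a field-level reference, as the syntax line above suggests. The references are written in synonym.segment.field order using the sample CAR synonym; the hold names PROF1 and PROF2 are arbitrary. Substitute your own synonym, segments, and fields.

```focus
-* One PROFDATA call per column (assumed pattern, not verified on every release).
-* BRIEF skips the Patterns Count, Mode, and Median statistics.
PROFDATA CAR.ORIGIN.COUNTRY,,,BRIEF
HOLD AS PROF1

PROFDATA CAR.BODY.RETAIL_COST,,,BRIEF
HOLD AS PROF2

-* Display each profile.
TABLE FILE PROF1
PRINT *
END

TABLE FILE PROF2
PRINT *
END
```

If BRIEF is not available in your release, dropping the ",,,BRIEF" suffix leaves the same structure with the full set of statistics.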


N/A
June 04, 2019, 08:45 AM
Doug at Kencura
I had not used PROFDATA before, so I played with it. It could be very useful for profiling data.

Turning SQL traces on, however, you can see it runs into optimization issues when profiling SQL tables.
 SET TRACEOFF = ALL
 SET TRACEON = STMTRACE//CLIENT
 SET TRACEON = SQLAGGR//CLIENT
 SET XRETRIEVAL = ON
 SET TRACEUSER = ON 



The PROFDATA command issues multiple LST. prefix operations, which the SQL adapter cannot pass to the database. Instead of summarizing the answer set, the SQL engine returns all records and leaves the aggregation and prefix operators to WebFOCUS. Here is the message:
 (FOC2590) AGGREGATION NOT DONE FOR THE FOLLOWING REASON:
 (FOC2617) MULTIPLE LST. IN REQUEST: 


Even when profiling a single column, PROFDATA still has a SQL optimization issue:
 (FOC2590) AGGREGATION NOT DONE FOR THE FOLLOWING REASON:
 (FOC2595) ONLY ADD, SUM, CNT, AVE, MIN, AND MAX PREFIXES CAN BE AGGREGATED 



Using WF8.2.04, the BRIEF option didn't seem to help with this issue.

So perhaps somebody can tweak the PROFDATA command for SQL optimization.
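
In the meantime, a hand-written request restricted to the prefix operators named in FOC2595 might let the adapter do the aggregation. This is only a sketch under that assumption, and it covers far fewer statistics than PROFDATA produces. CAR and RETAIL_COST stand in for your own SQL-based synonym and column; the column titles are arbitrary.

```focus
-* Uses only CNT, MIN, MAX, and AVE, which FOC2595 lists as prefixes that can be aggregated.
-* CAR and RETAIL_COST are placeholders for a SQL-based synonym and one of its columns.
TABLE FILE CAR
SUM CNT.RETAIL_COST AS 'Count'
    MIN.RETAIL_COST AS 'Min'
    MAX.RETAIL_COST AS 'Max'
    AVE.RETAIL_COST AS 'Average'
END
```

Running it with the trace settings above should show whether the FOC2590 message still appears for your table.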



Sincerely,

Doug Lautzenheiser
Multiple products (FOCUS, WebFOCUS, iWay), releases (4-8.2), platforms (e.g., MVS, OpenVMS, Linux, iSeries, Windows), databases (e.g., Oracle, SQL Server, DB2, IMS, Hyperstage, etc.), integrated technologies (e.g., C/C++, R, Python)
Currently doing great things on WF8.2
June 04, 2019, 01:51 PM
Clif
DataMigrator 7.7.09 corresponds to WebFOCUS 8.2.05.


N/A
June 05, 2019, 07:42 AM
Neelima
Hi Clif, I am using 8204.

Doug,

Is there any way to find the location of the PROFDATA batch file, so that we can derive the method from there?

Thanks.


June 06, 2019, 08:04 AM
Doug at Kencura
If PROFDATA is a FOCEXEC, it is hidden well. I suspected a LET command might be changing what PROFDATA actually meant but don't see that either.

It could very well be that PROFDATA is an internal function that only IB can change. Opening a case about PROFDATA and SQL optimization might be a good idea.



Sincerely,

Doug Lautzenheiser
June 07, 2019, 02:22 AM
Neelima
Hi Doug,

OK. I have opened a case for this; let's wait for their response.

Thanks
Neelima


September 12, 2019, 04:51 AM
Frans
We once opened a case about the performance issue. The problem is that pattern, median, mode, etc. cannot be converted to SQL in most cases.

If you use the BRIEF option, it will only use optimized statements, but be sure to use the right syntax with three commas:

PROFDATA CAR,,,BRIEF
HOLD AS TST

TABLE FILE TST PRINT *
END

