Focal Point
[SOLVED] Using Join Fields Cuts my Result Set in Half

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/8447017966

March 19, 2014, 04:15 PM
J.Hines
[SOLVED] Using Join Fields Cuts my Result Set in Half
This has been perplexing me for several hours now, but the setting is pretty simple.

I have one table (call it TABLE_DATA) with many records, and I'm joining it to another table (TABLE_DATA_LOOKUP) that should return just one record.

Here's my join:
 JOIN
 T_SPS_TOA_LOT_LOAD AT T_SPS_TOA_LOT_LOAD.T_SPS_TOA_LOT_LOAD.LOAD_ID
 TO UNIQUE TABLE_DATA_LOOKUP AT TABLE_DATA_LOOKUP.TABLE_DATA_LOOKUP.LOAD_ID
 TAG J0 AS J0
 END 


I am then using the two single fields in my HEADING:
  HEADING
"TOA Lot Load"
"Client: <TABLE_DATA.TABLE_DATA.CLIENT_ID "
"CUSIP: <J0.TABLE_DATA_LOOKUP.CUSIP"
"Custody Account #: <J0.TABLE_DATA_LOOKUP.CUSTODY_ACCOUNT" 


I then set a READLIMIT for testing, and no matter what I put in for READLIMIT, it returns exactly half the results. So if I ask for 100, it gives me 50 records back. If I remove the fields from the HEADING, then it returns the full 100 records.

Does anyone have a clue what's going on? This is driving me nuts!

This message has been edited. Last edited by: J.Hines,



Prod: 8.2.0.4 OS:Windows 10 Output:AHTML, Excel 2007+

The life of a designer is a life of fight against the ugliness.
March 19, 2014, 04:22 PM
Francis Mariani
If they're in the heading, once can assume they're dimensions, so make sure you have BY fieldname NOPRINT.

JOIN
 T_SPS_TOA_LOT_LOAD AT T_SPS_TOA_LOT_LOAD.T_SPS_TOA_LOT_LOAD.LOAD_ID
 TO UNIQUE TABLE_DATA_LOOKUP AT TABLE_DATA_LOOKUP.TABLE_DATA_LOOKUP.LOAD_ID
 TAG J0 AS J0
 END 

TABLE FILE T_SPS_TOA_LOT_LOAD
SUM
...
BY TABLE_DATA.TABLE_DATA.CLIENT_ID NOPRINT
BY J0.TABLE_DATA_LOOKUP.CUSIP NOPRINT
BY J0.TABLE_DATA_LOOKUP.CUSTODY_ACCOUNT NOPRINT

HEADING
"TOA Lot Load"
"Client: <TABLE_DATA.TABLE_DATA.CLIENT_ID "
"CUSIP: <J0.TABLE_DATA_LOOKUP.CUSIP"
"Custody Account #: <J0.TABLE_DATA_LOOKUP.CUSTODY_ACCOUNT"
END



Francis


Give me code, or give me retirement. In FOCUS since 1991

Production: WF 7.7.05M, Dev Studio, BID, MRE, WebSphere, DB2 / Test: WF 8.1.05M, App Studio, BI Portal, Report Caster, jQuery, HighCharts, Apache Tomcat, MS SQL Server
March 19, 2014, 04:46 PM
J.Hines
Thanks, I gave that a try, but it still returns just half.

I also just tried changing to WHERE RECORDLIMIT EQ XXX, and that did "fix" it in the sense that all the records were returned, but I don't want to sue that because we could be talking about a difference of one- to two-hundred thousand records, and I really just want to see the first (say 5000) pulled, not wait for everything to come back first.

Also, if I take off any limit, I get back all the records correctly as well. It's just 156,000 of them, which is why I leave the full to Excel, but in the web just do a READLIMIT of 5000.

So is there something about READLIMIT that's special and causing this halving?



Prod: 8.2.0.4 OS:Windows 10 Output:AHTML, Excel 2007+

The life of a designer is a life of fight against the ugliness.
March 20, 2014, 05:04 AM
George Patton
This starts to ring some very distant bells ...

I seem to remember that if the total length of the fields in the MFD (now called a Synonym) isn't correct then Focus (as it was then) would read every other line correctly. You might check and see if you are getting the every other line syndrome - and therefore only half the records you are expecting.


WebFOCUS 7.7.05 Windows, Linux, DB2, IBM Lotus Notes, Firebird, Lotus Symphony/OpenOffice. Outputs PDF, Excel 2007 (for OpenOffice integration), WP
March 20, 2014, 05:05 AM
Danny-SRL
Jess,
Are you joining 2 SQL tables?
If so, take a look at how WF translated to SQL.
Another possibility: you use JOIN... AT and not JOIN... IN. Maybe you need a WHERE in your JOIN?


Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

March 20, 2014, 09:41 AM
J.Hines
George:
quote:
You might check and see if you are getting the every other line syndrome - and therefore only half the records you are expecting.

Thanks, but no, it's getting every line up to half the READLIMIT, then cutting off.

Danny:
I am joining 2 SQL Server tables. The Join language is what was generated by the GUI tool:
 JOIN
 INNER FILE TABLE_DATA
  AT  TABLE_DATA.TABLE_DATA.LOAD_ID
 TO UNIQUE FILE TABLE_DATA_LOOKUP 
 AT TABLE_DATA_LOOKUP.TABLE_DATA_LOOKUP.LOAD_ID
 TAG J0 AS J0
 WHERE
 ( TABLE_DATA.TABLE_DATA.LOAD_ID EQ J0.TABLE_DATA_LOOKUP.LOAD_ID ) 
 AND ( TABLE_DATA.TABLE_DATA.LOAD_ID EQ '&Load_Id' );
 END 


I also checked the SQL, and unless I'm missing something, it looks correct, as I expected. If I set WHERE READLIMIT EQ 5000 then I will only get back 2500 records, but the SQL looks like this. Running this SQL at the database does return 5000 records:
 SELECT TOP 5000 T1.LOAD_ID
(other fields)
FROM (server).dbo.TABLE_DATA T1,
(server).dbo.TABLE_DATA_LOOKUP T2
WHERE
(T2.LOAD_ID = T1.LOAD_ID) AND
(T1.LOAD_ID = 'ESRX20140319093140') AND
(T1.LOAD_ID = 'ESRX20140319093140')




Prod: 8.2.0.4 OS:Windows 10 Output:AHTML, Excel 2007+

The life of a designer is a life of fight against the ugliness.
March 20, 2014, 10:37 AM
J.Hines
Ooo, I just thought of a possible answer, but maybe others can confirm this.

READLIMIT is a limit on the number of read operations, NOT the number of records pulled (as the documentation states). So if I have a join, then for each RECORD in the left table, I'm performing two READS, one for each table, right? Thus the halving.

Now, the documentation for 7.7.0.3 also states that READLIMIT is:
quote:
a number greater than 0, and indicates the number of records to be retrieved.


What I'm primarily after with all of this is quickly returning a max of 5000 records to the user, and I was under the impression that RECORDLIMIT would pull everything, but only display 5000, which I didn't want because I don't want the user waiting forever as it cranks over say half a million rows before getting just 5000 rows back.

But I've also head from very knowledgeable people, including Bob Zinn that this is indeed what happens:
READLIMIT is translated in SQL to SELECT TOP whereas RECORDLIMIT is not, and thus you always get everything back from the table that matches your other WHERE criteria .

So then another question is this: If my table has say 1 million rows, and my WHERE key=some_value clause is pulling out a subset of 100,000 rows, is RECORDLIMIT then applied after the SQL runs?
I think this is the case since if I want 5000 records out of 150,000, doubling READLIMIT to 10000 is still much faster than doing RECORDLIMIT at 5000.

This message has been edited. Last edited by: J.Hines,



Prod: 8.2.0.4 OS:Windows 10 Output:AHTML, Excel 2007+

The life of a designer is a life of fight against the ugliness.
March 20, 2014, 12:59 PM
Tony A
The answer is yes, READLIMIT gives TOP n in MS SQL and WHERE ROWNUM = n in Oracle SQL whilst RECORDLIMIT will be applied to the result set from your query and not the query itself - which could be a tablespace scan which would explain why READLIMIT is generally faster than RECORDLIMIT against SQL sources.

T



In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
March 21, 2014, 03:51 AM
Danny-SRL
What bothers me in your assumption about the number of READS is that you say:
"Running this SQL at the database does return 5000 records".
After all, WF generates the SQL and sends it to MSSQL, so there shouldn't be a difference between what the database retrieves and what WF retrieves. Unless there is another number returned by the database which WF tests and stops the display.
A question though: I see that the condition "T1.LOAD_ID = 'ESRX20140319093140'" is issued twice. Could this have anything to do with it?


Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

March 21, 2014, 04:13 AM
Tony A
quote:
Running this SQL at the database does return 5000 records:

Missed that! Frowner

Like Danny I find that a little disconcerting.

T



In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
March 21, 2014, 03:04 PM
J.Hines
I thought so too at first, but since it's reading from two tables, it's doing two "reads" for each record that I pull, hence it stops at 2500.

I haven't tried, but I would be willing to bet that if I joined to a third table and put a ReadLimit of 33 on, I would only get back 11 rows.



Prod: 8.2.0.4 OS:Windows 10 Output:AHTML, Excel 2007+

The life of a designer is a life of fight against the ugliness.