Efficiency with large tables

Assuming two tables: TABLESMALL and TABLEBIG, where TABLESMALL has about 10 records spanning two accounts and TABLEBIG has millions of records spanning thousands of accounts, how to join most efficiently? Two methods I've tried:

1)
TABLEF FILE TABLEBIG
PRINT *
WHERE ACCOUNT EQ 'A' OR 'B' // 'A' and 'B' known to be in TABLESMALL
ON TABLE HOLD
END

JOIN ACCOUNT IN TABLESMALL TO ALL ACCOUNT IN HOLD

TABLEF FILE TABLESMALL
PRINT *
END

2)
JOIN ACCOUNT IN TABLESMALL TO ALL ACCOUNT IN TABLEBIG

TABLEF FILE TABLESMALL
PRINT *
END

-------------------------

Both methods produce the same result. Method 2 is slightly faster. I'm concerned with both speed and system resources. I'm not sure if there is a difference in resource utilization between the two methods or not. Comments appreciated.

Posts: 38 | Registered: October 10, 2006

Anatess

Guru

posted

Hide Post

If these tables are relational databases like ORACLE or SQL Server, etc. Native SQL calls might prove faster. Like this:
Sample using ODBC connection

ENGINE SQLODBC SET DEFAULT_CONNECTION myconnect
SQL SQLODBC
SELECT S.*
FROM TABLESMALL S
JOIN TABLEBIG B ON (S.ACCOUNT = B.ACCOUNT);
TABLE FILE SQLOUT
PRINT *
ON TABLE PCHOLD FORMAT HTML
END
-RUN

Hope this helps.

WF 8.1.05 Windows

Posts: 333 | Location: Orlando, FL | Registered: October 17, 2006

Ignored post by Anatess posted

Show Post

Leah

Virtuoso

posted

Hide Post

As Anatess says, SQL may be more efficient, (2nd option of yours is the more efficient if you only need the small table information.) but do you need to bring back all the columns in the tables? Bring back only what you really need to save your app server some processing space.

Leah

Posts: 1317 | Location: Council Bluffs, IA | Registered: May 24, 2004

Ignored post by Leah posted

Show Post

FrankDutch

Virtuoso

posted

Hide Post

Jud

It also depends upon the location of these tables.

Suppose the TABLEBIG is an ORACLE Database table and the TABLESMALL is on an other system/server and also an INGRESS or RMS database. The direct join might give a result if you have the time to wait.
In that case I would first select the key fields from the table small (you say "known", but I would not believe it, since 'C' could be one key too).
In that case

SET HOLDLIST=PRINTONLY
TABLE FILE TABLESMALL
BY ACCOUNT
ON TABLE HOLD AS HKEYS FORMAT ALPHA
END
-RUN
TABLE FILE TABLEBIG
PRINT needed fields
WHERE ACCOUNT IN FILE HKEYS;
ON TABLE HOLD AS HDATA FORMAT XFOCUS INDEX ACCOUNT
END

TABLE FILE TABLESMALL
PRINT needed fields
WHERE ACCOUNT IN FILE HKEYS;
ON TABLE HOLD AS HDATA2 FORMAT XFOCUS INDEX ACCOUNT
END

and then do the join

And depending on the fields in the TABLESMALL there can be an other solution.

Suppose the TABLESMALL holds only the account and the accountname, then this table is needed to decode the account to a readable name, in that case I would not join the two tables but use the small table in a DECODE function.

	*Frank*
	prod: WF 7.6.10 platform Windows, databases: msSQL2000, msSQL2005, RMS, Oracle, Sybase,IE7 test: WF 7.6.10 on the same platform and databases,IE7

Posts: 2387 | Location: Amsterdam, the Netherlands | Registered: December 03, 2006

Ignored post by FrankDutch posted

Show Post

Darin Lee

Virtuoso

posted

Hide Post

As Frank mentions, a big factor would be whether BIG and SMALL are on multiple platforms and/or databases and/or WF Servers. Second factor would be if ACCOUNT is a key field or at least indexed in BIG and SMALL. Before going into all possible scenarios, could you provide us with this information?

Regards,

Darin

In FOCUS since 1991
WF Server: 7.7.04 on Linux and Z/OS, ReportCaster, Self-Service, MRE, Java, Flex
Data: DB2/UDB, Adabas, SQL Server Output: HTML,PDF,EXL2K/07, PS, AHTML, Flex
WF Client: 77 on Linux w/Tomcat

Posts: 2298 | Location: Salt Lake City, Utah | Registered: February 02, 2007

Ignored post by Darin Lee posted

Show Post

Jason K.

Master

posted

Hide Post

select
big.field, small.field from
big left outer join small on field = field
where account in ('A','B');

Option 1)
create a bitmap (Oracle only) or EVI (DB2 only) index over account in your 'big' (aka Fact) table.

Option 2)
If you're not using either of those databases, then you'll have to create a basic btree index in order of cardinality, I'm assuming Account, Field1, Field2, whatever_fact. <- in that order. If you create this index to include the fact information [such as sales) you will not need to access the original table, generally. If you're using an iSeries, however, just throw in the towel now. Optimizing SQL on iSeries is as finicky as it gets.

Prod: Single Windows 2008 Server running Webfocus 7.7.03 Reporting server Web server IIS6/Tomcat, AS400 DB2 database.

Posts: 611 | Registered: January 04, 2007

Ignored post by Jason K. posted

Show Post

Jud

Silver Member

posted

Hide Post

Thank you for the helpful comments - much appreciated.

Changing 'print *' to 'print needed fields' made a substantial difference.

In answer to some questions, TABLESMALL is an intermediate table - a hold file saved as format focus index account. TABLEBIG resides in an MSSQL db where ACCOUNT is part of a composite key.

Physical location of db I'm not certain of but would guess on a db server running windows while focus running on another windows machine.

Posts: 38 | Registered: October 10, 2006

Ignored post by Jud posted

Show Post

Majid Jeddi

Guru

posted

Hide Post

Hi,

The Big table is located on a machine running SQL server under windowss server or workstation.

Majid.

WebFocus 7.6.5
AND WebLogic server as web server
sql2005 as database server

Posts: 273 | Location: Europe | Registered: May 31, 2007

Ignored post by Majid Jeddi posted

Show Post

GinnyJakes

Expert

posted

Hide Post

I would go with Frank's solution. Because the two tables are of different types, you are doing a hertogeneous join and WebFOCUS winds up doing all the work.

WHERE IN FILE is the way to go here.

Ginny
---------------------------------
Prod: WF 7.7.01 Dev: WF 7.6.9-11
Admin, MRE,self-service; adapters: Teradata, DB2, Oracle, SQL Server, Essbase, ESRI, FlexEnable, Google

Posts: 2723 | Location: Ann Arbor, MI | Registered: April 05, 2006

Ignored post by GinnyJakes posted

Show Post

Please Wait. Your request is being processed...

Read-Only Topic

Focal Point

Focal Point Forums

WebFOCUS/FOCUS Forum on Focal Point

Efficiency with large tables

Focal Point | Privacy Statement | Terms of Service