How to Optimize SQL SELECT Statement When Retrieving Data from Invoice History
SELECT
CPINVLIN.INV_ITM_ITM_NO, CPINVLIN.INV_ITM_INV_DATE, CPINVLIN.INV_ITM_DESC_1,
CPINVLIN.INV_ITM_REASON_CODE,
CPINVLIN.INV_ITM_DESC_2, CPINVLIN.INV_ITM_QTY_ORDER,
CPINVLIN.INV_ITM_TOT_QTY_SHP,
CPINVLIN.INV_ITM_UNIT_PRICE, CPINVLIN.INV_ITM_UNIT_COST,
(INV_ITM_UNIT_PRICE*INV_ITM_QTY_ORDER) AS ExtPrice,
CPINVLIN.INV_ITM_INV_NO,
CPINVLIN.INV_ITM_PROD_CATE,
CPINVHDR.INV_TYPE, CPINVLIN.INV_ITM_CUST_NO,
CPINVHDR.INV_SHIP_TO_NO,
CPINVHDR.INV_SHIP_TO_NAME, CPINVHDR.INV_SHIP_TO_ADDR_1,
CPINVHDR.INV_SHIP_TO_CITY,
CPINVHDR.INV_SHIP_TO_ST, CPINVHDR.INV_SHIP_TO_ZIPCD,
CPINVHDR.INV_SHIP_TO_COUNTRY
FROM CPINVHDR INNER JOIN CPINVLIN ON CPINVHDR.INV_NO
= CPINVLIN.INV_ITM_INV_NO
WHERE CPINVLIN.INV_ITM_ITM_NO='RU1022-22'
AND CPINVLIN.INV_ITM_INV_DATE>=20160101
And CPINVLIN.INV_ITM_INV_DATE<=20171130;
A - I used the same SELECT statement against my test database where CPINVHDR.BTR is 750MB and CPINVLIN.BTR is 540MB. These two tables initially are not cached in the server memory, so it took 15 minutes before it returned the data. Then I tried it second time. Since the data was already cached in server memory, this time around it took 26 seconds. Even though the second time is significantly faster, I still don’t like the fact that it took 26 seconds. So I tried to see what I could change to make it faster. The following revised SQL SELECT statement took only 2 seconds:
SELECT
CPINVLIN.INV_ITM_ITM_NO, CPINVLIN.INV_ITM_INV_DATE, CPINVLIN.INV_ITM_DESC_1,
CPINVLIN.INV_ITM_REASON_CODE,
CPINVLIN.INV_ITM_DESC_2, CPINVLIN.INV_ITM_QTY_ORDER,
CPINVLIN.INV_ITM_TOT_QTY_SHP,
CPINVLIN.INV_ITM_UNIT_PRICE, CPINVLIN.INV_ITM_UNIT_COST,
(INV_ITM_UNIT_PRICE*INV_ITM_QTY_ORDER) AS ExtPrice,
CPINVLIN.INV_ITM_INV_NO,
CPINVLIN.INV_ITM_PROD_CATE,
CPINVHDR.INV_TYPE, CPINVLIN.INV_ITM_CUST_NO,
CPINVHDR.INV_SHIP_TO_NO,
CPINVHDR.INV_SHIP_TO_NAME, CPINVHDR.INV_SHIP_TO_ADDR_1,
CPINVHDR.INV_SHIP_TO_CITY,
CPINVHDR.INV_SHIP_TO_ST, CPINVHDR.INV_SHIP_TO_ZIPCD,
CPINVHDR.INV_SHIP_TO_COUNTRY
FROM CPINVLIN, CPINVHDR
WHERE CPINVLIN.INV_ITM_ITM_NO='RU1022-22'
AND CPINVLIN.INV_ITM_INV_DATE>=20160101
And CPINVLIN.INV_ITM_INV_DATE<=20171130
AND
CPINVHDR.INV_NO = CPINVLIN.INV_ITM_INV_NO;
On the other hand, the first SQL statement caused PSQL to perform an inner join of the entire two database tables CPINVHDR and CPINVLIN first. Both tables are big, so this is why it was slow. Finally, when these two tables were joined, the system narrowed down the records with the WHERE condition. This is why the first statement was so slow.
Unfortunately, PSQL does not always know whether it should perform the join first or the where condition first to optimize performance. So we have to evaluate each scenario to influence PSQL to make the right choice in order to run faster. Most Elliott users do not have the expertise to do this kind of SELECT statement optimization, so don’t feel bad if you don't understand this article. This is what Netcellent does best, so just talk to us and we will help you.
EMK