Many users encounter situations where a particular relational engine SQL query can sometimes be slow. This can be complicated to diagnose due to a lot of reasons. Sometimes it's because of a SQL SELECT statement that's not optimized. On occasion, we receive a general question on how to write a SELECT statement that's optimized for performance. Unfortunately, that is really a million-dollar question. It is like asking how to write a program without bugs. We can only help you to optimize a SELECT statement on case by case basis.
In this article, we will give you some commonly seen scenarios that can cause SQL query performance problems. It is far from a comprehensive list. We just hope that you can draw lessons from the scenarios we provide to help you improve your relational engine performance.
Un-Optimized SQL SELECT Statement
There can be many scenarios for Un-Optimized SQL SELECT statements. Sometimes, to do it right, you need to be a DBA (Database Administrator) who understands Elliott databases really well. Therefore, do not be frustrated if you can't do this right. Netcellent is here to help you if you should be frustrated with a slow running SQL query. The following are some examples why the SELECT statement can be slow.
Not Using Index or Not Using Correctly
When you try to retrieve data from a table, you should try to use the index if possible. For example, the following SELECT statement will not yield good result:
SELECT * FROM ARCUSFIL WHERE CUS_CORR_NAME LIKE 'MORRISON%';
The reason is because CUS_CORR_NAME is not part of the index for ARCUSFIL (A/R Customer File). To find out the index of a table, you can use PCC (Pervasive Control Center), right click on the table, and choose Properties, then go to the Index tab. The following is the index for ARCUSFIL:
Therefore, you can improve the performance if you change the above SELECT statement to:
SELECT * FROM ARCUSFIL WHERE CUS_NAME LIKE 'MORRISON%';
One more trick you should be aware of is that the name save in CUS_NAME is case sensitive. Therefore, if the customer name saved in the CUS_NAME is "Morrison...", then the above SELECT statement won't work. The good thing is the value that's saved in CUS_SEARCH_NAME are all converted to upper case, therefore, the best strategy in this case is to use the following SELECT statement:
SELECT * FROM ARCUSFIL WHERE CUS_SEARCH_NAME LIKE 'MORRISON%';
While PSQL can take advantage of index when searching with begin matching, it will not be able to use the index for contain search like:
SELECT * FROM ARCUSFIL WHERE CUS_SEARCH_NAME LIKE '%MORRISON%';
Keep in mind some of these type of performance problem due to lack of index can be bridged significantly by giving more server memory so PSQL engine can cache entire database in the memory. As long as the table is cached in the memory, the above operation can still finish faster even if there is no index to operate on.
Be Aware of Combo Keys
Many indexes in Elliott are combo keys which involve multiple columns. If you specify the value of the second column of the index, but not the first column of the index, then PSQL will not be able to utilize the index. The following is an example of the index of CPINVLIN (COP Invoice Line Item File):
As you can see the invoice date column, INV_ITM_INV_DATE is part of the index KEY01, but it is the second part of that index. If we should use the following SELECT statement, it will not yield good performance:
SELECT * FROM CPINVLIN WHERE INV_ITM_INV_DATE = 20171201;
This will cause the PSQL relational engine to scan through the entire table of CPINVLIN in order to return the result. If CPINVLIN is not currently cached in the PSQL engine, then this can result in a very slow operation. If it is cached in the server memory, the peformance will probably be acceptable, but not the best because PSQL can't use KEY01 to locate the necessary records.
If you want to take advantage of the index, you will need to specify value of the first part of the index. Therefore, if you have the item information, then you can use the following SELECT statement:
SELECT * FROM CPINVLIN WHERE INV_ITM_ITM_NO = 'ABC001' AND INV_ITM_INV_DATE = 20171201;
This SELECT statement will perform very well because PSQL will be able to use KEY01 to locate its record. But this may not be what you want.
Let's say you know that the invoice number for 20171201 should be greater than 150000. Then you can use the following SELECT statement:
SELECT * FROM CPINVLIN WHERE INV_ITM_INV_NO > 150000 AND INV_ITM_INV_DATE = 20171201;
This will cause the PSQL relational engine use the KEY00 index first to look for any records where INV_ITM_INV_NO is greater than 150000, and then matching on the secondary condition INV_ITM_INV_DATE = 2171201. If there are not a lot of records after invoice number 150000, then this will be fast. If there are still a lot of records, then you can specify a range by using BETWEEN:
SELECT * FROM CPINVLIN WHERE INV_ITM_INV_NO BETWEEN 150000 AND 160000 AND INV_ITM_INV_DATE = 20171201;
The Strategy of Using Join
If the above alternative solution does not work well for you and you just want to see all the invoice line items that were invoiced on 12/01/2017, then you should explore using a different strategy. The following is the index for CPINVHDR (Invoice Header):
As you can see, the INV_DATE is equal to the KEY03 index. Therefore, if you want to get all the invoice records from CPINVHDR, you can easily do so by using:
SELECT * FROM CPINVHDR WHERE INV_DATE = 20171201;
This one will run very quickly. But wait. We want the record from CPINVLIN, not CPINVHDR. Therefore, we need to use the JOIN operation to get both records from CPINVHDR and CPINVLIN. The following is how we join these two tables together:
SELECT CPINVLIN.* FROM CPINVHDR, CPINVLIN WHERE INV_DATE = 20171201 AND INV_ITM_INV_NO = INV_NO;
Note that we specify the two tables as "CPINVHDR, CPINVLIN," not "CPINVLIN, CPINVHDR." This is because we want to use the index of CPINVHDR first to narrow down the records as INV_DATE = 20171201. Then, based on a small subset of records, we want to join to CPINVLIN by using the condition "INV_ITM_INV_NO = INV_NO."
Note that we prefer to inner join the two tables through the WHERE condition clause, instead of using the INNER JOIN ... ON clause. You should put the table that can quickly yield a small subset of records on the left, then join to the other table on the right.
The following is an example of a SELECT statement that will accomplish the same as previous one, but the performance will not be as good:
SELECT CPINVLIN.* FROM CPINVLIN INNER JOIN CPINVHDR ON INV_ITM_INV_NO = INV_NO WHERE INV_DATE = 20171201;
The reason for the slow down is because the system will perform the inner join from CPINVLIN to CPINVHDR in its entirety first. Since both of these two tables tend to be big, this will take times even though systems peform join by using index. This can be especially slow if these two tables are not cached in the memory yet.
Using of Views
Note that there is a view CPINVLIN_VIEW, which joins the CPINVLIN with the CPINVHDR table. For accessing large tables, it is our recommendation that if you can perform the join yourself instead of relying on the view, it will yield better results. This is because you can optimize your join when you do your own join, while that may not be the case when you use a view. To find out how we join to create the pre-dfined views, please look for the following files:
- V7.x - <ElliottRoot>\DDF40\Eli7View.SQL
- V8.x - <ElliottRoot>\Bin\DDF40\Eli8View.SQL
The following is an artilce on how a user was trying to retrieve Notes and Invoice header data through NOTE_INV_VIEW and could not get good performance. The problem is resolved by doing the join manually:
It has come to our attention that in one case the user had poor PSQL relational engine peformance due to the DDF being altered (e.g., the index portion of the DDF). The problem was confirmed when a database was created using the standard Elliott DDF and the same query compared against the new database created using the standard DDF: it ran siginficantlly faster than the one that's damaged. To create a new database, see the following article:
It would be best if your server has enough memory that can cache your entire database. To figure your Elliott database size, just look at your <ElliottRoot>\DATA folder for all the *.BTR files size. You need to count all companies' database sizes if you are actively using multiple companies. Let's say your data files size is 20GB and your server has a 32GB capacity; then you have enough memory to cache all your data. In that case, your PSQL server can perform reasonably well even if you perform query filtering on the none index column. On the other hand, if you don't have enough server memory to cache your entire Elliott database, then when you perform query filtering on the none index column, your performance can be poor.
Level 1 vs. Level 2 Cache Memory
By default, PSQL will use 20% of the server memory for Level 1 Cache, and up to 60% for Level 2 Cache. Level 1 Cache is non-compressed and Level 2 Cache is compressed. While Level 2 Cache will allow you to fit more data in the cache, because is it compressed, it will be slower. If you have enough server memory, then we suggest that you set 80% of your server memory for Level 1 Cache and 0% for your Level 2 Cache. See the following Knowledge Base article for more details:
Some Elliott users use virtual servers. Some IT departments like to assign the virtual server starting with a low number of CPU. The thinking is if it is not enough, then that can be easily changed because of the virtualization environment. In that case, you should monitor your PSQL server performance and watch the CPU utilization rate. If your CPU utlization rate is constantly over 50%, then you should consider adding more CPU to handle load fluctuation better. Just as an example, for a user with 150 Elliott user licenses that also has websites constantly accessing Elliott data, as well running Crystal Reports, we have noticed 4 CPU is somewhat on the low side for the PSQL server. Six to 8 CPU would be more appropriate in that case.