Now it became time to conveert some SQL statements. Even though many SELECT SQL statements will run unchanged in MYSQL it might be wise to check and see if they are performing well. There are some differences in how VFP and MYSQL optimize queries and a query that just performs great in VFP might be very slow in MYSQL. And then there are some cases where we have to replace things that are possible in VFP but not supported in MYSQL.
First lets assume that our tables all have lots of records. After all it does not pay to spend time on optimizing on tables with a few dozend records. I also will not go into VFP optimization here as this is a MYSQL blog.
Lets assume for our first case we might have a table with invoice detail records and a table with items that are on sale. So now we want to get maybe 2 queries. One to show a total of all the items that were on sale and the second a total of all the items not on sale. I will leave date selections etc out of this as I just want to focus in on some performance issues
In VFP we might have had something like
SELECT <fieldlist> FROM detail det WHERE INDEXSEEK(det.item, .f., “promotions”, “item”)
SELECT <fieldlist> FROM detail det WHERE .NOT. INDEXSEEK(det.item, .f., “promotions”, “item”)
respectively depending which set we want. In VFP it was quite fine to use and INDEXSEEK in the Where clause. I sometimes used it in place of a join as it seemed to work faster and in previous version of VFP was a way to get around limits in the number of JOINS and the complexity of a query vfp supports.
In MYSQL now you could replace the INDEXSEEK with an IN or EXISTS(SELECT….) Subquery so the query rewritten will look something like
SELECT <fieldlist> FROM detail det WHERE EXISTS (SELECT * FROM promotions p WHERE p.item = det.item)
Or the “NOT EXISTS” version for the oposite result. You could also use IN but in some cases that might even be more inefficient as IN retrieves the whole matching subset and EXISTS stops at the first occurance.
If we run the above query through “EXPLAIN” to see what MYSQL is doing we will see that on ther “SELECT <fieldlist> FROM detail det” it will perform a table scan and then the item index of the promotions table. So having a large “details” table that has also an “item” index this is not very efficient. In MYSQL the way to improve on that is to rewrite the query as a join. like
SELECT <field list> FROM detail det JOIN promotions p ON det.item = p.item
In this case MYSQL will use both item indexes and only retrieve the items matching. It gets a little bit more complicated with the case where we want to select items that have no promotions but we can do a join there too with
SELECT <field list> FROM detail det LEFT JOIN promotions p ON det.item = p.item WHER p.item = NULL
This one again will make the selection on the index level and depending on the amount of records you are dealing with result in great performance gains.
Lets now look at another scenario. Lets assume we have a invoice master table with as “balance” field and we want to filter out as part of a query all records that are open so we are looking for “BALANCE <> 0″ In vfp we created an index on balance and that improved the statement
SELECT <fieldlist> FROM master WHERE balance <> 0
drastically. Now if we do the same in mysql and we check with
EXPLAIN SELECT <fieldlist> FROM master WHERE balance <> 0
We will find that MYSQL is doing a table scan. MYSQL seems to have a problem with “<> <keyfield>”
You can get a slight improvement by
SELECT <fieldlist> FROM master WHERE balance < UNION SELECT <fieldlist> FROM master WHERE balance > 0
As it seems MYSQL has an easier time using the index with just the > and the < and a UNION but you are running into unnecessary overhead there too. What I have found most efficient so far is to add a field to the table
And then create an index on this field and set it in the BEFORE UPDATE and BEFORE INSERT triggers with
IF NEW.balance = 0 THEN SET NEW.unpaid = 0; ELSE SET NEW.unpaid = 1; END IF;
THE sql statement now looks like
SELECT <fieldlist> FROM master WHERE unpaid = 1
I used the BIT type field as it nicely translates into a VFP logical field with .f. for 0 and .t. FOR 1. But you probably can achieve the same by using an TINYINT or CHR(1) or BOOLEAN field.
The performance improvement was drastic.
Beyond this it is always good to remember to
1.) make sure you have indexes on the fields in the “WHERE” clause and they are being used. USE “EXPLAIN ….” in the MYSQL query editor to find out.
2.) keep the row count to a minimum especially if you are using subqueries. So it often makes sense to first cut the number of records down with a subquery. I will go into detail on this in a seperate blog
As always comments/suggestions are welcome as this is all work in progress