Converting and optimizing SQL Statements

Now it became time to conveert some SQL statements. Even though many SELECT SQL statements will run unchanged in MYSQL it might be wise to check and see if they are performing well. There are some differences in how VFP and MYSQL optimize queries and a query that just performs great in VFP might be very slow in MYSQL. And then there are some cases where we have to replace things that are possible in VFP but not supported in MYSQL.

First lets assume that our tables all have lots of records. After all it does not pay to spend time on optimizing on tables with a few dozend records. I also will not go into VFP optimization here as this is a MYSQL blog.

Lets assume for our first case we might have a table with invoice detail records and a table with items that are on sale. So now we want to get maybe 2 queries. One to show a total of all the items that were on sale and the second a total of all the items not on sale. I will leave date selections etc out of this as I just want to focus in on some performance issues

In VFP we might have had something like

SELECT <fieldlist> FROM detail det WHERE INDEXSEEK(det.item, .f., “promotions”, “item”)

SELECT <fieldlist> FROM detail det WHERE .NOT. INDEXSEEK(det.item, .f., “promotions”, “item”)

respectively depending which set we want. In VFP it was quite fine to use and INDEXSEEK in the Where clause. I sometimes used it in place of a join as it seemed to work faster and in previous version of VFP was a way to get around limits in the number of JOINS and the complexity of a query vfp supports.

In MYSQL now you could replace the INDEXSEEK with an IN or EXISTS(SELECT….) Subquery  so the query rewritten will look something like

SELECT <fieldlist> FROM detail det WHERE EXISTS (SELECT * FROM promotions p WHERE p.item = det.item)

Or the “NOT EXISTS” version for the oposite result. You could also use IN but in some cases that might even be more inefficient as IN retrieves the whole matching subset and EXISTS stops at the first occurance.

If we run the above query through “EXPLAIN” to see what MYSQL is doing we will see that on ther “SELECT <fieldlist> FROM detail det” it will perform a table scan and  then the item  index of the promotions table. So having a large “details” table that has also an “item” index this is not very efficient. In MYSQL the way to improve on that is to rewrite the query as a join. like

SELECT <field list> FROM detail det JOIN promotions p ON det.item = p.item

In this case MYSQL will use both item indexes and only retrieve the items matching. It gets a little bit more complicated with the case where we want to select items that have no promotions but we can do a join there too with

SELECT <field list> FROM detail det LEFT JOIN promotions p ON det.item = p.item WHER p.item = NULL

This one again will make the selection on the index level and depending on the amount of records you are dealing with result in great performance gains.

Lets now look at another scenario. Lets assume we have a invoice master table with  as “balance” field and we want to filter out as part of a query all records that are open so we are looking for “BALANCE <> 0″ In vfp we created an index on balance  and that improved the statement

SELECT <fieldlist> FROM master WHERE balance <> 0

drastically. Now if we do the same in mysql and we check with

EXPLAIN SELECT <fieldlist> FROM master WHERE balance <> 0

We will find that MYSQL is doing a table scan. MYSQL seems to have a problem with “<> <keyfield>”

You can get a slight improvement by

SELECT <fieldlist> FROM master WHERE balance < UNION SELECT <fieldlist> FROM master WHERE balance > 0

As it seems MYSQL has an easier time using the index with just the > and the < and a UNION but you are running into unnecessary overhead there too. What I have found most efficient so far is to add a field to the table

unpaid BIT

And then create an index on this field and set it in the BEFORE UPDATE and BEFORE INSERT triggers with

IF NEW.balance = 0 THEN SET NEW.unpaid = 0; ELSE SET NEW.unpaid = 1; END IF;

THE sql statement now looks like

SELECT <fieldlist> FROM master WHERE unpaid = 1

I used the BIT type field as it nicely translates into a VFP logical field with .f. for 0 and .t. FOR 1. But you probably can achieve the same by using an TINYINT or CHR(1)  or BOOLEAN field.

The performance improvement was drastic.

Beyond this it is always good to remember to

1.) make sure you have indexes on the fields in the “WHERE” clause and they are being used. USE “EXPLAIN ….” in the MYSQL query editor to find out.

2.) keep the row count to a minimum especially if you are using subqueries. So it often makes sense to first cut the number of records down with a subquery. I will go into detail on this in a seperate blog

As always comments/suggestions are welcome as this is all work in progress

Tags: , , ,

3 Responses to “Converting and optimizing SQL Statements”

  1. [...] Converting and optimizing SQL Statements Now it became time to conveert some SQL statements. Even though many SELECT SQL statements will run unchanged in MYSQL it might be wise to check and see if they are performing well. There are some differences in how VFP and MYSQL optimize queries and a query that just performs great in VFP might be very slow in MYSQL. And then there are some cases where we have to replace things that are possible in VFP but not supported in MYSQL. [...]

  2. [...] Converting and optimizing SQL Statements « Martin's MYSQL Blog [...]

  3. George says:

    Why is this query giving a error message ?

    select 1 where exists (select * from tablename where condition)

    If I use
    select 1 From table name1 where exists(select * from tablename2 where condition)
    IT’S WORKING but this doesn’t help.

    I don’t want a FROM in the first part of the query because I’m importing these from VFP where they are accepted and it’s impossible to change them all.

    I’m using ANSI mode. Could this be the problem ?

    Thanks.

Leave a Reply