SQL Anti-Patterns

August 27th, 2010

When tuning SQL queries, sometimes I see these patterns, and I can generally speed them up with these solutions:

“EXISTS” beats “IN”.

If you’re writing a query like:

 SELECT *
FROM foobar
WHERE foobar.bar_id IN
(  SELECT bar.id
   FROM bar)

This query will often be faster (and I’ve never seen it slower)

SELECT *
FROM foobar
WHERE EXISTS
(  SELECT *
   FROM bar
   WHERE foobar.bar_id = bar.id)

No DISTINCT in subqueries

“Distinct” causes two extra steps (sort and filter).  If you’re writing a query like this:

 SELECT *
FROM foobar
WHERE EXISTS
(  SELECT DISTINCT *
   FROM bar
   WHERE foobar.bar_id = bar.id)

The distinct is unnecessary and may not be “optimized away”  by the database engine.

 

Avoid Subqueries Where Possible

Relational engines are built to do joins.   So you an often (but not always) rewrite an “in” subquery as a join.   Especially if the subquery is on the “one” side of a “one-to-many” relationship with the outer query.

 

Only Correlate 1 Level Deep

If you’re writing code like this:

SELECT *
FROM foobar
WHERE EXISTS
(  SELECT *
   FROM bar
   WHERE foobar.bar_id = bar.id
   AND EXISTS
   (  SELECT *
      FROM snafu
      WHERE bar.id = snafu.bar_id)
   )

The interior queries can be flattened into a join, which will be quicker.

 SELECT *
FROM foobar
WHERE EXISTS
(  SELECT *
   FROM bar
   JOIN snafu ON bar.id = snafu.bar_id
   WHERE foobar.bar_id = bar.id
)
Avoid Applying Functions to Columns in WHERE Clauses

This query:

SELECT *
FROM foobar
WHERE LEFT(foobar.name, 5) = 'Chris'

May be slower (and will never be faster) than:

SELECT *
FROM foobar
WHERE foobar.name LIKE 'Chris%';

(This is because in the first query, an index on foobar.name may not be usable, since the engine has to change the value of the column to do the comparison.  It’s no longer a “search argument” or “SARG”)

Avoid “select from ( select)”

I’ve never seen this be fast.

SELECT *
FROM
(  SELECT *
   FROM foo
)

You can take whatever you have been doing in the two statements and flatten them into a traditional query using joins and aggregates.

And lastly…

Complicated “WHERE” Clauses Are Slow

You can’t always get around it – sometimes you’re asking a complicated question.   But the more complicated the question, the more likely the database server is going to give up and just brute-force the solution.   Sometimes you can do better by breaking the query into a series of simpler queries.  Options for doing this are:

  1. Using a temporary table.
  2. Using UNION (or UNION ALL) instead of OR

manipulating mysql system catalogs

July 27th, 2010

collected this information from a bunch of places.  I was trying to do a bulk conversion of tables from MyISAM to InnoDB:

First good thing:  information_schema.TABLES:

SELECT *
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'zga_billing_automation_ccc'
AND TABLE_TYPE = 'BASE TABLE'
AND ENGINE = 'MyISAM';

Second thing…concat() operator.  Apparently the double-pipe operator only works if you have ANSI/PIPES_AS_CONCAT set, and I can’t be bothered.

SELECT CONCAT('alter table ', TABLE_NAME,  ' engine = innodb;')
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'zga_billing_automation_ccc'
AND TABLE_TYPE = 'BASE TABLE'
AND ENGINE = 'MyISAM';

So in SQLYog you can do “copy all rows to clipboard” and then paste back into the query window and execute.

ALTER TABLE auth_group ENGINE = INNODB;

ALTER TABLE auth_group_permissions ENGINE = INNODB;

ALTER TABLE auth_message ENGINE = INNODB;

ALTER TABLE auth_permission ENGINE = INNODB;

ALTER TABLE auth_user ENGINE = INNODB;

ALTER TABLE auth_user_groups ENGINE = INNODB;
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = ‘zga_billing_automation_ccc’
AND TABLE_TYPE = ‘BASE TABLE’
AND ENGINE = ‘MyIS

wow, chrome!

December 9th, 2009

well, I just switched my default browser on the Mac to Chrome from Firefox.  About the only thing that I was waiting for was Firebug, and it’s not needed — there are developer tools built right into Chrome.

I’m very impressed.

__del__ and python classes

December 3rd, 2009

How could I write Python for 12 years without knowing this?

Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it.

Well, I guess that never having __del__ work the way I thought it did has saved me from many nasty memory leaks.  And learning about this has solved another one.  I guess there’s always something new to learn.

Microsoft (finally)

September 10th, 2009

Well, I’m finally learning the MS stack, so I can add that to the Python/LAMP stuff that I’ve been doing for so long.  I’m certainly more clear on what the general .NET architecture is (and where Mono fits as well).  C# is mostly nice, if you can get past the static typing.  Typing “Foo foo = new Foo()” gets old, and the patterns for declaring and allocating arrays just will not stick in my brain.

Either I’m missing something really basic, or the interface from C# to ASP.NET is just silly.  I can’t figure out an easy way to allow a designer to create pages in HTML/CSS/JS and then tie them in with your C# code.  C# to the database is also still odd, LINQ is a mess.  I’m glad that there is a port of Hibernate to use as an ORM and the Castle ActiveRecord libraries to limit the number of “XML sit-ups” that I have to do.  (Although I’ve only been able to make ActiveRecord talk to MS SQL, and not MySQL, which is not encouraging.)  I’m really quite shocked that there is no JSON support in the core libraries.

I’ve had good luck working with SharpDevelop as an Open Source tool for developing Windows-based (aka Winforms) applications.  Sadly, SharpDevelop does not work when developing ASP.NET web-based applications, and it looks like that is not even on the horizon.   The free  MS “Express” editors are not bad, even though the MS IntelliSense always seems to pick the wrong thing for me.   I can’t quite figure out why one would buy Visual Studio when the Express versions are there.  I’m sure I will find out someday.  In any case, the projects seem to be so hopelessly intertwined that you can’t really develop without an IDE.  I’m also trying to figure out how you develop in teams with these toolkits — I can’t figure out what should go into source control and what should be local.   I do have unreserved kudos for the “build a distribution package” for Winforms applications.

So anyway, I’ve got my C# application talking JSON to a Django web service, then pulling data and populating an ASP.NET page and an MS-SQL database.  If nothing else, this app is buzzword-compliant!

sum(distinct)?

May 19th, 2009

Wow ,  I learned something new about SQL today.  Apparently, you can now use the “distinct” keyword inside sum().  Like this;

mysql> create table foo (a int);
Query OK, 0 rows affected (0.05 sec)
mysql> insert into foo (a) values (1);
Query OK, 1 row affected (0.11 sec)
mysql> insert into foo (a) values (1);
Query OK, 1 row affected (0.02 sec)
mysql> select sum(distinct a)
    -> from foo;
+-----------------+
| sum(distinct a) |
+-----------------+
|               1 |
+-----------------+
1 row in set (0.02 sec)

I’m not sure how this is useful, but it’s interesting.

hints for recruiters

April 30th, 2009

I’ve just lined up another client (who may become a full-time employer) after a four-month “unscheduled vacation.”  I’m very happy to be working (and earning!) again.  During my time off, I dealt with a lot of recruiters.  Most were reasonable.  Some were great.  A few were jerks.

Feedback: this is the most important thing I need from recruiters.  Look at my resume.  Make some constructive comments, so that it improves, and so that I know you’ve read it.  (I swear, I’m going to put together a resume with a line buried in there saying “Hiring manager, this recruiter did not read this resume.  You’re not getting value for your dollar.”)

Don’t make me call you. This really angers me.  When I take time off, get dressed up, travel to an interview and spend time with a hiring manager, then I have to call you to find out how it went.  I know I’m not going to get every job — but don’t leave me hanging, wondering what’s going on .

Do your own work. There seems to be a pattern developing where a hiring manager gives a req to Recruiter A, who then farms it out to Recruiters B and C, who in turn send it to Recruiters D, E, F and G.  Then I see the same job posted on every hiring board (nobody checks the individual job boards — we all use aggregators, like Indeed).  And then when I pick a lucky winner to talk to, there are three recruiters that all want to get paid, which makes me too expensive for the customer, and everyone loses.

An occasional call works wonders. Of all the recruiters I worked with, I had exactly one that ever called me just to “touch base.”  I like this guy.  I will be calling him the next time  I need work.  Or just to “touch base.”

django content management systems

April 20th, 2009

Today’s effort is to see if there is an open-source CMS for Django.  (I know that the Django authors have Ellington, but that’s a for-pay one, and I’m trying to bootstrap a business here.)

So I find a “checkbox matrix” at http://code.djangoproject.com/wiki/CMSAppsComparison, and while that’s a crappy way of picking software, and it may be out-of-date, I’ll start there and try a few.

PyLucid installs fine.  But once I get it installed, I can’t figure out what to do with it.  I can’t figure out how to even add a new page to the system.  The authors say that “writing documentation is not as much fun as writing code”, which is true, but a very bad sign.  I’ll try to post a question in their forum, as of now, I’m waiting to be approved to use the forum.

DjangoCMS installs pretty easy, and it appears that you get a really basic web page out of it (which you then use CSS to style, I suppose), but I realize that my target audience is really going to want in-place editing — I can’t confuse them with an admin interface and a public interface.

So I’ll try django-page-cms, which seems to have the most recent active development and commentary in the forum.  But this is very alpha software…I have to start installing package dependencies from subversion repositories, and things are just not working.  I have posted a question in their forum, will see if I get any response.

using jQuery for an autocomplete field

April 15th, 2009

I had trouble figuring this out, so here it is…

  • Download the “autocomplete” library from http://docs.jquery.com/Plugins/Autocomplete .  Unzip it and put the contents (js and css files) somewhere where your web server can serve them.
  • Create your HTML to look like this:
<input type="text" class="ac_input" id="teams" />
  <div class="ac_results" style="display: none; position: absolute; width: 150px; top: 23px; left: 93px;">
    <ul style="overflow: auto; max-height: 180px;">
       <li class="ac_even"></li>
       <li class="ac_odd"></li>
    </ul>
  </div>
  • In the header of your page, put this (your URLs may differ, test them directly).  The first line pulls in the styles for the formatting.  The second pulls in the JavaScript for the autocomplete.  The script that is there (and should probably be in its own file) links the autocomplete code to your input field.:
  <link rel="stylesheet" type="text/css" href="/site_media/jquery-autocomplete/jquery.autocomplete.css" media="screen"/>
  <script type="text/javascript" src="/site_media/jquery-autocomplete/jquery.autocomplete.js"></script>
  <script type="text/javascript">
    $(document).ready(function() {
      var url = "/hoops/team_names";
      $("#teams").autocomplete(url, { limit : 10 } );
    });
  </script>
  • Now you have to create the code that will answer to /hoops/team_names/?q=[whatever is typed into the box]&limit=10.  In my Django app server, it looks like this:
from django.http import HttpResponse
def team_names(request):
    try:
        limit = request.GET['limit']
    except:
        limit = 10

    name = request.GET['q'].title()
    teams = Team.objects.filter(name__startswith=name)[:limit]
    ret = "\n".join([team.name for team in teams])

    return HttpResponse(ret)

Note that the responses are sent as a series of strings separated by newlines. I tried and tried to send back JSON-formatted data, but I can’t make it work. Oh well, this is “good enough”.

April 13th, 2009

Looks like I backed the unpopular horse with Pylons, so I’m building a personal site with Django.  Hmm, can I show it to the world?