Multi-Table Update in MySQL


I think one thing I love about web development is that I'm always learning something new! Even with my database skills, just when I think I “know it all” (HA HA HA), I learn some new trick! This week, I had to write some scripts to move some data around for a site as the upgrade will include a DB change. Basically we were using a field, position ID, as a foreign key in a large number of tables, but now we need to use a unit ID, so we have to retroactively add the unit ID to the referring tables.

Normally, you'd have to write out code to pull the new/old IDs, then loop through the tables and do a huge batch of update statements:

<cfset tablesToAddTo = "plans,reports,archive_plans,archive_reports,plans_to_teammembers,archive_plans_to_teammembers,reports_to_participants,archive_reports_to_participants" />

<cfquery name="qUnitIDs" datasource="#dsn#" cachedwithin="#CreateTimeSpan(1,0,0,0)#">
    SELECT positionid, unitid
    FROM temp_unitpositions
    ORDER BY unitid
</cfquery>

<cfloop query="qUnitIDs">
    <cfloop list="#tablesToAddTo#" index="thisTable">
        <cfquery datasource="#dsn#">
            UPDATE #thisTable#
            SET units_unitid = #unitid#
            WHERE positions_positionid = #positionid#
        </cfquery>
    </cfloop>
</cfloop>

<p>Finished setting the unit IDs in all tables</p>

This requires a crazy number of database connections, especially when we were doing this on a list of twelve tables, two of which are pushing the million record mark. It takes a long time to run (well over ten minutes if it doesn't time out) and it eats up system resources. But it was the way I knew how to do it….until I found myself wondering if I couldn't somehow do an update statement with a subselect for each table instead. It seemed like it would be faster if I just do something like this:

<cfset tablesToAddTo = "plans,reports,archive_plans,archive_reports,plans_to_teammembers,archive_plans_to_teammembers,reports_to_participants,archive_reports_to_participants" />

<cfloop list="#tablesToAddTo#" index="thisTable">
    <cfquery datasource="#dsn#">
        UPDATE #thisTable#
        SET units_unitid = (SELECT unitid FROM temp_unitpositions WHERE positions_positionid = #positionid#)
        WHERE positions_positionid = #positionid#
    </cfquery>
</cfloop>

Well, that didn't work, but it triggered me to head to Google, wondering if anyone else had done something similar, and that's when I learned about multi-table update statements! Freaking awesome! It took that mess above to just:

<cfset tablesToAddTo = "plans,reports,archive_plans,archive_reports,plans_to_teammembers,archive_plans_to_teammembers,reports_to_participants,archive_reports_to_participants" />

<cfoutput>
    <cfloop list="#tablesToAddTo#" index="thisTable">
        <cfquery datasource="#dsn#">
            UPDATE #thisTable#,temp_unitpositions
            SET #thisTable#.units_unitid = temp_unitpositions.unitid
            WHERE #thisTable#.positions_positionid = temp_unitpositions.positionid
        </cfquery>
       
        <cfquery name="checkCount" datasource="#dsn#">
            SELECT COUNT(*) as NumNull
            FROM #thisTable#
            WHERE units_unitid IS NULL
        </cfquery>
       
        <p>Finished setting the unit IDs in the #thisTable# table - #checkCount.NumNull# missing thier IDs</p>
    </cfloop>
</cfoutput>

Even with the added feedback on whether any records were missed and doing the full list of tables, the whole thing ran in milliseconds! It could even be run on the DB directly if you wanted to skip the ColdFusion part, which is a nice bonus.  

So far, the main places I could see this being useful are in cases like this, when I'm dealing with scripts to update content from one table with values from another to deal with a DB change.  Still, I suspect it may have much wider usage, and even if this does end up being the only time I use it, it was just what I needed at the time, saved us a lot of time and server resources both during testing and when we aunch this patch and server resources. I love finding new tools!