Monday, June 4, 2012

SQL Azure speed test - multiple inserts

I ran an interesting test on SQL Azure database tonight.  I'm trying to insert about 3.5 million random test records into a table with a script.  The script was initially calling each INSERT statement individually.  When I tested locally, it went quite fast, like about 1000 inserts per second.  I'd seen some posts where people had estimated that speed for SQL Server (not Azure) so I thought I'd achieve the same on Azure.

What I found was that I only achieved about 122 individual insert statements per minute on SQL Azure.  And adding to that frustration was that the script that generated the INSERTS kept timing out after 2 minutes, so I had to sit and keep refreshing it. I ran it about 8 times before I realized that it would probably be much faster to put a long list of values into one insert statement rather than inserting one row per statement.  That was confirmed by this post on StackOverflow.  But I had already started a table to measure results, so I ran it a couple more times for nine tries total.  The results are below.

When I changed my script to insert about 300 rows per INSERT, that put me over 200 rows per second, but still not as fast as I was hoping.  At least I changed the script timeout as I learned from this post so that I could refresh the page a little less than every 2 minutes.

(Update: After I later switched to 1000 rows per INSERT, I randomly got 3 trials of really good results, almost 1000 rows inserted per second! Then it was back to below 200 again the next time. Maybe it has to do with server load and I got lucky to finally hit a dead patch for half an hour.)

Note that SQL Azure is not dependent on the size or number of instances that you choose for your compute services.  I don't believe there is any way to customize the hardware specs for SQL Azure.  But of course you can tune your own database with indexes, etc.

I was also interested in the storage size of my table since SQL Azure charges by storage space. My table is pretty small with the following field types (sizes acquired from this helpful post):
int (PK) (4 bytes)
smallint (2 bytes)
float (4 bytes)
float (4 bytes)
smalldatetime (4 bytes)
int (4 bytes)
int (4 bytes)

Seems like it should only be 26 bytes per row.  Instead, I was getting about 68 bytes per row, over double what I expected!  And that was before I added another index.  Adding one index with two fields bumped it up double again to 136 bytes per row.  I expected there might be some overhead, but nowhere near that much.

Here are those initial insert test results from the INSERTs:

Total Storage Size (MB)Total RowsRows inserted in 2 min.Inserts / secBytes / row