Everybody Likes Ninjas

Scaling SQLAzure

Monday, 15 February 2010 08:23 by Guy

Guy

If your company is planning on using the Azure platform and SQL Azure for a high load OLTP application, you should change the way you think about application and database design.

The limitation on SQL Azure described in my former post, will force you to think differently regarding scaling your applications. Application wise, you must take into consideration that long running queries will be disconnected by the SQL Azure server and connections which are open for a long period over the internet, tend to disconnect due to network issues.

To limit the effect of these issues you should implement the following:

  • Connection polling in your application (built-in feature using .Net), will reduce the load on the database server. Connection polling also saves resources on creation and reduces the latency of creating a connection over the internet.
  • Use short transactions\batches, open the connection in the last second, execute the query commit and close the connection as soon as possible.
  • Implement retry mechanism in the events you get disconnected or unable to create the connection over the internet.

When you considering scaling options, since you don’t have any control over the hardware resources, scaling out is your only option. You have several options for scaling out, depending on your application. All of them will require a logic method in your application that will select the appropriate connection string, or in other words: your application must be partition aware.

Let’s review some of the options:

  • One Database per client- if you are a SaaS provider, you can open a database per client. This option has a few advantages: privacy and security wise, each client has its own database. Billing wise, you can bill your client just for SQL Azure recourses their database is using (see below: Master DB).
  • Configuration Database – if your database holds static information, updated once in a long period of time, you can create multiple copies of your database using Sync Framework to synchronies the changes from the primary one. You will also need to add a round robin method on your application side to select the connection string and distribute the load on the different instances.
  • Logical partitioning or Elastic provisioning– what if your application design allows that? You might be able to separate your data according to a natural key. Could be as easy as modulo 10 your client user id or opening a database per country\game\event … in your application. Of course, that too will require changes in the application connection string module logic. One company all ready implementing such a solution is www.ticketdirect.com. You can read more on Microsoft Case Studies.

I just had the privilege to join a closed Microsoft Connect session on SQL Azure . It seems that some new exiting features are on the way, which will help with scaling. Some of the performance execution limitations will be reduced and a new clone feature is being developed to help with instantly creating copies of your database for scale out usage and backup.

To enable the partitioning process, Master DB has changed considerably: it no longer sits at the center of the SQL Server and two new role gateways were added to login operation and logger of usage operations.

The login process uses two tables of sys.sql_logins for the login operation a new table sys.firewall_rules for configuring access to the “DataBase Server”. A successful login must be backed up with valid data in both tables (information about the login process can be found in my former post).

The second role as an activity logger for billing process is back up by two new tables named sys.bandwidth_usage sys.database_usage. By monitoring the values on these tables, you can calculate the operation cost of any single DB and add it to your client billing quote or monitor the operation cost of an event on your system.

Guy

SQL Azure

Monday, 25 January 2010 10:52 by Guy

Guy

Sql Azure is Microsoft’s solution for relational database cloud computing, it comes as part of Azure platform. As with the Azure platform in general, to implement such a solution many limitation were placed on the Azure version of SQL server. The implantation as technical solution is amazing and requires some IT background to figure the complexity of the solution.

The best place to start the explaining SQLAzure is with describing the infrastructure. As a user you are granted with a token, this token is used to create all your databases. First DB to set up is the Master DB, very similar the SQL Server Master DB(more on the functionality of the Master DB in the next post). All traffic to the SqlAzure “Server” is routed through a firewall that allows you to limit the accesses to specific sources using a well design web interface. Then the connection is rerouted to a gateway that diverts the connection to the current server hosting the online copy of your database. All SQL server instances that build the SQLAzure platform are installed on cluster machines to support high availability to improve availability the created database is automatically replicated across different machines . In the case of failure the gateway will automatically point to the current online database instance.

sqlazure

To increase scalability of your application and to maintain overall balances of the SQLAzure platform servers, your databases are placed across multiple servers. I.E. even though it looks like it, physically the your company databases are not placed on one server! What actually happens is that several companies share the same hardware! So how do you reach a specific database? The trick is in the login : the connection string must include the destination database, your connection undergoes authentication using the login setting saved in the Master DB and then diverted to the requested database that will probably be located on a different server.

Farther more to allow a fair resource distribution between the different databases and SQLAzure clients, several limitations are set on the databases settings and resource utilization. A detailed list of the limitation can be found in the following link, I bring the highlights:

  • The major change is that can’t manipulates local resources:
    • There is only one file group per database.

      The following technical resources are a good starting point to understand the SQLAzure platform :

    • No partitioning.
    • No fill factor.
    • No user define data types.
    • No full text indexes.
    • No memory settings or any other server level settings.
  • All tables must include a clustered index.
  • Each connection can only function in the boundaries of the database it is connected to.
    • No Trace or Profiler, due to the fact that when you connect to a specific DB you can interact with the Master DB .
    • Can’t use the USE <database> command.
    • Can’t use distributed transactions or queries.
    • Can’t use global temp tables.
    • No service broker
  • All logins are sql server logins, there is no domain support.
  • Maximum database size of 10G.
  • No SSIS or SSAS support
Connection throttling, to support fair resources distribution, is placed in the form of limitation on
  • Execution time limited to 5 min
  • Maximum open connection time
  • Limiting long running transactions and extensive resource usage CPU and memory.

These limitations will actively disconnect the client connection and STOP QUERY PROCESSING. Selecting the 10 GB (Business Edition) version over the 1 GB (Web Edition) version will grant your database with more resources and raise the limitation settings.

It is evident from the description above that SQLAzure is not a straight forward solution for high load OLTP databases or data warehouse implementation. Working tips scalability design considerations and more on the next blog.

Guy