Great tips on SEO for business blogging

The key takeaway is that you shouldn’t split your content across domains (or even subdomains like blog.domain.com). Ideally, use a directory structure such as domain.com/blog/.

Think about your blog’s directory structure and make it easy to navigate for both users and search engine robots.

http://www.airpair.com/seo/seo-focused-wordpress-infrastructure

FluentMigrator timeout when adding a new column to a large table

I recently had the requirement to add a new column to a large but not massive table with over 12 million rows. I had to support logical deletes, so I needed to add a boolean (BIT) column to that table. Arguably, I should have created the table with such a column in the first place, but hindsight is always 20-20.

My FluentMigrator script was simple:

[Migration(201407221626)]
public class LogicalDeleteMyTable : Migration
{
	public override void Up()
	{
		// add the new IsDeleted column for logical deletes
		this.Alter.Table("MyTable")
			.AddColumn("IsDeleted")
			.AsBoolean()
			.NotNullable()
			.WithDefaultValue(0);
	}

	public override void Down()
	{
		this.Delete.Column("IsDeleted").FromTable("MyTable");
	}
}

We have a number of automatic deployments for database migration using FluentMigrator.NET. The first few deployments were running against small test databases. Our test database has a bit of data in it, but nothing like the volume in production.

Luckily we had decided to pull back a copy of production to our UAT environment for this deployment. I was also working on a few anonymization and data archiving scripts, so I had needed a copy of production anyway. This turned out to be our saving grace.

As I said earlier, the production table had just over 12.5 million rows in it. When the FluentMigrator process step kicked off in Octopus Deploy, the script eventually timed out. Rather than immediately try and rework the script, I decided to up the timeout. Digging around in the FluentMigrator.NET settings wiki, I found that Paul Stovell had very smartly added a SQL command timeout override (in seconds) as a command line runner option/flag/parameter:

migrate --conn "server=.\SQLEXPRESS;uid=testfm;pwd=test;Trusted_Connection=yes;database=FluentMigrator" --provider sqlserver2008 --assembly "..\Migrations\bin\Debug\Migrations.dll" --task migrate --timeout 300

I tried a few more times whilst continually extending the timeout value, but the runner still timed out. Finally I extended the timeout to 10 minutes (600 seconds) and the script completed successfully. Wheeew!

In a future post I intend to cover ways in which you can add new columns to extremely large tables without such a performance hit.

The story of AllowRowLocks equals false. When indexes go bad.

I had a bad day yesterday. It was a combination of factors that took a total of six years to appear. This is the story of indexes gone bad. All because of a single index flag – AllowRowLocks.

For years we had a database that just worked, with a variety of applications and a large number of users connecting to it on a daily basis. Then a couple of months ago we changed the way our core application connected to the database. Boom… deadlocks, failed deletes, the pain just got worse and worse, and there was no obvious reason.

No clear exceptions were being logged in the event log, and the cause was not obvious. We were testing the same release code in a multitude of testing and staging environments, and in every case the code worked. But it didn’t work in production. WTF? The code itself was simple. It was deleting a single row in the database via ADO.NET.

I watched the web application make the request back to the server, then saw no error, then watched the record seem to miraculously re-appear. It made no sense. Why wasn’t the record being deleted? Why was there no error?

I thought I was going crazy, so I asked a colleague to do a code review with me. He thought it looked OK too, so he suggested we use SQL Profiler to see what was going on. We saw the TSQL batch go across. The delete was there, then the code retried the request 4 more times, then silently failed. What was going on? We decided to run the request ourselves manually. Interestingly, the application wasn’t issuing the plain delete we expected:

DELETE FROM myTable WHERE Id = X

Instead, it was issuing the delete with a ROWLOCK hint:

DELETE FROM myTable WITH (ROWLOCK) WHERE Id = x

Running this query directly gave us the following error:

Cannot use the ROW granularity hint on the table because locking at the specified granularity is inhibited.

That nice error (thanks Microsoft) basically means:

The WITH (ROWLOCK) query option is not compatible with ALLOWROWLOCKS=FALSE on a table index.

The fix is simple:

  1. Disable the index or change the index to enable row locks.
  2. Use page locks or table locks instead.

The general advice is that you should leave both row and page locking on unless you have a damn good reason not to, so that the SQL Server Database engine can work out its own locks. This diagram from MSDN shows the trade-off you are making when it comes to locking:

[Diagram: Why AllowRowLocks matters]

Needless to say, we had indexes that had forcibly switched row locks off. More detailed information concerning the different types of index locks can be found at SQLServer-dba.com:

Question:

What do the ALLOW_ROW_LOCKS and ALLOW_PAGE_LOCKS options mean on the CREATE INDEX statement? What is the cost/benefit of ON|OFF? Is there a performance gain?

Answer:

  1. SQL Server takes locks at different levels – such as table, extent, page, row. ALLOW_PAGE_LOCKS and ALLOW_ROW_LOCKS decide whether ROW or PAGE locks are taken.
  2. If ALLOW_PAGE_LOCKS = OFF, the lock manager will not take page locks on that index. The manager will only use row or table locks.
  3. If ALLOW_ROW_LOCKS = OFF, the lock manager will not take row locks on that index. The manager will only use page or table locks.
  4. If ALLOW_ROW_LOCKS = OFF and ALLOW_PAGE_LOCKS = OFF, locks are taken at the table level only.
  5. If ALLOW_ROW_LOCKS = ON and ALLOW_PAGE_LOCKS = ON, SQL Server decides which lock level to use according to the number of rows and the memory available.
  6. Consider these factors when deciding to change the settings. There has to be an extremely good reason, backed up by some solid testing, before you can justify changing to OFF.
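
For context, here is a minimal sketch of a CREATE INDEX statement that sets those options explicitly. It reuses the table and index names from the fix further down; the Region column is assumed purely for illustration, and both options default to ON anyway:

CREATE NONCLUSTERED INDEX IX_Customer_Region
ON dbo.Customer (Region)
WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
GO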

I found a nice bit of advice on StackOverflow from @Guffa concerning the use of WITH(ROWLOCK):

The with (rowlock) is a hint that instructs the database that it should keep locks on a row scope. That means that the database will avoid escalating locks to block or table scope. You use the hint when only a single or only a few rows will be affected by the query, to keep the lock from locking rows that will not be deleted by the query. That will let another query read unrelated rows at the same time instead of having to wait for the delete to complete. If you use it on a query that will delete a lot of rows, it may degrade the performance as the database will try to avoid escalating the locks to a larger scope, even if it would have been more efficient. The database is normally smart enough to handle the locks on its own. It might be useful if you are trying to solve a specific problem, like deadlocks.

Another blogger (Robert Virag) at SQLApprentice states in his conclusion concerning AllowRowLocks and deadlock scenarios:

In case of high concurrency (especially writers) set ALLOW_PAGE_LOCKS and ALLOW_ROW_LOCKS to ON!

So how do you fix this, and on a large table is it going to mean a lengthy index rebuild? You can use the procedure sp_indexoption to change the options on indexes, although this is due to be phased out in favour of ALTER INDEX (T-SQL) after SQL Server 2014. The ALTER INDEX syntax to set ALLOW_ROW_LOCKS looks like this:

ALTER INDEX IX_Customer_Region
ON DBO.Customer
SET
(
ALLOW_ROW_LOCKS = ON
);
GO
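
For reference, the older sp_indexoption call would look something like this (a sketch only; since the procedure is deprecated, prefer the ALTER INDEX form above):

EXEC sp_indexoption 'dbo.Customer.IX_Customer_Region', 'AllowRowLocks', TRUE;
GO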

You can also identify any other indexes that have row locks switched off (ALLOW_ROW_LOCKS = 0):

SELECT
  name,
  type_desc,
  allow_row_locks,
  allow_page_locks
FROM sys.indexes
WHERE allow_row_locks = 0 -- OR allow_page_locks = 0 -- if you want

Now armed with that we can take a look at the statistics for each specific index:

DBCC SHOW_STATISTICS(Customer, IX_Customer_Region)

Helpfully, as Thomas Stringer points out, the BOL reference states that the ALTER INDEX SET clause:

Specifies index options without rebuilding or reorganizing the index

Job done. Now repeat for each problematic AllowRowLocks index. You could write a script to do them all.
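
A sketch of such a script, building on the sys.indexes query above. It only generates the ALTER INDEX statements rather than executing them, so you can review the output before running it:

SELECT
  'ALTER INDEX ' + QUOTENAME(i.name)
  + ' ON ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name)
  + ' SET (ALLOW_ROW_LOCKS = ON);' AS FixStatement
FROM sys.indexes i
INNER JOIN sys.tables t ON t.object_id = i.object_id
WHERE i.allow_row_locks = 0
  AND i.name IS NOT NULL -- heaps have no index name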

Saving Table Space Quick And Dirty


declare @tables table (name varchar(max), ID int identity(1,1), cnt int, size int)
declare @i int, @count int, @name varchar(max), @sql varchar(max)

insert into @tables (name)
select TABLE_SCHEMA + '.' + TABLE_NAME
from INFORMATION_SCHEMA.tables
where TABLE_TYPE = 'base table'

select @count = count(*) from @tables
set @i = 1

while @i <= @count
begin
    create table #temp
    (
        name varchar(max),
        rows varchar(max),
        reserved varchar(max),
        data varchar(max),
        index_size varchar(max),
        unused varchar(max)
    )

    select @name = name from @tables where ID = @i

    insert into #temp (name, rows, reserved, data, index_size, unused)
    exec sp_spaceused @name

    update @tables
    set size = left(data, len(data) - 3),
        cnt = rows
    from #temp a
    cross join @tables b
    where b.id = @i

    drop table #temp
    set @i = @i + 1
end

select *,
    (size * 1.0) / cnt as Ratio
from @tables
where cnt > 0
order by (size * 1.0) / cnt desc

Less Than Dot – Blog – Saving Table Space Quick And Dirty.

Deploying database migrations in .NET using FluentMigrator, TeamCity and Octopus Deploy

A while back I wanted to set up database migrations on a .NET project I was working on. I had previously been using Roundhouse but, to be honest, I didn’t like it.

Too much ‘Powershell-foo’ and a reliance on the way you named your scripts, plus it didn’t work flawlessly with a group of developers and source control. And don’t even dare forget to mark your script as ‘Build Action – Content’, because the walls of Jericho come tumbling down if you don’t.

I wanted a replacement that worked for me. After a bit of research I came across FluentMigrator.net. At first I couldn’t grok it. It felt a bit like some of the stuff I’d seen in Ruby Migrations demos. I’d also used Subsonic (which also has migrations), but there were a few niggly questions I had, namely:

  1. How to work with an existing (mature) database?
  2. How to deploy the migrations to production?
  3. How to manage a dependency on .NetTiers (historical yuck)?

I posted the question on StackOverflow but it never got any love, nor any responses. Anyway, I managed to solve this problem, and this blog post documents my path through to the solution.

Most of my problems were down to a lack of understanding of how FluentMigrator.net actually works. The GitHub page outlines what FluentMigrator is quite well:

Fluent Migrator is a migration framework for .NET much like Ruby Migrations. Migrations are a structured way to alter your database schema and are an alternative to creating lots of sql scripts that have to be run manually by every developer involved. Migrations solve the problem of evolving a database schema for multiple databases (for example, the developer’s local database, the test database and the production database). Database schema changes are described in classes written in C# that can be checked into version control.

The wiki is missing a short overview of how it works though. So I’ll have a stab at outlining it here:

FluentMigrator allows developers to create up and down migration scripts using a ‘fluent’ interface in C#, which is a language most C# developers are familiar with! Most basic SQL commands, such as those that create or update schema, are supported. Examples would be creating or altering a table, adding an index, or deleting a foreign key. More complex schema and data changes are supported through embedded or inline scripts. The NuGet package includes an executable called migrate.exe, which runs against your compiled assembly. It scans the assembly for migrations, orders them by migration id, checks which ones have already been run against that database (it keeps a record in a version table in the database), and then runs each outstanding migration in turn until the database is upgraded or downgraded to the required version. Migrate.exe takes a number of command line parameters, which allow you to set things like the database connection string and the assembly to run against.
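
As an aside, you can see which migrations have already been applied to a database by querying that version table directly. Assuming the default version table metadata (a table named VersionInfo), something like this will do:

SELECT *
FROM dbo.VersionInfo
ORDER BY Version;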

I’ll pick out my questions from Stackoverflow and answer them myself one-by-one.

What is the right way to import an existing database schema?

I couldn’t find a ‘right way’, but I did find a way that worked for me! I made the following decisions:

  1. I set up my first ‘iteration’ as an empty database. The reasoning for this is that I can always migrate back down to nothing.
  2. I scripted off the entire database as a baseline, including all tables, procs, constraints, views, indexes, etc. I set up my first iteration as that baseline, choosing the CREATE option without DROP. This will be my migration up.
  3. I ran the same script dump but chose DROP only. This will be my migration down.

The baseline migration just has to use the `EmbeddedScript` method to execute the attached script (I organised the scripts into iteration folders as well).

[Tags(Environments.DEV, Environments.TIERS, Environments.CI, Environments.TEST)]
[Migration(201403061552)]
public class Baseline : Migration
{
   public override void Up()
   {
     this.Execute.EmbeddedScript("BaselineUp.sql");
   }

   public override void Down()
   {
     this.Execute.EmbeddedScript("BaselineDown.sql");
   }
}

My project now looks like this:

[Screenshot: the migrations project structure, with scripts organised into iteration folders]

For each ‘sprint’ (Agile) I create a new iteration. It helps to keep track of which migrations are to be expected in the following or preceding releases.

Baseline database solved…

How to deal with .NetTiers

Ok, this was somewhat of a challenge. I created a specific .NetTiers database which I would use to run the .NetTiers code generation. In FluentMigrator you can ‘tag’ migrations. I decided to tag based on environments. Hence I have a ‘tiers’ tag as well as tags for ‘dev’, ‘test’, ‘uat’, ‘prod’, etc. How these get run will follow later.

When making schema changes I create the migration and use the tag ‘tiers’ to focus on the .NetTiers schema change. I then run migrate.exe from the Visual Studio external tools menu, using that specific tag as a parameter. The app.config database connection that matches my machine name will be the one used, so I point it at the tiers database. Once my migrate up has run, my .NetTiers source database is ready and I can run the .NetTiers CodeSmith code generation tool to produce the new DLLs.
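
The external tools command ends up looking something like this (a sketch: the assembly path is a placeholder, and no connection switch is passed because, as mentioned above, the runner picks up the app.config connection matching the machine name):

migrate --provider sqlserver2008 --assembly "..\Migrations\bin\Debug\Database.Migrations.dll" --task migrate --tag tiers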

Note: If you are using a build server such as TeamCity then you can simply check that migration code change in to your VCS and then the build trigger can automatically run the .NetTiers build (you need CodeSmith on the build server though).

You can then replace the current .NetTiers DLLs with the new ones, either automatically if you have the build server generating them, or by hand if you run the CodeSmith generator yourself.

.NetTiers solved…

What is the right way to deploy migrations to a production environment?

I am using Octopus Deploy and, to be perfectly honest, if you are deploying .NET applications, especially to multiple servers, this should be your absolute go-to tool!

I won’t go into the details of Octopus Deploy, but at a basic level you can hook TeamCity and Octopus Deploy together. OD provides two items to get you going:

  1. A program called Octopack that wraps up your application as a NuGet package.
  2. A TeamCity plugin that makes TeamCity build the NuGet package and offer it as an artifact exposed on a NuGet feed.

Octopus Deploy then consumes that NuGet feed and can deploy those packages to the endpoint servers. Part of this deployment process is running a PreDeploy and PostDeploy PowerShell script. This is where I run the migrate.exe application with my specific tags:

function Main ()
{
   Write-Host ("Running migration " + $OctopusEnvironmentName)
   Invoke-Expression "& `"$OctopusOriginalPackageDirectoryPath\Migrate.exe`" --provider sqlserver2008 --tag $OctopusEnvironmentName --a Database.Migrations.dll"
   Write-host("Migration finished " + $OctopusEnvironmentName)
}

Notably, my `$OctopusEnvironmentName` values match my tags, so each environment deployment targets the correct database migration. You can just run the database migrations project as a step in an OD project. You simply select the `Database.Migrations` package (which is the name of my project) from the NuGet feed server.

Deployment solved…

RIP Robin Williams

Usually when celebrities die it means little to me, but today’s news concerning the death of Robin Williams was something different. As a child I watched Robin Williams with my Dad. He showed me Mork and Mindy and funny stand-up scenes, which were probably more adult-themed than I should have been watching.

I remember the brilliant films he was in, serious ones like Good Morning, Vietnam, Good Will Hunting and Dead Poets Society, and the lighter ones like Mrs Doubtfire and Patch Adams.

It is so ironic that such a funny man, that made so many people laugh, was so sad inside. I hope his family are left to grieve by themselves without the media interfering.

Thank you for the laughs Mr Williams. You were a genius.

Ben Powell is a Microsoft .NET developer providing innovative solutions to common business-to-business integration problems. He has worked on projects for companies such as Dell Computer Corp, Visteon, British Gas, BP Amoco and Aviva Plc. He originates from Wales and now lives in Germany. He finds it odd to speak about himself in the third person.