BottleIT - When you're on a good thing

Autogenerating constants from a database

by Peter Hancock 1. October 2007 20:02

I’ve frequently worked on projects that use a factory to create objects that have been persisted to a database.  The type of object is stored in an integer in the database and there is a corresponding #include file to enumerate the types of objects that can be created sorta like this…

enum Products
{
    Espresso = 1,
    Latte = 2,
    Long_Black = 3
};

and can write code like

if(productType == Products::Latte) // Ignore the switch on type, just go with the argument here…
    Product::ServeWithMilk();

The problem of course, is with our product list, we have over 30, and are adding them at a fairly regular clip. This means we need to

INSERT INTO Product VALUES (4, ‘Flat White’)

and then add to the enum Product enum as Flat_White = 4. Given the concept that we should update once only, the following script can be used to automatically generate the include file…
declare enumCursor cursor fast_forward for
 select typeId, productName
   from Product
declare @enumVal AS VARCHAR(10)
declare @enumName AS VARCHAR(50)
declare @line AS VARCHAR(80)
print ‘/** ‘
print ‘  * Product type cursor is autogenerated from sql’
print ‘  */’
print ‘enum ProductType’
print ‘{’
open enumCursor
 fetch next from enumCursor into @enumVal, @enumName
 while @@fetch_status = 0
 begin
  set @line = ‘    ‘ + replace(@enumName, ‘ ‘, ‘_’) + ‘ = ‘ + @enumVal

  fetch next from enumCursor into @enumVal, @enumName
  if(@@fetch_status = 0)
   set @line = @line + ‘,’
  print @line
 end
close enumCursor
deallocate enumCursor
print ‘};’

Save the script as a generateProducts.sql, and use osql to call it as a pre-build step…

osql.exe -E -dCoffeeApp -h-1 -oProductEnum.h -n -i  generateProducts.sql -b

This is not all that difficult to do, and keeps the database and the include files in sync.  What amazes me about this though is that in four of the seven organisations I’ve recently worked at, nobody has actually done it.  The other thing that has really surprised me has been the resistance to my changes to implement it this way.  Not from managers - who haven’t actually cared all that much, but by the senior developers in the group!  Quite remarkable really.  I guess they don’t like handing over that manual control…

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Scripting and Regenerating foreign key constraints

by Peter Hancock 19. September 2007 09:04

Quite a while ago I wrote an article on continuous database integration. It involved scripting out a database and then using the scripts each iteration to either generate from scratch, or migrate an existing database. At the time of writing, it all worked perfectly, but recently, we've run into some problems with foreign key constraints. Basically, they now require us to run the table scripts in a particular order - something I'm not that keen on doing anymore.

Initially, we decided to just do away with these constraints, as we considered that they slow performance. But - and hindsight is 20/20 - we probably shouldn't have, and we've had to do some serious data cleansing to resolve some referential integrity issues. Realistically, we were only using the performance issue as justification for deciding not to undertake the (we thought) difficult task of working out how to make the foreign key constraints work properly. Turns out, it wasn't that difficult in the end anyway.

The first thing we need to do is to script all the foreign key constraints out to a text file. I've used wsf scripts and SQLDMO, but you could use scptxfr.exe to generate the files differently - thanks Jamie! The main reason I've continued to use the wsf files instead of migrating everything across to scptxfr.exe is because I like the extra control I have using the SQLDMO objects. Or - you could just use Enterprise Manager to do it manually. Personally, at BottleIT, we tend to steer clear of Enterprise Mangler because it makes it easier to integrate command line stuff into our continuous builds.

serverName = "(local)";databaseName = "Northwind";outputPath = "c:buildNorthwindCreate ScriptsConstraints";
server = connectServer(serverName);database = getDatabase(server, databaseName);
tableCollection = new Enumerator(database.Tables);
for( ; !tableCollection.atEnd() ; 
tableCollection.moveNext())
{
	table = tableCollection.item();
	if(!table.SystemObject)
	{	
		WScript.Echo(“Processing “ + table.Name);
		var script = scriptForeignKeys(table);
		if(script.length > 10)
		{
			var file = fso.CreateTextFile(fso.BuildPath(outputPath, table.Name + “.sql”));
			file.Write(script);
			file.Close();
		}
	}
}
server.Disconnect();
 
function connectServer(serverName)
{
	server = new ActiveXObject(“SQLDMO.SQLServer”);
	server.LoginSecure = true;	server.Connect(serverName, null, null);
	return server;
}
 
function getDatabase(server, databaseName)
{
	database = new ActiveXObject(“SQLDMO.Database”);
	database = server.Databases(databaseName);
	return database;
 }
 
function scriptForeignKeys(table)
{
	var params = SQLDMOScript_DRI_ForeignKeys;
	return table.Script(params, null, null, SQLDMOScript2_NoCollation);
}

This ensures that we won't lose anything.
Now we can drop all the foreign key constraints....

DECLARE tableConstraints SCROLL CURSOR FOR
	SELECT object_name(parent_obj) as constraintTable, name as constraintName
	FROM sysobjects
	WHERE xtype = 'F'           
	    AND object_name(parent_obj) IN ( SELECT name FROM sysobjects WHERE type='U')
DECLARE @tableName AS NVARCHAR(128)
DECLARE @constraintName AS NVARCHAR(128)
OPEN tableConstraints
FETCH NEXT FROM tableConstraints INTO @tableName, @constraintName
WHILE (@@FETCH_STATUS = 0)
BEGIN
	EXECUTE ('ALTER TABLE ' + @tableName + ' DROP CONSTRAINT ' + @constraintName)
	FETCH NEXT FROM tableConstraints INTO @tableName, @constraintName
END
CLOSE tableConstraints
DEALLOCATE tableConstraints

Voila - no more constraints in the database.

Now that we don't have any constraints, we can drop and recreate our tables, and the order of creation really doesn't affect us. When we've finished, we can just iterate over the constraint files in the outputPath (c:\build\Northwind\Create Scripts\Constraints) in the above example, and our declarative referential integrity constraints magically reappear.

So the steps again...

  1. Script the existing constraints to files
  2. Drop all the constraints
  3. Perform DDL stuff
  4. Recreate constraints from the files.
The reason I'm doing it this way, is that our continuous database integration server at BottleIT does it all for us. At the start of each test suite, we rebuild the database from scratch, so we wanted to try and reduce the need to have the scripts running in a specific order, whilst keeping the requirement of having foreign keys enforced at the database level.

 

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Typing speed

by Peter Hancock 16. September 2007 07:23

I've just finished renovating my office. So, in order to celebrate the occassion it was time to catch up on news, blogs and emails. And one of the first ones that I came across was a typing test entry. Being a naturally competitive person I had to have a crack. And rather pleasantly surprised by the results.

75 WPM

It's interesting though - I love my keystrokes in the IDE, but sometimes, I'm quicker typing than ctrl-space and cursor down. I wonder if that makes me slower all up?

Have a go yourself if you've got three minutes to spare.

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

General

Continuous database integration

by Peter Hancock 8. September 2007 10:00

Introduction

The following article came about after reading reading Evolutionary Database Design by Martin Fowler and Pramod Sadalage.  The concepts were great, but I couldn't find anything on the web that really showed how to go about implementing it.  So I decided to have a go at it myself from the ground up.  Following is how I've managed to do it for the current project I'm working on.

The Environment

Developers box

  • SQL Server Instances
    • DEV
    • DEPLOY
  • osql.exe (or isql.exe)
  • Nant 0.85
  • Some form of source control (I'm using Perforce)

Build server

  • SQL Server Instances
    • DEV
    • DEPLOY (I can't afford a separate server for staging at the moment)
  • osql.exe (or isql.exe)
  • Nant 0.85
  • CruiseControl.NET
  • Some form of source control (I'm using Perforce)

Production server (ProdServer)

  • SQL Server Instances
    • LIVE

Folder/Project structure

  • CreateScripts
    • Databases
    • Tables
    • Views
    • Constraints
    • Triggers
    • StoredProcedures
    • Functions
  • MigrationScripts

Requirements

Our current project requires regular releases, and these often involve database changes.  We are using Perforce as our source control system.  CruiseControl.NET runs on our build server and is configured to fire of a NAnt build script on every code checkin.  The build requirements at checkin are...

  1. Create an empty database for unit testing.  Run all unit tests on it.
  2. Test the deployment of a new production or upgrade of an existing production database.

Implementation

The setup

As indicated in the folders section of the environment, it's important that the folders for the project are set up in the above way.  There are rules however that need to be enforced within this structure.

The CreateScripts\Databases and CreateScripts\Tables folders should be read only for developers.  This is because creating a database, or creating a table requires that these database objects don't exist in the first place.  If we have an existing database, and we CREATE DATABASE on it, or an existing table and we CREATE TABLE on it, the script will fail (or worse if we include IF EXISTS... DROP TABLE).  So instead, these scripts should be located in the migration scripts.  Any changes to existing tables (ALTER TABLE) or database (ALTER DATABASE) also belong in the MigrationScripts folders.

The remaining folders can be used quite happily to store the objects.  Destroying the objects and recreating them will not affect the underlying data.

One last area to note is that the CreateScripts\Constraints folder should contain only referential integrity constraints.  These constraints should NOT be included in the CREATE TABLE script that is placed in the MigrationScripts folder. 

The driver script

We start off first by building the overall application.build script use by CruiseControl.NET to kick off the build. 

<?xml version="1.0" encoding="UTF-8" ?><project name="Application" basedir="." default="checkinBuild">
    <!-- Target runs on every checkin -->
    <target name="checkinBuild">
        <!-- build the unit test database first -->
        <property name="database.server" value="(local)"/>
        <property name="database.instance" value="DEV"/>
        <nant buildfile="Database.build" inheritall="true" target="build" failonerror="true"/>
        <!-- And now, run the deployment -->
        <property name="database.server" value="(local)"/>
        <property name="database.instance" value="DEPLOY"/>
        <nant buildfile="Database.build" inheritall="true" target="restoreDatabase"/>
        <nant buildfile="Database.build" inheritall="true" target="deploy" failonerror="true"/>
    </target>
</project>

The default target is checkinBuild, indicating this NAnt target will be run on every code checkin.  First, we call the database.build file build target.  The build target is determined by setting the database.server and database.instance properties prior to calling the target with inheritall=true.  The second target - deploy - works the same way, except it executes either a new build, or upgrades an existing build.  Note also that we restore the database first.  This is because we don't want to run the scripts on the current database as it stands (it may have already had migration scripts already run).  Instead, we want to restore the latest available production database backup.  This way, we are testing on a real system.

The new database script

The new database script is created in the database.build NAnt script.  The sole purpose of this target is to completely destroy the existing database, and overwrite it with a brand new one, created from scratch.  The result meets requirement 1.  A blank database ready for unit testing.

<target name="build" description="Create a brand new database">
    <call target="createDatabase"/>
    <call target="createTables"/>
    <call target="createViews"/>
    <call target="migrate"/>
    <call target="createConstraints"/>
    <call target="createProcedures"/>
    <call target="createFunctions"/>
</target>

Key points to note about the above script are that we call the migrate target (described later) after the createDatabase and createTables targets.  The reason we do this, is because after creating the database and tables, the migration scripts may include alterations to tables, new tables, or removal of colums from existing tables.  These all need to be included to complete the database structure.

Delving deeper into each of the createXxxx tasks, they look like this.

<target name="createTables">
    <foreach item="File" in "CreateScripts\Tables" property="script" failonerror="false">
        <call target="osql" failonerror="true"/>
    </foreach>
</target>
<target name="createProcedures">
    <foreach item="File" in "CreateScripts\Procedures" property="script" failonerror="false">
        <call target="osql" failonerror="true"/>
    </foreach>
</target>

Each of these loops through all the scripts in the appropriate folders, and runs them using osql on the database.  (There is an osql task explained at the bottom for this)

* Astute readers may have noted that we are running scripts on the Databases and Tables folder, yet I mentioned that these should be read only for developers.  In short, these folders will be automatically populated at the end of an iteration.  I will explain further below.

The result, after running the build target, and calling all the create scripts is that we have a blank newly created database, from scripts entered by developers.

The deploy database script

The second requirement was that we be able to test the deployment of a database to an existing server.  There are two requirements here.  First, if the database doesn't already exist, then we need to create a brand new one.  If it DOES exist however, we just need to upgrade.  We use osql to connect to the database by the database.name property.  We assume that if osql fails, the database does not exist, therefore we need to create it from fresh.  If it succeeds however, we can safely assume there is already a database by that name there, and we need to upgrade it.  The selection script looks like this...

<target name="deploy">    <!-- Attempt to connect to the named database.  If the database exists, osql -b will return 0 -->    
<exec program="osql" commandline="-b -E -s ${database.connection} -d ${database.name} -Q "GO"  resultproperty="osql.result" failonerror="false"/>
    <if test="${osql.result == '0'}">
        <call target="upgrade"/>
    </if>
    <if test="${osql.result == '1'}">
        <call target="build"/>
    </if>
</target>

The upgrade script, is similar to the new script, but excludes creating the database and creating the table scripts.  (Otherwise we may potentially drop and recreate them, losing data)

<target name="upgrade" description="Upgrades an existing eventlive database">
    <call target="backupDatabase" failonerror="true"/>
       <property name="nant.onfailure" value="restoreDatabase"/>
       <call target="migrate" failonerror="true"/>
    <call target="createConstraints" failonerror="true"/>
    <call target="createProcedures" failonerror="true"/>
    <call target="createFunctions" failonerror="true"/>
</target>

The curly bits

This is where it gets a bit curly.  When migrating data or upgrading database as a series of tasks, the order that the tasks are executed is vitally important.  For example, the first task may add - say - a category table and a column to the Suppliers table.  It also adds the referential integrity constraint between suppliers and categories.  Order is important here.  First - we can't create the constraint until both table and column are added.  This is why the migration is run before the constraints.  If a later data migration task then wishes to add new category to the table, we need to ensure that the create table script has been run first.  In this case, the order within the MigrationScripts folder is important.

There are a number of ways to do this.

  1. Manually
  2. DateCreated
  3. DataModified

I use date created by performing a dir /b /od /tc *.sql > order.txt, and then iterating through the resulting text file and running the scripts in that order.  Alternatively, you could remove the dir that I execute and generate order.txt anyway you like.  The migrate target, whilst similar to the createXxx targets, doesn't iterate the directory though - rather, the order.txt file.

<target name="migrate" description="Run the migration scripts on the database">
    <exec program="${solution.dir}\tools\listFiles.bat" commandline="*.sql order.txt" workingdir="Migration Scripts" failonerror="false"/>
            <foreach item="Line" in="Migration Scripts\order.txt" property="script">
        <property name="script" value="Migration Scripts\${script}"/>
        <call target="osql" failonerror="true"/>
    </foreach>
</target>

Summary so far...

The above scripts are called for every checkin within the iteration.  Every code change that affects the data model, every data migration, data insertion or cleanse always gets put into the MigrationScripts directory.  Each time it is checked in, the NAnt build rebuilds the database from scratch.  It also takes the latest backup of the production database, and restores it to the DEPLOY instance - before finally running the deploy target on this database to test the deployment.

End of Iteration

So far, everything has been going along nicely, checkins have been rebuilding that database, and testing the migration scripts executing on the staging database.  Everything works as it should, and everybody is happy.  All we need to do now is finish off the iteration.  First, we deploy to the production server.  We know this works because we tested it by restoring from production and test running the scripts on the last build script.  Deployment is simple...

Nant.exe -buildfile:ApplicationDir\Database.build deploy -D:database.server=ProdServer -D:database.instance=LIVE

The last thing we need to do is cleanup after the iteration.  This involves archiving the migration scripts - we don't want to be rerunning them again.  The problem is though - if we archive the migration scripts and they contain create table scripts, we need to be able to run them again to create the new database for unit testing in the next iteration.  So what we do instead, is use upgraded database (on DEPLOY) to automatically generate the create database and create table scripts for us.  These scripts are then placed in the CreateScripts\Databases and CreateScripts\Tables folders - by the build server not by a developer.  Once these scripts are generated, we can move the MigrationScript items into an archive folder.  The nant targets look like this.

<!-- Script database takes an existing created database and scripts the tables and database    -->
<!-- creation routine for later checking back in to source control          -->
<!-- Note that this should ONLY be called when delivering on an iteration -->
<target name="scriptDatabase">    <exec program="cscript.exe" verbose="true">
       <arg line="${solution.dir}/Tools/scriptDatabase.wsf //Nologo //B"/>
       <arg value="//job:database"/>
       <arg value="${database.connection}"/>
       <arg value="${database.name}"/>
       <arg value="Create Scripts/Databases"/>
   </exec>
   <exec program="cscript.exe" verbose="true">
       <arg line="${solution.dir}/Tools/scriptDatabase.wsf //Nologo //B"/>
       <arg value="//job:tables"/>
       <arg value="${database.connection}"/>
       <arg value="${database.name}"/>
       <arg value="Create Scripts/Tables"/>
   </exec>
 </target>
 <target name="archiveScripts" description="Archive the database scripts" depends="scriptDatabase">
    <!-- Now we need to archive the migration scripts -->
    <tstamp property="archive.dir" pattern="yyyyMMdd"/>
    <move todir="Archive\${archive.dir}">
        <fileset basedir="Migration Scripts">
            <include name="*"/>
        </fileset>
    </move>
</target> 

Note - we are calling out to a script host program that generates the table and database scripts via SQLDMO - briefly...

   var objArgs, serverName, databaseName
   objArgs = WScript.Arguments
   if(objArgs.Length != 3)
   {
	WScript.Arguments.ShowUsage();
	WScript.Quit(1);
   }
   serverName = objArgs(0);
   databaseName = objArgs(1);
   outputPath = objArgs(2);
   var fso = new ActiveXObject("Scripting.FileSystemObject");
   if(!fso.FolderExists(outputPath))
   {
	WScript.Echo("Folder doesn't exist");
	WScript.Quit(1);
   }      
   var sqlServer, sqlDB, tableCollection, table, e;     
   server = new ActiveXObject("SQLDMO.SQLServer");
   server.LoginSecure = true;
   server.Connect(serverName, null, null);
   database = new ActiveXObject("SQLDMO.Database");
   database = server.Databases(databaseName);
      /* DMO Constants */
      var SQLDMOScript_Drops = 1;
      var SQLDMOScript_IncludeIfNotExists = 4096;
      var SQLDMOScript_IncludeHeaders = 131072;
      var SQLDMOScript_Default = 4;
      var SQLDMOScript2_NoCollation = 8388608;
      var SQLDMOScript_DRI_PrimaryKey = 268435456;
      var SQLDMOScript_DRI_Checks = 16777216;
      var SQLDMOScript_DRI_Defaults = 33554432;
      var SQLDMOScript_DRI_UniqueKeys = 67108864;
      var SQLDMOScript_ToFileOnly = 64; 
      tableCollection = new Enumerator(database.Tables);
      for( ; !tableCollection.atEnd() ; tableCollection.moveNext())
     {
	table = tableCollection.item();    
	if(!table.SystemObject)    
	{     
		var params = 	SQLDMOScript_Drops | SQLDMOScript_IncludeIfNotExists |  
                                                    	SQLDMOScript_IncludeHeaders |  QLDMOScript_Default |  SQLDMOScript_DRI_PrimaryKey |  
				SQLDMOScript_DRI_Checks | SQLDMOScript_DRI_Defaults | 
				SQLDMOScript_DRI_UniqueKeys | SQLDMOScript_ToFileOnly;
		table.Script(params, fso.BuildPath(outputPath, table.Name + ".sql"), null, SQLDMOScript2_NoCollation);          
		WScript.Echo("Processing " + table.Name);
	}
   }
   server.DisConnect();

We have a complete set of CreateScripts in their respective folders, the archive target is called which creates a new directory with the date set.  The MigrationScript files are then moved to this directory.  (And later checked in to source control)  The iteration is closed.

Conclusion

It's not that difficult to get it running.  The advantages of spending the effort are tremendous though.  In previous companies I've worked at, the running of scripts has been done by the DBA, and based on a weeks work getting everyone together to rollback failed ones, update the ones that depended on the failed ones, and generally just make sure it works.  It's prone to transcription errors, permission errors, script errors and inconsistencies.  By testing the deployment each time, as well as unit testing, we can reduce this.  One last thing worthy of note though is that just because it's easier to change, doesn't mean that developers can hack tables.  Thought still needs to be put into the design of the database.

Notes:

  • I've ignored the backupDatabase and restoreDatabase targets - these are left as the dreaded exercise for the reader.
  • I've used .wsf files for scripting the tables and database creation because it's fast enough and it works for me.  If speed is an issue, vb, C#, C++, java or whatever you want can be used instead.
  • I'm assuming that the production database backups are able to be transferred to the build server.  In some environments, production data is sensitive or bound by privacy laws.  In these cases, a test database could be used instead.  It's important though that it be of the same structure as the end of the previous iteration.
  • Any database refactoring scripts or data cleanse / migration scripts that are stored in the MigrationScripts folder should have SQL verifying the values at the start and finish of the script to ensure that it migrated correctly.  Failures should throw errors forcing a build failure.  RAISEERROR could be appropriate.

Currently rated 5.0 by 3 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Loosely couple your tests to your implementation

by Peter Hancock 29. August 2007 08:58

One of the most common set of tests that I see is the upfront naming of one fixture for one class.  The difficulty occurs when later on the classes become refactored into other classes and you're left with fixtures named after classes that no longer exist.  Perhaps more insidious though is that by having this coupling, you increase the amount of inertia that must be overcome in order to refactor a class.  In order to maintain a perceived consistency, you have to rename the test fixtures.  Even worse, many current source code implementations refuse to play happily with renaming of files, so the refactoring tools are less effective, and the resistance to change increases again.  By decoupling the tests from the implementation, the tests can stand on their own, and the implementation is free to be refactored as needed.

Consider a very simple implementation of a Blog Engine.  The story is as follows...
A blog consists of a number of entries.  Each entry must have a title and content.  Each entry gets a date created which refers to when the entry was initially drafted, and the date posted which indicates when the entry was published.  An entry can be created and added to the blog without being posted - a draft.  An entry can be created and immediately posted.  An draft can be posted at a later date.


Take 1
A fairly typical implementation would consist of the following fixtures...  (after some refactoring)

TestBase.cs

    /// <summary>
    /// Contains common constants and objects for testing blog and entry classes
    /// </summary>
    public class TestBase
    {
        protected Entry _entry;
        protected const string _testTitle = "Test Title";
        protected const string _testContent = "Test Content";
        protected void SetUp()
        {
            _entry = new Entry();
        }
    } 

EntryFixture.cs

    /// <summary>
    /// Tests to ensure Entrys are valid and have the correct defaults
    /// </summary>
    /// 
   [TestFixture]
    public class EntryFixture : TestBase
    {
        [SetUp]
        public void SetUpEntryFixture()
        {
            SetUp();
        }
        [Test]
        public void CanGetAndSetProperties()
        {
            _entry.Title = _testTitle;
            _entry.Content = _testContent;
            Assert.AreEqual(_testTitle, _entry.Title);
            Assert.AreEqual(_testContent, _entry.Content);
        }
        [Test]
        public void EntryCreatedGetsCreatedDate()
        {
            Assert.AreEqual(DateTime.Today, _entry.Created);
        }
        [Test]
        public void ValidEntryHasTitleAndContent()
        {
            _entry.Title = _testTitle;
            _entry.Content = _testContent;
            Assert.IsTrue(_entry.IsValid);
        }
        [Test]
        public void EntryWithoutTitleIsInvalid()
        {
            _entry.Content = _testContent;
            Assert.IsFalse(_entry.IsValid);
        }
        [Test]
        public void EntryWithoutContentIsInvalid()
        {
            _entry.Title = _testTitle;
            Assert.IsFalse(_entry.IsValid);
        }
    }

BlogFixture.cs

    /// <summary>
    /// Provides tests around the behaviour of the blog.
    /// </summary>
    [TestFixture]
    public class BlogFixture : TestBase
    {
        private Blog _blog;
        [SetUp]
        public void SetUpBlogFixture()
        {
            SetUp();
            _blog = new Blog();
        }
        [Test]
        public void PostingEntryProvidesPostedDate()
        {
            _entry.Title = _testTitle;
            _entry.Content = _testContent;
            _blog.Post(_entry);
            Assert.AreEqual(DateTime.Today, _entry.Posted );
        }
        [Test]
        public void PostingEntryIncreasesBlogEntryCount()
        {
            _entry.Title = _testTitle;
            _entry.Content = _testContent;
            _blog.Post(_entry);
            Assert.AreEqual(1, _blog.Count);
        }
        [Test]
        [ExpectedException(typeof(ArgumentException))]
        public void AnInvalidBlogCannotBePosted()
        {
            Entry entry = new Entry();
            _blog.Post(entry);
        }
    }

This leads to two classes - Entry and Blog.  This makes sense of course, and the implementation is quite simple and neat.  In short, this path of TDD leads to a successful implementation of Blog and Entry, I can post, get the dates, and have rudimentry validation on my blog.

The downside though is that the reader / business analyst / new developer, that picks up these tests has an increased inertia in changing them.  The very name of the test fixtures themselves forces an almost subliminal desire to maintain the current Blog / Entry structure, reducing the flexibility and creativity of the developer.

Take 2.

Same story, but remove any sort of artificial constructs and just add the tests one after the other refactoring mercilessly.  I started with the simplest thing I could think of that provided some behaviour...

/// 
/// Tests.  Note that there is currently no naming scheme - we leave that for refactoring to find...
/// [TestFixture]
public class Fixture
{
    [Test]
    public void PostSingleItemIncreasesCount()
    {
        Blog.Post("Test Title", "Test Content");
        Assert.AreEqual(1, Blog.Count);
    }
}

After the addition of the second test, which is to assert that the PostedDate is attached to the Blog, the following information arrises - one - we need an Entry class with which we can populate the blogs Posted with, and two, we can refactor out the setup of both tests into a setup method and give the fixture a readable name.  This covers of the requirements of "number of entries" and "contains a posted date" from the story.

///
/// The tests are starting to flesh out - we can now rename things - for instance, the 
/// Fixture has now become SuccessfulPosting, and we've extracted the requirements for a successful posting into the setup. 
///
[TestFixture]
public class SuccessfulPosting
{    
private int _index = 0;
    private Blog _blog;
    [SetUp]
    public void SetUp()
    {
        _blog = new Blog();
        _index = _blog.Post("Test Title", "Test Content");
    }
    [Test]
    public void PostSingleItemIncreasesCount()
    {
        Assert.AreEqual(1, _blog.Count);
    }
    [Test]
    public void PostSingleItemProvidesPostedDate()
    {
        Assert.AreEqual(DateTime.Today, ((Entry)_blog.Entries[_index]).Posted);
    }
}

Focussing next on invalid entries leads to the extraction of a simpler setup super class which contains methods to instantiate a blog and provide clean entry classes for testing succesful postings, and invalid posting data.

///
/// Manage overall setups for blog test
///
public class TestBase
{
    protected Entry _cleanEntry;
    protected Blog _blog;
    protected void Prepare()
    {
        _cleanEntry = GetCleanEntry();
        _blog = GetTheBlog();
    }    
    public static Blog GetTheBlog()
    {
        return new Blog();
    }
    public static Entry GetCleanEntry()
    {
        return new Entry();
    }
}
/// 
/// Test cases focussing on the normal flow of operations and the successful outcome
/// 
[TestFixture]
public class SuccessfulPosting : TestBase
{
    private const string testTitle = "Test Title";
    private const string testContent = "Test Content";
    protected Entry _validEntry;
    [SetUp]
    public void SetUp()
    {
        Prepare();
        _validEntry = GetValidEntry();
        _blog.Post(_validEntry);
    }
    [Test]
    public void IsContainedInBlog()
    {
        Assert.Contains(_validEntry, _blog.Entries);
    }
    [Test]
    public void PopulatesDatePosted()
    {
        Assert.AreEqual(DateTime.Today.Date, ((Entry) _blog.Entries[0]).PostedDate);
    }
    public static Entry GetValidEntry()
    {
        Entry entry = GetCleanEntry();
        entry.Title = testTitle;
        entry.Content = testContent;
        return entry;
    }
}
/// 
/// Tests to ensure that only valid data gets posted (Alternative flows)
/// 
[TestFixture]
public class PostingValidation : TestBase
{
    [SetUp]
    public void SetUp()
    {
        Prepare();
    }
    [Test]
    [ExpectedException(typeof(ArgumentException))]
    public void FailsIfTitleNotPopulated()
    {
        _cleanEntry.Content = _testContent;
        _blog.Post(_cleanEntry);
    }
    [Test]
    [ExpectedException(typeof(ArgumentException))]
    public void FailsIfContentNotPopulated()
    {
        _cleanEntry.Title = _testTitle;
        _blog.Post(_cleanEntry);
    }
}

Note that this is now inherently more readable.  Posting validation rules have been moved and renamed into a group, as have the rules around successful postings.  The last thing two things to deal with are the default values on creating an entry, and the addition of draft entries to the blog without posting.

/// 
/// Ensure that entries are created with default values
/// 
[TestFixture]
public class EntryDefault : TestBase
{
    [SetUp]
    public void SetUp()
    {
        Prepare();
    }
    [Test]
    public void CreatedDateIsToday()
    {
        Assert.AreEqual(DateTime.Today.Date, _cleanEntry.CreatedDate.Date);
    }
}

And finally, the addition of entries as drafts...

///
/// Ensure that draft entries can be persisted
/// 
[TestFixture]
public class DraftAddition : TestBase
{
    [SetUp]
    public void SetUp()
    {
        Prepare();
        _blog.Add(_cleanEntry);
    }
    [Test]
    public void IncrementsBlogCount()
    {
        Assert.AreEqual(1, _blog.Entries.Count);
    }
    [Test]
    public void IsContainedInBlog()
    {
        Assert.Contains(_cleanEntry, _blog.Entries);
    }
}

This now reads almost like a set of business rules...

  • DraftAddition.IncrementsBlogCount
  • DraftAddition.IsContainedInBlog
  • EntryDefault.CreatedDateIsToday
  • PostingValidation.FailsIfTitleNotPopulated
  • PostingValidation.FailsIfContentNotPopulated
  • SuccessfulPosting.IsContainedInBlog
  • SuccessfulPosting.PopulatesDatePosted
  • SuccessfulPosting.IncrementsBlogCount

The best thing though is that none of the rules or tests are constraining the implementation.  The rules stand on their own.  Any changes to the implementation through refactoring tools will change the tests, which is a good thing, but they aren't artificially constrained by the tests.

In short - by removing a preconceived structure from the test fixtures you allow a more organic growth of the code as it adapts to new business rules and constraints.  Refactoring mercilessly leads to removed duplication, and frequently, the promotion of "TestHelpers" that create valid objects into product code "Factory" objects.  The readability of the tests is enhanced, and even non-developers are able to read them.  Finally, the tests are less brittle as you are no longer focussing on forcing the code to fit the test structure, but rather, focussing on how the code solves the behavioural requirements of the tests.

Currently rated 5.0 by 7 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Design Patterns and Wine Tasting

by Peter Hancock 25. August 2007 08:36

These two go together surprisingly well. Apart from the fun inherent in drinking wine, these two fairly distinct areas share a common thread. Vocabulary. It’s difficult to truly taste a wine until you know some of the flavours. It’s taken a few years, but I’ve learnt what oak, truffles, cigar box, cedar, tobacco, tannins etc. mean - to me. They also tend to mean fairly similar things to other people. It’s good, because I know now what flavours I like. I can use these words to help me define a wine, and from that, communicate to others the wines that I enjoy. Design patterns are similar. A vocabulary which can be used to communicate how an area of software actually works. It’s also a way of saying what types of patterns complement each other and what don’t.

I wrote a threadpool once which used the command pattern to encapsulate the process to be threaded. That object was submitted to the singleton threadpool queue, which was protected by the double checked locking idiom. The singleton used a facade and an adapter to start up threads and execute the command before putting the threads back into a wait state. Now try explaining that to somebody who doesn’t understand the patterns involved. The beauty of patterns is that they can sum up the way a group of objects relate into only a few words. I remember when Taka code reviewed my original threadpool, he lent me The Book on patterns. I was surprised how many I had actually implemented - without even thinking about it.

The problem that I see with patterns though, is that many of the “architects” I’ve dealt with try and tell developers to “use such-and-such” a pattern. Patterns help us communicate ideas, but unlike libraries, or ActiveX controls, or whatever other pluggable compileable unit you choose to deal with, patterns just don’t cut it as plug and play building blocks. We should be highlighting that patterns help us develop by helping us communicate.

The wine maker doesn’t sit down and say, “OK, I’m going to make a wine that has strong earthy truffle notes with just a hint of oak, and some strong tannins”, and then proceed to make that. Instead, the wine maker takes the raw materials, the knowledge of the grapes, the climate history over the past however many months - adds a touch of mystique, a hint of art, and LOT of experience - and produces a wine that should be a winner. The knowledge of the flavour vocabulary, and the experience to use it, allows the wine maker to fine tune the wine, to create a structure that is beautiful on the palate, and, depending on the grapes and market, a wine that should be drunk now, or will age gracefully.

A software developer is the same. You can teach the words, and then use the words to speed up the education, but just the act of knowing patterns doesn’t make a great developer. It’s knowing how to see the patterns, identify the weaknesses, highlight the strengths, and blend the patterns together to create a fine bottle of software. The patterns, like the flavours, will emerge. And, unlike wine, we can influence these patterns by continually tuning and refactoring the software throughout its entire life.

Hmmm… maybe another glass of Pinot is in order after that.

Currently rated 4.5 by 2 people

  • Currently 4.5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

New look!

by Peter Hancock 24. August 2007 23:40

Well, it's taken long enough, but finally, BottleIT has got around to producing its new look website.  BottleIT is about bottling the good things in life, be it wine, sport, travel, good company or good food.  And what better thing for inspiration than seven weeks sabbatical in France.  So, during the twenty two odd hours on the flight home from Dubai, I was able to put some time aside whilst not working for our customers, and prepare this.

Now the next thing is to start migrating my old blog across.

Currently rated 5.0 by 1 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

General

Use a computed or calculated column in SQL Server

by Peter Hancock 13. November 2005 00:52

We have a simple contact table CONTACT - FirstName - LastName - DateOfBirth It’s wrapped as a C# class. (with appropriate constructors and other logic to instantiate from the database)

class Contact
{
	public string FirstName;
	public string LastName;
	public DateTime DateOfBirth;
	...
}

We want to list all the contacts who are over 18. So we do something like…

   foreach(Contact currentContact in rangeOfContacts)
      if((DateDiff(DateInterval.Month, currentContact.DateOfBirth, DateTime.Now) / 12) > 18)
         Console.WriteLine("{0}", currentContact.LastName);

Basically, we’re pulling the date of birth from the database, and then using our own code to calculate the age. Of course we could wrap the age up as a property of Contact, and we could even extract the year difference calculation method into a utility class in case we need to do other whole year based methods. I’ve seen this code a lot in my time. But why not let the database do it? Create a calculated column on the table…

CREATE TABLE Contact(
	FirstName		VARCHAR(20),
	LastName		VARCHAR(20),
	DateOfBirth 	DATETIME,
	Age 		AS Cast(DateDiff("mm", DateOfBirth, GetDate())/12 AS INTEGER)
)

It’s pretty simple, and it means everyone uses the same method. Yes - it does bend normalization rules, but I think it increases readability far more. Plenty of other applications for it also….

CREATE TABLE ExpiryThingy
(	
	ExpiryDate DATETIME,
	Expired AS (CASE
	                 WHEN (ExpiryDate < getDate()) THEN '** Expired **'
	                 ELSE '** Valid **'
	            END)
)

Go to town!

Currently rated 5.0 by 5 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Agile teams and rotating developers

by Peter Hancock 12. February 2005 23:07
I've been kicking this idea around with a guy from work for about 6 months.

I'm currently working in a team that has an extremely good development environment. I was given a config spec for Clearcase, and was able to use this to immediately download the entire sourcecode base, the tools, and the utilities to build the software. All that was required was my standard development PC, and the support guys to "push out" a particular control package. It worked wonders. Download and compile with the application working first time in under 3 hours. Unfortunately, our project is missing automated build scripts, unit testing, code coverage and analysis and documentation tools.

Then - I looked at another project in the same company, with developers with skills similar to mine. The installation and development was what I would loosely call a mess. Not that they weren't using the right tools, or tools different to me, or anything like that, just that you couldn't get everything from source control, and get a "standard" development machine to run it. Instead, you had to download certain tools from the net, set up specific configuration directories, class paths, environment variables and testing configurations. This project however, has great unit testing and code coverage tools, excellent documentation via a wiki page, and automated build scripts on check-in. Meanwhile, they considered our testing environment and methods a mess.

The difference is that the skill sets of the developers are in different areas. The focus in our team has been more on packaging and consistency, whilst the focus in the other team has been more on making things more automated, and unit testing for developers. There is a massive opportunity here. Why not rotate one developer out of this team, into our team, and vice versa? In fact, we have other projects also, that could benefit from this. I'm sure that there are things that both the teams I've been involved in could learn a lot from other projects, and other projects could learn a lot from us.

This all lead to the idea of rotating developers between projects, on a regular basis. I initially thought that Paul and I might be a little insane to suggest this, but after reading Martin Fowlers take, I figured maybe I'm not barking mad after all.

The idea basically goes by implementing a queue. Each project generally has around 5 developers in it. First in, first out. At the end of each month, the developer who has been in the project the longest gets rotated into the next project. In larger projects, developers spend longer periods, supports the idea that larger projects generally have a larger code base.

Advantages

  1. Seeding of ideas as new developers come in
  2. Homogeneous use of tools and techniques as developers rotate between projects
  3. Freshness. Developers who aren't enjoying a particular project know it is a finite sentence
  4. Learning. Developers are continually learning about the business environment on a macro scale. They see the business as a whole, not as "their" project.
  5. Mentoring. New ideas and techniques bought in need to be taught to existing developers.
  6. Reduces key person risk by providing a broader range of developers familiar with the project.

Disadvantages

  1. Losing a good developer to another team
  2. Time lost in introducing a new developer
  3. Cost of moving developer
  4. Increased chance of middle managers perceiving developers as swappable resources. Thanks Steve - comment below

Of the disadvantages, I think the first one is the least relevant. The point is that in a large organisation, the project is below the entire company. It is a shortsighted view if we consider that a good developer can't provide benefit to the entire organisation by being rotated around. The corollary to this is that a "bad" developer will also be rotated out. But if an organisation has no bad developers, then does it matter if a good developer gets rotated out and another good developer gets rotated in? Another area however that DOES need to be addressed is that with the rotation out of any developer there is a corresponding loss in project specific business knowledge. For example, rotating from a share trading system into a CRM system will involve the loss of that particular developer. There are two ways I think that this can be addressed. The first is to ensure that there are enough quality business analysts that remain in the team and they contain the specific business knowledge. The second way is to ensure that adequate documentation is provided for in the code. Tools such as Doxygen and NDoc go a long way towards assisting this.

Time lost in introducing a new developer will certainly be an issue in the short term. New developers will not only need to learn a probable new environment and method of doing things, but also have to come to grips with the business understanding. The first issue will be mitigated as ideas and techniques start to become more homogeneous throughout the organisation, leaving the developer with the requirement of learning the project specific business rules. Again, the creation of adequate documentation in the code will go a long way towards mitigating this.

Finally, the last disadvantage really can't be helped. There is often a cost involved in moving a developer around. The cost of moving equipment, chairs and the plethora of books, marketing toys and junk that accumulates over a developers life in their current location, is a cost that can't be ignored. Whilst each developer might move only every 3-7 months (depending on team size), there will always be some movement. I'm not really sure how to mitigate this, given agile requirements of putting developers close to users, analysts and testers where possible. This is something that the individual organisation needs to weigh up.

I really think this is doable, and Martin Fowler suggests that Thoughtworks actively do it already. An interesting thought. There is one thing that I would like to mention though - and that is that I am NOT assuming that developers are interchangeable resources. I'm proposing we rotate them because they're NOT interchangeable. There are a LOT of benefits that can be gained from differences in development skills.

Currently rated 5.0 by 1 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Code and comments

by Peter Hancock 11. February 2005 10:00

Going through an existing C++ code base at the moment.  I have to say that a large proportion of it is pretty good.  Yet there are very VERY few comments.  I find I'm not missing them for the most part.  But then I come across some really strange code, that WAS commented.  Unfortunately, it was some of those useful comments, you know - along the lines of

/** This method returns the name of the user logged in**/ 

Now that's great - I can see that Session::getLoggedInUser() does just that.  But WHY does it do it in the way it does?  It seems a very convoluted method of querying a database and comparing roles and generally mixing things up.  So I can see WHAT the code is doing - good, self documenting code - but I can't see WHY the author did it in this particular way.  For me - the why is always important.  I can determine from the why whether the user tried the standard library method of doing it and found problems that forced them to roll their own, or I can see that the author didn't know that there was a library function that would do it, or I can see that it has to be convoluted to achieve all the hidden side affects that this otherwise simple method may be hiding.

So now, my options are

  1. Replace the current method with a library method
  2. Try and debug the current convoluted method

Now - if the previous comment had of been /** This method returns the name of the user logged by using a custom method because of a memory leak in the library method **/ then I'd know not to go down that path.

So - when commenting code, tell me why you're doing the things you're doing.  Tell me why you made the design choices you did, why inheritance instead of aggregation?  Why are you throwing exceptions by pointer?  Otherwise, I'll either assume you knew what you were doing, and continue doing it the same way when I shouldn't, or I might assume you didn't know what you were doing, and go and correct it  - when I shouldn't.

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Software development

Recent posts

Recent comments