Don Xml's Grok This

The home of Don Demsak
Performance Statistics of Various Implementations of the Data Mapper Pattern

What started off as a quick how-to example of rehydrating business objects from data access layers (in reply to this Jay Kimble post) morphed into a whole lot more (thanks to Scott Hanselman’s DataSet post, and my thoughts on the topic).  The idea was pretty simple: I wanted to see how long it took to rehydrate a multi-dimensional object graph from data retrieved from a database, using the most common methods in .Net.  Going into this project I had some preconceived notions of what the results would be, but the actual numbers surprised me.

For those of you who are not pattern aware (yet), the process you go through when converting data to business objects is called the Data Mapper pattern.  The Data Mapper layer handles the creation of your business objects (aka your domain) from data, and those objects hopefully don’t even know that there is a database (thanks to the data access layer).  In a domain-less (aka no business object) world the mapping step is skipped entirely and datasets are used, with the business rules all tied up in the presentation tier or down in the database.  A slight advancement over the pure dataset method is strongly typed datasets, but unless you modify the generated code, you cannot add the business rules to your strongly typed datasets (at least until .Net 2.0’s partial classes).


In this project, I decided to use the DAAB v3.1 as the basis for my data access layer, since I know that a lot of people use it, and it saves me from having to explain my data access layer (more on that another time).  All I did was add my Connection String encryption project to it (to keep it a more enterprise-ready example), and add a service layer around the DAAB (to encapsulate the data access, and hide it as much as possible from the business objects).

The actual data mapping code depends on only one thing: how you expose the data to the data mapper.  There are three popular ways to pass data from the data access layer to the layers above: the DataReader, the DataSet, and the XmlReader.  Both the DataReader and the DataSet require you to write the data mapping code by hand, but with the XmlReader you can use Xml Serialization to deserialize the XML into instances of your business objects.  There is one more way to pass data up from the data access layer that I’ve been working on, and that is by using XPathNavigators, so I added this as a fourth example.  This is the implementation whose performance figures I was most interested in.
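To make the XML-based route a little more concrete, here is a minimal sketch of rehydrating objects by walking an XML document with path queries, in the spirit of the XPathNavigator approach. It is written in Python with `xml.etree.ElementTree` rather than .Net, and it is not the code from this project. The XML shape, the attribute names, the sample values, and the `Employee`, `Territory`, and `Region` classes are all assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical XML shape and illustrative values; the real stored procedure
# output may use different element and attribute names.
SAMPLE_XML = """
<Employees>
  <Employee Id="1" LastName="Davolio" FirstName="Nancy">
    <Territory Id="06897" Description="Wilton">
      <Region Id="1" Description="Eastern"/>
    </Territory>
    <Territory Id="19713" Description="Neward">
      <Region Id="1" Description="Eastern"/>
    </Territory>
  </Employee>
</Employees>
"""


@dataclass
class Region:
    id: int
    description: str


@dataclass
class Territory:
    id: str
    description: str
    region: Region


@dataclass
class Employee:
    id: int
    last_name: str
    first_name: str
    territories: list = field(default_factory=list)


def map_employees(xml_text):
    """Rehydrate Employee objects by querying the XML tree with simple paths."""
    root = ET.fromstring(xml_text)
    employees = []
    for emp_node in root.findall("./Employee"):
        emp = Employee(
            int(emp_node.get("Id")),
            emp_node.get("LastName"),
            emp_node.get("FirstName"),
        )
        for terr_node in emp_node.findall("./Territory"):
            reg_node = terr_node.find("./Region")
            region = Region(int(reg_node.get("Id")), reg_node.get("Description"))
            emp.territories.append(
                Territory(terr_node.get("Id"), terr_node.get("Description"), region)
            )
        employees.append(emp)
    return employees
```

Calling `map_employees(SAMPLE_XML)` returns one `Employee` with two `Territory` objects, each holding its `Region`. The mapping follows the nesting of the XML, which is what makes this style feel XSLT-like.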

The sample database I used was the all too familiar Northwind database on SQL Server 2000 (with all the latest service packs).  I created a cloned copy of the database and created the following stored procedure to test against.  It returns all the employees, with their associated Territories, and the Region for each Territory.  (Actually there are 2 versions of the stored proc: one “normal” proc, and one that returns an XML stream using the FOR XML AUTO clause.)

SELECT  Employee.EmployeeID AS "Employee.Id",
        Employee.LastName AS "Employee.LastName",
        Employee.FirstName AS "Employee.FirstName",
        Employee.Title AS "Employee.Title",
        Employee.BirthDate AS "Employee.BirthDate",
        Territory.TerritoryID AS "Territory.Id",
        RTRIM(Territory.TerritoryDescription) AS "Territory.Description",
        Region.RegionID AS "Region.Id",
        RTRIM(Region.RegionDescription) AS "Region.Description"
FROM dbo.Employees Employee
INNER JOIN dbo.EmployeeTerritories et
    ON Employee.EmployeeID = et.EmployeeID
INNER JOIN dbo.Territories Territory
    ON et.TerritoryID = Territory.TerritoryID
INNER JOIN dbo.Region Region
    ON Region.RegionID = Territory.RegionID
ORDER BY Employee.EmployeeID, Territory.TerritoryID, Region.RegionID;

This would give me a nice three-dimensional object graph (Employee, Territory, Region) to test the performance numbers against.  If you want to see all the code, you can pull it down from here (or you can get the code from the Mvp.Xml SourceForge Project).  Just add the stored procs to a copy of Northwind, then compile and walk through the code (until I can get some time to write this all up in a series of articles).
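For readers who have not written this kind of mapping by hand, the following is a rough sketch of what rehydrating that graph from the flat, joined rows looks like. It is in Python rather than .Net and is not the project's code. The class names and sample rows are hypothetical, and only a few of the columns are mapped. It relies on the rows arriving sorted by employee, as the ORDER BY in the query arranges.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    id: int
    description: str


@dataclass
class Territory:
    id: str
    description: str
    region: Region


@dataclass
class Employee:
    id: int
    last_name: str
    first_name: str
    territories: list = field(default_factory=list)


def map_rows(rows):
    """Build Employee -> Territory -> Region objects from flat joined rows.

    Assumes rows are dicts keyed by the column aliases from the query and
    are already sorted by "Employee.Id".
    """
    employees = []
    regions = {}    # one shared Region instance per region id
    current = None  # employee currently being filled
    for row in rows:
        if current is None or current.id != row["Employee.Id"]:
            current = Employee(
                row["Employee.Id"], row["Employee.LastName"], row["Employee.FirstName"]
            )
            employees.append(current)
        region = regions.get(row["Region.Id"])
        if region is None:
            region = Region(row["Region.Id"], row["Region.Description"])
            regions[region.id] = region
        current.territories.append(
            Territory(row["Territory.Id"], row["Territory.Description"], region)
        )
    return employees


# Illustrative rows only; not actual query output.
SAMPLE_ROWS = [
    {"Employee.Id": 1, "Employee.LastName": "Davolio", "Employee.FirstName": "Nancy",
     "Territory.Id": "06897", "Territory.Description": "Wilton",
     "Region.Id": 1, "Region.Description": "Eastern"},
    {"Employee.Id": 1, "Employee.LastName": "Davolio", "Employee.FirstName": "Nancy",
     "Territory.Id": "19713", "Territory.Description": "Neward",
     "Region.Id": 1, "Region.Description": "Eastern"},
    {"Employee.Id": 2, "Employee.LastName": "Fuller", "Employee.FirstName": "Andrew",
     "Territory.Id": "01581", "Territory.Description": "Westboro",
     "Region.Id": 1, "Region.Description": "Eastern"},
]

employees = map_rows(SAMPLE_ROWS)
```

The employee columns repeat on every row of the join, so the mapper only creates a new `Employee` when the id changes. Sharing `Region` instances through a dictionary is a design choice of this sketch, not something the post prescribes.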

Now, on to the results.  I used my Dell 8500 laptop as the test machine, with WindowsXP SP2, VS.Net 2k3, and SQL2k.  I wrote a small console app for each of the 4 test cases, and they all used the High Performance Timer code to get better time measurements.  To get more consistent results, each console app loads the business objects once untimed (to get everything compiled and into memory, including the connection and all the SQL Server side stuff), and then starts the timer and loops through the load 1000 times.
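The measurement procedure described above (one untimed warm-up load, then a timed loop of 1000 loads) can be sketched roughly as follows. This is a Python stand-in, not the console apps themselves. `time.perf_counter` plays the role of the high-resolution timer, and `load` is a hypothetical placeholder for whichever mapper is under test.

```python
import time


def time_iterations(load, iterations=1000):
    """Run `load` once untimed as a warm-up, then time `iterations` calls.

    Returns the elapsed wall-clock time in seconds for the timed loop only.
    """
    load()  # warm-up: caches, compilation, connections; not measured
    start = time.perf_counter()
    for _ in range(iterations):
        load()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder workload standing in for "load the business objects".
    def fake_load():
        return [i * i for i in range(1000)]

    elapsed = time_iterations(fake_load)
    print(f"1000 iterations took {elapsed:.4f} s")
```

Keeping the warm-up outside the timed region matters because the first call pays one-off costs that would otherwise skew a comparison between implementations.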

DataReader - 341

DataSet - 411 (18.6% slower than the DataReader)

XPathNavigator - 450 (27.7% slower than the DataReader)

XmlSerialization - 542 (46.3% slower than the DataReader)


It wasn’t really surprising that the DataReader was the fastest of the implementations, but what surprised me was how slow the XmlSerialization was, and that the XPathNavigator was slower than the DataSet.  (Oh, and if you are wondering how I calculated the percent change, see this article on why what you were taught in high school was wrong and very misleading.)  I went over my code pretty well, and although I’m sure that people will find some performance enhancements, I’m fairly confident that things were done consistently, so most enhancements will affect all the results and wash out of the final analysis (and yes, I made sure that I cached the XmlSerialization instance, so that isn’t why that implementation method is so slow).  But if you find something, definitely let me know so that I can update it.
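Since the percentages quoted with the results do not use the conventional (B - A) / A formula, here is a small sketch that computes the change both ways from the four timings. The post does not state the exact formula used. The logarithmic change 100 * ln(B / A) shown here is an assumption, chosen only because it comes out close to the quoted figures.

```python
import math


def conventional_change(a, b):
    """Percent change from a to b, using a as the divisor."""
    return 100.0 * (b - a) / a


def log_change(a, b):
    """Logarithmic percent change from a to b (assumed, not confirmed by the post)."""
    return 100.0 * math.log(b / a)


BASELINE = 341  # DataReader timing from the results above
RESULTS = [("DataSet", 411), ("XPathNavigator", 450), ("XmlSerialization", 542)]

for name, value in RESULTS:
    conv = conventional_change(BASELINE, value)
    logc = log_change(BASELINE, value)
    print(f"{name}: conventional {conv:.1f}%, logarithmic {logc:.1f}%")
```

With the conventional formula the three slowdowns work out to roughly 20.5%, 32.0%, and 58.9%. The logarithmic version gives roughly 18.7%, 27.7%, and 46.3%, which lines up with the figures in the results list apart from rounding in the last digit.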

I’m sure there are lots of other ways to implement a DataMapper in .Net.  You are welcome to clone the code, implement it, and let us know what your results were.

Published Friday, June 25, 2004 1:09 PM by donxml

Comments

jay@bk-web.com (Jay Kimble) said:

Don,

I'm a little confused here. Are you saying that, purely from a performance standpoint, the DataReader is fastest? And (strangely enough) that the data mapper using XPathNavigator is slower than the DataSet (as is the XmlSerializer)?

I can understand why the DataMapper might be better than a DataSet, but I now have a CodeSmith Template that uses the DataReader and maps data directly to my business objects.

(I guess what I'm asking is: are you saying that the original aims -- as I understand them -- to create an object that is smaller and faster than the DataSet have only partially been successful?)
June 29, 2004 5:40 AM

DonXML Demsak said:

Yes, I did think that XPathNav, with its compiled queries, would be faster than a DataSet, but I could not get that to happen. But XPathNav does make it easier (IMHO) to rehydrate objects from XML, since you can use an XSLT-like methodology.

I have not had a chance to run this code in .Net 2.0, but I'm hoping that the performance enhancements to XPathDocument will help make XPathNav faster than the DataSet.
June 29, 2004 7:18 AM

Rich said:

Hey Don,

I'm a little confused. Who cares if percent change is additive? Anyone who has even a basic understanding of numbers and division knows that they shouldn't expect two calculations of percent change to add up to the total change.

Anyway, AFAIK, percent change from A to B is still calculated as (B - A) / A. The AGM function mentioned in the article is something completely different, that achieves a completely different result from the standard percent change that everyone knows and uses. Now, to say that there's a better way to describe percent change is one thing, but to say that (B - A) / A is wrong because it's not additive is a bit extreme. That's like me saying division is wrong because (A / B) + (C / D) != (A + B) / (C + D).

Anyway, I guess at the very least you should (like you did) say that you calculated the percentage change using the AGM function, so people like me won't think you just flubbed the math.

-Rich
June 29, 2004 2:05 PM

Jiho Han said:

I don't know whether this is still valid, but I've found this performance comparison of DataReader vs. DataSet (vs. XmlReader). It's also missing the XPathNavigator approach. It's kind of dated since it's using Beta 2 (1.0? 1.1?).

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/bdadotnetarch031.asp
July 2, 2004 10:34 AM

Brad Gronek said:

I just wanted to thank you for this article. It is the best presentation of data access methods I have seen so far.

July 20, 2004 12:28 PM

TrackBack said:

Don Demsak (DonXML) has a great posting that compares the relative performance of four ways to stuff an object from the database.
July 21, 2004 10:43 AM

Mark Bonafe said:

I'm afraid I must agree with Rich on the percent change question.

Quote: "For example, a quantity rises from 100 to 200, then from 200 to 400. With the arithmetic mean as the divisor, the first change is 66.6%,..."

I'm sorry, the first change is 100%; always will be.
July 21, 2004 11:59 AM

About donxml

I’m an independent consultant, specializing in .Net solutions architecture, based out of New Jersey, who also doubles as an evangelist for XML, Domain Driven Design, enterprise architecture and .Net. I do not work for Microsoft, the W3C or any other big company that you may know of (at least not yet). I’ve been an indie for over ten years, and although I’ve been tempted a couple of times to take a job with companies like Microsoft, I haven’t found something better than my current situation. I work mostly with the large pharmaceuticals that are based here in New Jersey, and usually find myself on long term contracts. Definitely not the prototypical indie consultant, but it lets me dedicate time to my non-income generating activities like the developer community stuff, plus financing open source projects like XPathmania and MVP-XML. If you would like to talk to me about doing some contract work, just contact me via the contact page. My rates vary widely, depending on lots of different variables, but mostly distance from Jersey, and type of work. Plus, I’ve been known to donate some of my code for various projects.