
Don Xml's Grok This

The home of Don Demsak

DonXml's All Things Techie

Cache Or Session State - Similar But Different

This week at TechEd, Microsoft announced the Velocity project, a distributed in-memory object caching system, which got folks like Dare and ScottW talking about using a distributed caching solution to boost the performance of web sites. That got me thinking more about the differences between Cache and Session State. Although they seem to be the same, and caching solutions are often used for storing session data, I'm not a big fan of putting session in a cache solution (and I really hate putting session in a relational database, since there is nothing relational about the data). But before I describe my preferred solution, let's define the terms:

Cache (via Wikipedia) - a cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fetching or recomputing the original data, so that the average access time is shorter. Cache, therefore, helps expedite data access that the CPU would otherwise need to fetch from main memory.

Session (via Wikipedia) - a session is a semi-permanent interactive information exchange, also known as a dialogue, a conversation or a meeting, between two or more communicating devices, or between a computer and user (see Login session). A session is set up or established at a certain point in time, and torn down at a later point in time. An established communication session may involve more than one message in each direction. A session is typically, but not always, stateful, meaning that at least one of the communicating parts need to save information about the session history in order to be able to communicate, as opposed to stateless communication, where the communication consists of independent requests with responses.

HTTP session token (via Wikipedia) - A session token is a unique identifier (usually in the form of a hash generated by a hash function) that is generated and sent from a server to a client to identify the current interaction session. The client usually stores and sends the token as an HTTP cookie and/or sends it as a parameter in GET or POST queries. The reason to use session tokens is that the client only has to handle the identifier (a small piece of data which is otherwise meaningless and thus presents no security risk) - all session data is stored on the server (usually in a database, to which the client does not have direct access) linked to that identifier. Examples of the names that some programming languages use when naming their cookie include JSESSIONID (JSP), PHPSESSID (PHP), and ASPSESSIONID (Microsoft ASP).
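To make the difference between the two definitions concrete, here is a minimal Python sketch (not ASP.NET code). Everything in it is an assumption for illustration: load_product_from_source stands in for some expensive source of truth, the stores are plain in-process dictionaries, and the token is a random hex string rather than the hash the Wikipedia text mentions. The point is only that a cache entry can be thrown away and rebuilt, while the session entry is the sole copy of the conversation's state.

```python
import secrets

# Hypothetical stand-in for an expensive source of truth (e.g. a database).
def load_product_from_source(product_id):
    return {"id": product_id, "name": f"Product {product_id}"}

# Cache: a disposable copy of data that can always be rebuilt from the source.
product_cache = {}

def get_product(product_id):
    if product_id not in product_cache:      # cache miss: go to the source
        product_cache[product_id] = load_product_from_source(product_id)
    return product_cache[product_id]         # cache hit: reuse the stored copy

# Session: the only copy of per-conversation state, keyed by an opaque token.
session_store = {}

def start_session():
    token = secrets.token_hex(16)            # random identifier handed to the client
    session_store[token] = {}
    return token

def save_to_session(token, key, value):
    session_store[token][key] = value

def read_from_session(token, key):
    return session_store[token].get(key)
```

Clearing product_cache costs only time, because the next get_product call reloads the data. Clearing session_store loses information that exists nowhere else, which is why treating the two as the same thing makes me uneasy.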

As the Wikipedia article mentioned, session data is usually stored in a database, which IMHO is the wrong thing to do.  So, you may think that I'd prefer to use a Distributed Cache, and Velocity does just that and lists it as one of its key features:

Provides tight integration with ASP.NET to be able to cache ASP.NET session data in the cache without having to write it to source databases. It can also be used as a cache for application data to be able to cache application data across the entire Web farm.

But, IMHO, using a caching engine for session, although better than a database, is still the wrong implementation for the problem.  I've mentioned before (but never in my blog) that a messaging solution seems like a much better implementation for session data.  You see, what you are really doing when you write some data out to session in a stateless system is sending a message to a future version of yourself.  Images of the Star Trek: The Next Generation episode "Cause and Effect" come to mind.  In that episode, the Enterprise is stuck in a time loop, where it keeps getting destroyed, until Data sends a message to a future version of himself and breaks the loop.  I learned the trick of using Message Queues for Session Data back in my mainframe days, and I've found that if something scaled for the mainframe, using the same techniques on other platforms is usually the best way.  Back on the mainframe, CICS is the transaction service used in online systems, and it works in a stateless manner, very similar to the web.  To send data between each instance of a screen, one of the primary techniques is to use a Temp Storage Queue, with a queue created for each session, based on the session id.
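As a rough illustration of the "message to a future version of yourself" idea, here is a small Python sketch of one queue per session id. It is a toy, in-memory model under my own assumptions: SessionQueues and its send, receive and end_session methods are hypothetical names, not the CICS Temp Storage Queue API or anything from MSMQ.

```python
from collections import defaultdict, deque

class SessionQueues:
    """One FIFO queue per session id (in-memory, single process)."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, session_id, message):
        # The current request leaves a message for the next request.
        self._queues[session_id].append(message)

    def receive(self, session_id):
        # The next request picks up the oldest pending message, if any.
        queue = self._queues.get(session_id)
        if not queue:
            return None
        return queue.popleft()

    def end_session(self, session_id):
        # Drop the whole queue when the session is torn down.
        self._queues.pop(session_id, None)


queues = SessionQueues()

# Request 1: the stateless handler writes its state before it finishes.
queues.send("session-abc", {"screen": "checkout", "cart": [42]})

# Request 2: a later handler, possibly another instance, reads it back.
state = queues.receive("session-abc")
```

Unlike a cache lookup, receive consumes the message, so each request has to send a fresh one if the conversation is going to continue.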

I've always wanted to try to do the same thing with ASP.Net, using MSMQ as the Message Queue, but until MSMQ 4.0 (released with Vista and Win2k8 Server), it really wasn't feasible.  Creating a new queue for each ASP.Net session wasn't a simple or efficient thing to do, so I never tried it.  With MSMQ 4.0, they have added subqueues, which are implicitly created local queues that are logical partitions of a physical queue.  This way, I can create one or more message queues for an ASP.Net application, and easily have them "indexed" by a sessionid.  The downside of using MSMQ is that very few companies have a network admin staff that knows how to support MSMQ.
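Here is a toy Python model of that layout: one physical queue per application, with logical partitions that spring into existence the first time a session id is seen. It is only a sketch of the shape of the idea. PhysicalQueue, move_to_subqueues and the rest are hypothetical names, and the step that moves messages from the main queue into a partition is my assumption about how the partitions get filled, not a call from System.Messaging or the native MSMQ API.

```python
from collections import deque

class PhysicalQueue:
    """One application-level queue with implicit per-session partitions."""

    def __init__(self, name):
        self.name = name
        self.main = deque()      # messages land here first
        self.subqueues = {}      # session id -> deque, created on first use

    def send(self, session_id, body):
        # Writers only ever target the physical queue.
        self.main.append((session_id, body))

    def move_to_subqueues(self):
        # Partition pending messages by session id; returns how many moved.
        moved = 0
        while self.main:
            session_id, body = self.main.popleft()
            self.subqueues.setdefault(session_id, deque()).append(body)
            moved += 1
        return moved

    def receive(self, session_id):
        # Read the oldest message for one session, or None if there is none.
        sub = self.subqueues.get(session_id)
        return sub.popleft() if sub else None
```

The appeal is that nothing has to provision a queue when a session starts; the session id alone is enough to find the right partition.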

I always wondered why the ASP.Net team never released an MSMQ session provider, so I'm going to have a go at it and see what sort of perf gains I can get over using SQL Server Mode, or maybe even Out-of-process Mode.

The first issue I've run across is that System.Messaging wasn't updated in .Net 3.5 to take advantage of MSMQ 4.0.  Reading from a subqueue is the same as reading from a regular queue, but you can't write to a subqueue using the System.Messaging namespace.  So, I'll have to implement that myself, and I'll publish the code.

Published Friday, June 06, 2008 5:32 PM by donxml

Comments

Michael C. Neel said:

One reason for placing session into a cache system or database is dealing with a farm of web servers - I'm really hoping Velocity can help here. Being able to use Session and Application stores from within your web app, without needing to care that there are N other web servers the next user request may land on, is a big win I think. There are cases where I am truly caching data across a web farm and do so by having a master server generate XML and place copies on the other web servers - this has its own set of issues and gotchas and, again, I'm hoping Velocity can help here.
June 7, 2008 1:13 PM

Bill Bain said:

ScaleOut Software also has been delivering fully featured, scalable, highly available distributed caching for .NET (including ASP.NET sessions) since January, 2005. The key features that Microsoft has listed for release in CTP2 and V1 (and others which will not be available in V1) are available today in ScaleOut StateServer. SOSS is also self-configuring and self-healing as a fully peer-to-peer architecture. Please see our Web site's press release (http://www.scaleoutsoftware.com) for our response to the Velocity announcement. Regarding caches versus queues, given that a session only has exactly one "future" at any time, a queue per session should not be necessary. Also, when the cache APIs are used together with session-state, the best performance gains can be had. The developer can thereby inhibit unnecessary updates to the distributed cache by using session only to identify cached objects associated with the session instead of to directly hold these objects.
June 9, 2008 11:18 AM

Jay Kimble said:

To Cache, To Static, or To Session
June 10, 2008 9:12 AM

Chuck Kraatz said:

Mr. XML OK..I see your point. But a data store is a data store is a data store...If it does disk I/O it does disk I/O. What I believe you are looking for is the fastest retrieval tool. The placing of the data into the store may prove to be important at some point, but the retrieval is always important, as you know someone/user is waiting at that point in time. So Don, if you believe and prove MSMQ is a faster engine than SQL, then cool, I'd say use it. Remember the whole SQL in memory part of Cairo that never materialized? Though the SessionState (SQL) theory is probably the result of that. Don, you know me and you know I like using the right tool for the job. So whatever tool can hold lots of data in memory, fully indexed so it is highly searchable and yet still partitioned by application/usergroup/user...blah..blah..blah, I'm all for using it. Seems to me like whatever multiplayer online gaming is using for live player interactions (aka sessionstate data) should be more than sufficient for most LOB applications. Hope to see you at some point this year. Your Friend Chuck chuck_kraatz@msn.com
June 13, 2008 8:52 AM

urig - Tidbits from a .net life said:

Digging a little into MSMQ and how it can help me with a website tracking mechanism I'm working on
June 21, 2008 9:17 AM

About donxml

I’m an independent consultant, specializing in .Net solutions architecture, based out of New Jersey, who also doubles as an evangelist for XML, Domain Driven Design, enterprise architecture and .Net. I do not work for Microsoft, the W3C or any other big company that you may know of (at least not yet). I’ve been an indie for over ten years, and although I’ve been tempted a couple of times to take a job with companies like Microsoft, I haven’t found something better than my current situation. I work mostly with the large pharmaceuticals that are based here in New Jersey, and usually find myself on long term contracts. Definitely not the prototypical indie consultant, but it lets me dedicate time to my non-income generating activities like the developer community stuff, plus financing open source projects like XPathmania and MVP-XML. If you would like to talk to me about doing some contract work, just contact me via the contact page. My rates vary widely, depending on lots of different variables, but mostly distance from Jersey, and type of work. Plus, I’ve been known to donate some of my code for various projects.