Dealing with the WURFL Manager Object in ASP.NET

It is with great pleasure that we host a new contribution from our WURFL Developer Evangelist for Microsoft Technologies, Dino Esposito.

-Luca Passani, CTO @ScientiaMobile

WURFL is a cross-platform API available for a variety of platforms and languages. Any programmatic access to the data that WURFL holds passes through the WURFL manager object. In the end, using the WURFL API is quite easy; more interestingly, it requires the same approach and development pattern regardless of the language and platform you’re using. Programming with WURFL can be summarized in two steps:

Get a reference to the WURFL manager object
Ask the WURFL manager object for the capabilities associated with a given user agent string
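
The two steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the manager was already built once at application startup, that the types live in the WURFL namespace, and that the device object exposes GetCapability. The helper class WurflTwoSteps and the choice of the is_wireless_device capability are made up for the example.

```csharp
using System.Web;
using WURFL;   // assumed namespace for WURFLManagerBuilder and the manager types

public static class WurflTwoSteps   // hypothetical helper, not part of the library
{
    public static string IsWirelessDevice(HttpRequest request)
    {
        // Step 1: get a reference to the WURFL manager object.
        // Assumes WURFLManagerBuilder.Build(...) already ran at startup.
        var manager = WURFLManagerBuilder.Instance;

        // Step 2: ask the manager for the device matching the user agent,
        // then read one of its capabilities (returned as a string).
        var device = manager.GetDeviceForRequest(request.UserAgent);
        return device.GetCapability("is_wireless_device");
    }
}
```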

Behind these two simple steps, however, there are a number of pitfalls to avoid. The key points to understand are the lifetime of the WURFL manager object, its caching strategy, and the impact it may have on available memory. This post analyzes the behavior of the WURFL manager object in ASP.NET and under IIS.

What Happens on Application Startup

The WURFL library is a Device Description Repository (DDR) tool and works by serving device-specific data to requestors. The core of the library is therefore the companion database, where over 500 capabilities for over 18,000 devices are stored. Once loaded in memory, this information takes up approximately 50 megabytes of space. That is not too much for a real-world application, but it is not so small that a memory leak involving it could pass unnoticed.

The WURFL database is not made of live data. Moreover, the data that the WURFL library holds doesn’t even change frequently. If you licensed the classic WURFL API, you have access to weekly builds of the database. If you licensed the WURFL Cloud API, you are guaranteed to always have fresh data available. In both cases, the frequency of updates is on the order of days. In summary, the WURFL data can be considered static data, at least from the perspective of a web application.

This data is ideal for one-time caching at the start of the application.

In an ASP.NET application, the Application_Start method of the global.asax class is the first place where a developer can gain control over the application. If any action needs to be taken before the first request comes in, that’s the place. The initialization of the WURFL library, including the loading of the database, should take place in Application_Start. Here’s the code that creates an instance of the WURFL manager object.

var manager = WURFLManagerBuilder.Build(new ApplicationConfigurer());

To a .NET audience, using the Builder pattern just to instantiate a class may seem a bit awkward, and given the overall simplicity of the process that builds a WURFL manager, I could probably agree. On the other hand, the use of the Builder pattern here is a sign of the cross-platform nature of the API.

The builder requires a configurer object. The role of the configurer is to give the builder enough information to locate the database files. In particular, the ApplicationConfigurer class, which is part of the WURFL library, reads the path of the database files from a section in the web.config file.

  <wurfl mode="Accuracy">
    <mainFile path="~/App_Data/wurfl-latest.zip" />
  </wurfl>

If you would like to pass the path to the database file programmatically, you then use the InMemoryConfigurer object, as below:

var manager = WURFLManagerBuilder.Build(
                      new InMemoryConfigurer()
                             .MainFile(...)
                             .SetMatchMode(MatchMode.Accuracy));
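
As a hedged sketch of the programmatic route, the snippet below resolves the same virtual path used in the web.config sample to a physical path and passes it to the configurer. It assumes that MainFile accepts a physical file path and that InMemoryConfigurer and MatchMode live in a WURFL.Config namespace; the WurflStartup class is a hypothetical helper. Check both assumptions against the version of the library you have installed.

```csharp
using System.Web.Hosting;
using WURFL;          // assumed namespace for IWURFLManager and WURFLManagerBuilder
using WURFL.Config;   // assumed namespace for InMemoryConfigurer and MatchMode

public static class WurflStartup   // hypothetical helper, not part of the library
{
    public static IWURFLManager BuildFromCode()
    {
        // Map the application-relative path to a physical path on disk.
        string dataFile = HostingEnvironment.MapPath("~/App_Data/wurfl-latest.zip");

        // Same fluent calls as in the article, with the path filled in.
        var configurer = new InMemoryConfigurer()
            .MainFile(dataFile)
            .SetMatchMode(MatchMode.Accuracy);

        return WURFLManagerBuilder.Build(configurer);
    }
}
```

Calling WurflStartup.BuildFromCode() from Application_Start would then take the place of the ApplicationConfigurer-based call.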

The WURFL manager builder does just one main thing: it locates the WURFL database and loads its content into an in-memory data structure. The WURFL database is essentially an XML file. Loading it consists of reading the entire document and parsing it into its constituent pieces.

The memory data structure that ends up containing parsed WURFL data acts as an internal cache privately owned by the WURFL manager. This data structure takes up most of the run time memory required by WURFL.

In the end, once you hold a reference to the WURFL manager, you hold both the actual WURFL data and the tool to read it. WURFL data should be considered global in the context of an ASP.NET application.

Now the question is: how can I reference the single instance of the WURFL manager created at startup from any other place within my application?

The WURFL Manager is a Singleton

Internally, the WURFL Manager is built like a singleton. The instance of the WURFL manager class created (and then returned) by the builder is assigned to a public static member of the builder class named Instance. In the end, in global.asax you just call the builder:

WURFLManagerBuilder.Build(new ApplicationConfigurer());

Next, in any other place where you need to make a WURFL query you use the code below:

var deviceInfo = WURFLManagerBuilder.Instance.GetDeviceForRequest(userAgent);

Alternatively, you can save the reference to the WURFL manager that you get from the builder in your own singleton and use it throughout the application. For example, suppose you declare the following member on your HttpApplication global.asax class (named MyApp in the sample):

public class MyApp : HttpApplication {
   // Single application-wide reference to the WURFL manager
   public static IWURFLManager WurflManager;
   // ...
   protected void Application_Start()
   {
      // ...
      WurflManager = WURFLManagerBuilder.Build(new ApplicationConfigurer());
   }
}

Next, you can reference the WURFL manager like below:

var deviceInfo = MyApp.WurflManager.GetDeviceForRequest(userAgent, mode);

In this way, you are guaranteed to always read from the same piece of memory, and no duplication whatsoever occurs.

Can the WURFL Manager Be Null?

The WURFL manager is never reset internally. If you correctly initialize it at the start of the application, there’s no way for the manager to become null unless, of course, your own code has a path where the manager variable is assigned a potentially null reference.
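
If you keep your own singleton, one way to rule out accidental null assignments is to hide the field behind a small accessor. The sketch below is an assumption-laden illustration: the WurflAccess class and its members are hypothetical names, and only IWURFLManager comes from the library (assumed to live in the WURFL namespace).

```csharp
using System;
using WURFL;   // assumed namespace for IWURFLManager

public static class WurflAccess   // hypothetical wrapper, not part of the library
{
    private static IWURFLManager _manager;

    // Call once from Application_Start with the manager returned by the builder.
    public static void Initialize(IWURFLManager manager)
    {
        if (manager == null)
            throw new ArgumentNullException("manager");
        _manager = manager;
    }

    // Fails loudly instead of handing out a null reference.
    public static IWURFLManager Manager
    {
        get
        {
            if (_manager == null)
                throw new InvalidOperationException(
                    "The WURFL manager has not been initialized in Application_Start.");
            return _manager;
        }
    }
}
```

Because the field is private and Initialize rejects null, no other code path can set the manager to null after startup.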

What about ASP.NET and Caching?

The WURFL manager holds its own private cache for obvious performance reasons. If you measure it, you will find that the startup of the WURFL library usually takes a few seconds (only once, when Application_Start is invoked), while each request is served in a matter of milliseconds. The WURFL internal cache uses an LRU algorithm and automatically retrieves device information that may have been discarded.

With WURFL, it is preferable not to use the ASP.NET Cache. The WURFL manager object exists to make static data available; by design this data is guaranteed to be available from the start of the application. Put in .NET terms, the WURFL manager is not explicitly disposable.

You may think that storing the WURFL manager reference in the ASP.NET Cache would keep your application even leaner. If you do so, however, you should consider a couple of possible issues. First, at some point the manager object may be removed from the ASP.NET Cache. There are several possible reasons why a cached item may be discarded: memory pressure is one, but it could also be that you linked the cached item to an expiration policy (sliding time, absolute time, timestamp of files, changes in other cached items). As a result, you must check the manager for null before each use and, if it is null, re-initialize it through the builder. Second, re-initialization takes up another large chunk of memory with no guarantee that the previous chunk has already been released. This may lead the application pool that hosts the site to collapse. In the worst case, your application recycles, meaning that the entire ASP.NET Cache is cleared along with any global data, session state, and so forth.

When you store the WURFL manager in the ASP.NET Cache, you actually store a memory pointer and retrieve it like this:

var manager = Cache["…"] as IWURFLManager;
if (manager == null) {
    manager = WURFLManagerBuilder.Build(new ApplicationConfigurer());
}

The reference is removed from the cache and left to the garbage collector, but the memory it was referencing (the 50 MB of in-memory WURFL data) may still be there, and the total may be doubled by the second manager you build.

Consider that ASP.NET uses strict arithmetic to determine if the application is under pressure. If your application is close to the threshold then a second allocation of 50 MB may be lethal. So 50 MB may not be a huge amount in general, but it is big enough to be the straw that breaks the camel’s back.
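
The difference between the two approaches can be illustrated without the WURFL library at all. The sketch below uses a stand-in FakeManager type (hypothetical, for illustration only) that counts how many times the expensive object is built, a plain dictionary in place of the ASP.NET Cache, and a Lazy<T> holder as one possible singleton. It demonstrates the reasoning above and does not describe how WURFL works internally.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the WURFL manager: counts how many times it is built.
public sealed class FakeManager
{
    public static int BuildCount;
    public FakeManager() { BuildCount++; }
}

public static class Demo
{
    // Cache-style lookup: every eviction forces a fresh (large) build.
    public static FakeManager FromCache(IDictionary<string, FakeManager> cache)
    {
        FakeManager manager;
        if (!cache.TryGetValue("wurfl", out manager))
        {
            manager = new FakeManager();
            cache["wurfl"] = manager;
        }
        return manager;
    }

    // Singleton-style holder: built once on first use, never evicted.
    public static readonly Lazy<FakeManager> Single =
        new Lazy<FakeManager>(() => new FakeManager());

    public static void Main()
    {
        var cache = new Dictionary<string, FakeManager>();
        FromCache(cache);
        cache.Clear();                               // simulate an eviction
        FromCache(cache);
        Console.WriteLine(FakeManager.BuildCount);   // prints 2

        FakeManager.BuildCount = 0;
        var a = Single.Value;
        var b = Single.Value;
        Console.WriteLine(FakeManager.BuildCount);   // prints 1
        Console.WriteLine(ReferenceEquals(a, b));    // prints True
    }
}
```

With the cache, a single eviction is enough to trigger a second build while the first object may still be waiting for garbage collection. With the singleton there is only ever one build.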

Updating the WURFL Database

As mentioned, as a WURFL user you have the chance to update the database on a weekly basis. Just replacing the database file on your server may not be enough. Because the WURFL data is cached on application startup, a new copy of the database is not detected automatically. While this may change in a future release of the .NET API, for the time being you should restart the web application in order for the new database to be reloaded. Here are a couple of tricks to make the restart happen automatically when you copy a newer database file. First, you can place the WURFL database file under the Bin folder. By design, any change detected in the tree rooted in Bin causes the application to restart. Second, you can add a line to the web.config file to track the date on which the database was installed. Editing the web.config file also causes the application to restart. Upon restart, you have fresh data up and running.
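
For the second trick, the tracking line could look like the fragment below. This is only a sketch: the key name WurflDatabaseInstalledOn is hypothetical and the value is a placeholder, since nothing reads the setting. Its sole purpose is to give you something to edit, because saving web.config is what triggers the restart.

```xml
  <appSettings>
    <!-- Hypothetical key; update the placeholder value whenever a new database file is copied -->
    <add key="WurflDatabaseInstalledOn" value="YYYY-MM-DD" />
  </appSettings>
```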

Summary

The server machine configuration doesn’t have much impact on the use of WURFL. Because the WURFL data is loaded once in the application lifetime and only read afterwards, there is no issue when multiple threads access it concurrently. Likewise, in a web farm scenario each server ends up with its own copy of the same data as soon as the local copy of the application starts, which is simple and effective.

The bottom line is that centralizing the WURFL initialization in Application_Start and referencing the WURFL manager via a singleton—your own singleton or the built-in singleton—is the best and safest way to use the library. Enjoy!

– Dino

The Source Code is included in the release which can be downloaded here.