An interview with Sriram Sridharan, Director of Mobile Data Analysis, ScientiaMobile
November 18, 2022
The User-Agent freeze has started on desktop Chrome and is expected to reach Android mobile devices during Q1 2023. To explain what is going on and to provide next steps to WURFL users, we have interviewed Sriram Sridharan – we call him Sri – Director of Mobile Data Analysis at ScientiaMobile.
Sri, can you tell us about how we got to UA Client Hints and summarize what has happened so far?
User-Agent Client Hints were introduced by the Chromium team in 2020. The stated goal was to move away from having the device information be completely visible in the User-Agent string. Instead, certain Client Hint HTTP headers would disclose that information as a result of negotiation between the browser and the server.
We’ve followed the progress of these closely. We have published a number of blog posts and technical resources that talk about the impact of this change and the ways to take advantage of it for our customers.
WURFL products have supported User-Agent Client Hints for quite a while now [Editor’s note: starting with WURFL API 1.12.5.0]. We’ve made detecting devices with User-Agent Client Hints as seamless as, and consistent with, the process our users have enjoyed all this time, when detection was based solely on the traditional User-Agent string.
All you need to do to use a WURFL API is feed it the entire HTTP request and let WURFL take care of the rest. WURFL will automatically analyze and consolidate the various bits and pieces of information in the HTTP headers, detect the device and return the appropriate device properties (AKA device capabilities in WURFL-speak).
There’s quite a bit of magic that happens under the hood. For example, WURFL first finds out which pieces of information in the HTTP request are trustworthy:
- Is the User-Agent frozen? To what extent?
- Are there User-Agent Client Hint headers present in the HTTP request and are they “low entropy” or “high entropy”?
- Is there any information that’s missing? Can we backfill what’s missing with data from elsewhere?
- Are there any other header fields that we can use to supplement or enhance the device detection quality?
- How can we combine all the relevant bits of information into one package while discarding unnecessary or misleading information or tokens from the supplied HTTP headers?
Quite a lot of work, as you can see.
If people look at the logs of their HTTP traffic today, they’ll still see User-Agent strings. Can’t they keep using that?
Well, no! That’s the key issue. This is a question we’ve heard from multiple WURFL customers, but what many are not quite realizing yet is that HTTP is changing and this is the end of the world as we knew it, sort of.
If you are in the business of serving HTTP traffic and use Device Detection, it is really important that you proactively support User-Agent Client Hints on your website, or ensure that your data sources start “enriching” your traffic with User-Agent Client Hints.
Ok, let me go deeper into this then. Why is the User-Agent string not so good anymore?
The main reason why you shouldn’t rely exclusively on the User-Agent string for device detection is that it is likely to be “frozen”. The Chromium team is gradually “locking” parts of the User-Agent string, making the UA string progressively more generic, to a point where the information can no longer be relied upon for accurate device detection. We have a great blog post up on this with traffic stats.
Here’s an example of a frozen User-Agent string:
Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
This is what Chrome on Android User-Agent strings will end up as, after the User-Agent freeze is complete. There are only two actionable bits of information that you can extract from this string:
- This is an Android device
- The request is likely from a Chromium-based browser or perhaps even from Chrome itself with a major version of 106
This isn’t very helpful information for most use cases. The “Android 10” and “K” tokens are fixed placeholders, so you don’t know what version of Android the device actually runs, nor its brand or model. And you don’t know the full browser version that generated this User-Agent string.
For contrast, here is what was available before (and might still be available for a while depending on how frequently your users update their browser):
Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4812.0 Mobile Safari/537.36
Just from a casual inspection of this user agent, we can tell much more detailed information – the platform, platform version, the device model, the browser name and browser version. And with WURFL looking up more device capabilities, we can get an even more complete and accurate view of this device.
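To make the difference between the two strings concrete, here is a minimal Python sketch that flags a User-Agent matching the frozen Chrome-on-Android shape shown above. The function name `looks_frozen` and both patterns are illustrative assumptions, not part of any WURFL API, and the check is a simple heuristic rather than the analysis WURFL performs.

```python
import re

# Heuristic only (an assumption for illustration): the frozen Chrome-on-Android
# string uses the fixed tokens "Android 10" and "K", and a browser version whose
# minor, build and patch components are all zero.
REDUCED_ANDROID = re.compile(r"\(Linux; Android 10; K\)")
ZEROED_CHROME = re.compile(r"Chrome/\d+\.0\.0\.0\b")


def looks_frozen(user_agent: str) -> bool:
    """Return True if the string matches the frozen Chrome-on-Android shape."""
    return bool(
        REDUCED_ANDROID.search(user_agent) and ZEROED_CHROME.search(user_agent)
    )


frozen = (
    "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36"
)
legacy = (
    "Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/99.0.4812.0 Mobile Safari/537.36"
)

print(looks_frozen(frozen))  # True
print(looks_frozen(legacy))  # False
```

A check like this only tells you that the string carries no usable device detail; it does not recover any of the missing information.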
Ok. If I look at my logs and I see that User-Agent Client Hints are there, does that mean that I am good to go using WURFL for device detection?
Possibly, but not necessarily. Things are a bit more complicated than that.
There are actually two kinds of User-Agent Client Hints you can expect to see in your traffic – “low entropy” and “high entropy”. The former is the set of Client Hints you get for free with every request from a browser that supports User-Agent Client Hints. These Client Hints are not very “informative”. You want the second kind, “high entropy.” They provide the complete set of Client Hints with all the bits and pieces that you need for optimal device detection.
The terms “low entropy” and “high entropy” are used in the User-Agent Client Hint specification to describe two possible sets of headers you may receive. The reason for the choice in wording relates to some analogy with the use of the term in physics, related to how much “user personal information” is potentially given away in HTTP requests as devices request content. For the purpose of device detection we opted for different terms that make more sense to describe header quality. “Low entropy” User-Agent Client Hints essentially translate to a header quality of “Basic”, while “high entropy” translates to a header quality of “Full” in WURFL lingo.
“Basic” header quality, or “low entropy”, User-Agent Client Hints are included with every request from a user agent or browser. However, the “Full” header quality, or “high entropy”, User-Agent Client Hints, must be specifically requested. This means that you need to configure your server to request them.
Note: If you are seeing “Full” header quality or “high entropy” User-Agent Client Hints in your logs, the “Basic” header quality or “low entropy” User-Agent Client Hints are also included in the request.
Here’s a table that shows which set each User-Agent Client Hint belongs to:
User-Agent Client Hint | Low/High Entropy |
---|---|
sec-ch-ua | Low Entropy |
sec-ch-ua-mobile | Low Entropy |
sec-ch-ua-platform | Low Entropy |
sec-ch-ua-platform-version | High Entropy |
sec-ch-ua-full-version (deprecated) | High Entropy |
sec-ch-ua-full-version-list | High Entropy |
sec-ch-ua-model | High Entropy |
sec-ch-ua-arch | High Entropy |
sec-ch-ua-bitness | High Entropy |
sec-ch-ua-wow64 | High Entropy |
If you don’t request the entire set of User-Agent Client Hints explicitly, the ones you are seeing in your logs are most likely the “Basic” header quality or “low entropy” Client Hints. In practice, this means that you will only be able to tell the OS platform, a list of possible browsers that requested the page and Chromium’s idea of whether the client is a mobile device without further differentiation between tablets, desktops, Smart TVs, IOT devices and so on.
Here’s an example of what you are likely to see with just the “Basic” or “low entropy” set of User-Agent Client Hints:
User-Agent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
Sec-Ch-Ua: "Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"
Sec-Ch-Ua-Platform: "Android"
Sec-Ch-Ua-Mobile: ?1
This set of “low entropy” User-Agent Client Hints tells us that:
- The browser originating the request could be either Chromium or Microsoft Edge
- The originating device is considered a mobile device
More importantly, we know that this is an Android device, but not the version. The device make and model are unknown, and so is the exact version of the browser.
Please note that the “Not;A=Brand” token in the Sec-Ch-Ua header (aka the Brands header) is intentional and is added according to the User-Agent Client Hints specification. This is called GREASE-ing, an allusion to the GREASE mechanism used in TLS, where deliberately meaningless values are sent so that servers stay tolerant of cipher suites and extensions they do not recognize.
Now let’s put this in contrast with the complete “high entropy” User-Agent Client Hints. When you specifically request the “full” set of User-Agent Client Hint headers (and you must request them individually), you also receive the platform version, the full browser version, the model name, the architecture of the device and, depending on the platform, the “bitness” or the bit width of the device.
Here’s a specimen of an HTTP request that contains those high entropy Client Hints:
User-Agent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
Sec-Ch-Ua: "Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"
Sec-Ch-Ua-Full-Version-List: "Chromium";v="106.0.5249.65", "Microsoft Edge";v="106.0.5249.65", "Not;A=Brand";v="99.0.0.0"
Sec-Ch-Ua-Platform: "Android"
Sec-Ch-Ua-Mobile: ?1
Sec-Ch-Ua-Platform-Version: 13.0.0
Sec-Ch-Ua-Model: Pixel 6 Pro
We can immediately see that we get quite a lot of additional information, such as:
- Full version for the browser that originated the request
- Full platform/OS version
- Device model name
In other words, all the good stuff is available again with high entropy Client Hints.
Interestingly, the actual browser name is never directly exposed via User-Agent Client Hints. Instead, the raw header data will always contain a comma-separated list of browser brands. This is an intentional part of the User-Agent Client Hints specification and was introduced with the assumption that this would make it harder for developers to discriminate against minor browsers. Without debating that move here, it is important to note that the WURFL API is able to sort through this information and return the accurate browser name.
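For readers who want to see what sorting through a brands list can look like, here is a minimal Python sketch. It is not how the WURFL API does it: the helper names (`parse_brands`, `is_grease`, `most_specific_brand`) are hypothetical, the GREASE check is a heuristic that assumes the placeholder always spells out "Not A Brand" with varying punctuation, and the regular expression ignores escaped quotes that a full structured-header parser would handle.

```python
import re
from typing import List, Optional, Tuple

# Matches one list entry of the form "Brand";v="version".
BRAND_ENTRY = re.compile(r'"([^"]*)"\s*;\s*v="([^"]*)"')


def parse_brands(header_value: str) -> List[Tuple[str, str]]:
    """Split a Sec-Ch-Ua style value into (brand, version) pairs."""
    return BRAND_ENTRY.findall(header_value)


def is_grease(brand: str) -> bool:
    """Heuristic: strip everything but letters and compare to 'notabrand'."""
    return re.sub(r"[^a-z]", "", brand.lower()) == "notabrand"


def most_specific_brand(header_value: str) -> Optional[Tuple[str, str]]:
    """Drop the GREASE entry, then prefer any brand other than plain Chromium."""
    candidates = [(b, v) for b, v in parse_brands(header_value) if not is_grease(b)]
    specific = [(b, v) for b, v in candidates if b != "Chromium"]
    chosen = specific or candidates
    return chosen[0] if chosen else None


brands = '"Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"'
full_list = (
    '"Chromium";v="106.0.5249.65", "Microsoft Edge";v="106.0.5249.65", '
    '"Not;A=Brand";v="99.0.0.0"'
)

print(most_specific_brand(brands))     # ('Microsoft Edge', '106')
print(most_specific_brand(full_list))  # ('Microsoft Edge', '106.0.5249.65')
```

Preferring the non-Chromium brand is a simplification that works for the example headers in this article; real traffic can contain more than one specific brand.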
This makes sense, but – at a high level – what are the implications for device detection accuracy and for analytics solutions that rely on it?
We can rank HTTP requests in three categories, based on the amount of information they carry. From most preferred to least preferred:
- Full Header Quality (high entropy) request (or, equivalently, an “old school” “unfrozen” User-Agent String)
- Basic Header Quality (low entropy) request
- Frozen User-Agent string
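As a rough illustration of this ranking, the Python sketch below labels a request by the Client Hint headers it carries. The function name `header_quality`, the returned labels and the rule that "full" requires the model, platform version and full version list together are all assumptions made for this example; it does not reproduce WURFL's own logic, and it does not try to recognize an unfrozen User-Agent string.

```python
from typing import Dict

# Low entropy hints, sent by default by supporting browsers.
LOW_ENTROPY = {"sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform"}

# Assumption for this sketch: these three high entropy hints are the ones
# that carry the device model, OS version and full browser version.
KEY_HIGH_ENTROPY = {
    "sec-ch-ua-model",
    "sec-ch-ua-platform-version",
    "sec-ch-ua-full-version-list",
}


def header_quality(headers: Dict[str, str]) -> str:
    """Return 'full', 'basic' or 'user-agent only' for a request's headers."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    if KEY_HIGH_ENTROPY <= names:
        return "full"
    if names & LOW_ENTROPY:
        return "basic"
    return "user-agent only"


ua = (
    "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36"
)
basic_request = {
    "User-Agent": ua,
    "Sec-Ch-Ua": '"Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"',
    "Sec-Ch-Ua-Platform": '"Android"',
    "Sec-Ch-Ua-Mobile": "?1",
}
full_request = dict(
    basic_request,
    **{
        "Sec-Ch-Ua-Full-Version-List": '"Chromium";v="106.0.5249.65"',
        "Sec-Ch-Ua-Platform-Version": "13.0.0",
        "Sec-Ch-Ua-Model": "Pixel 6 Pro",
    },
)

print(header_quality({"User-Agent": ua}))  # user-agent only
print(header_quality(basic_request))       # basic
print(header_quality(full_request))        # full
```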
Let me illustrate some scenarios with real HTTP requests. We’ll pass the User-Agents and the appropriate HTTP headers for each of these cases to the WURFL API and compare the results. We will retrieve the following capabilities for each detection request:
complete_device_name, advertised_browser and advertised_browser_version
Frozen User-Agent only
Inputs
User-Agent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
User-Agent Client Hints: None
Outputs
complete_device_name: Generic Android 10.0
advertised_browser: Chrome Mobile
advertised_browser_version: 106.0.0.0
There’s not much additional information that the WURFL API can glean from the frozen User-Agent string alone, so these generic results are the best one can do.
Basic Header Quality aka Low Entropy User-Agent Client Hints
Inputs
User-Agent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
Sec-Ch-Ua: "Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"
Sec-Ch-Ua-Platform: "Android"
Sec-Ch-Ua-Mobile: ?1
Outputs
complete_device_name: Generic Android 4.0
advertised_browser: Edge
advertised_browser_version: 106.0.0.0
Even with “Basic” header quality, that is, just the “low entropy” User-Agent Client Hints, you can see that the WURFL API is able to detect that the User-Agent is frozen, determine the right browser name from the browser brands list and even apply data-driven fallbacks, such as Android 4.0 here (in this version of the WURFL API).
Full Header Quality aka High Entropy User-Agent Client Hints
Inputs
User-Agent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Mobile Safari/537.36
Sec-Ch-Ua: "Chromium";v="106", "Microsoft Edge";v="106", "Not;A=Brand";v="99"
Sec-Ch-Ua-Full-Version-List: "Chromium";v="106.0.5249.65", "Microsoft Edge";v="106.0.5249.65", "Not;A=Brand";v="99.0.0.0"
Sec-Ch-Ua-Platform: "Android"
Sec-Ch-Ua-Mobile: ?1
Sec-Ch-Ua-Platform-Version: 13.0.0
Sec-Ch-Ua-Model: Pixel 6 Pro
Outputs
complete_device_name: Google Pixel 6 Pro
advertised_device_os: Android
advertised_device_os_version: 13.0.0
advertised_browser: Edge
advertised_browser_version: 106.0.5249.65
This is where you see the full benefit of including the full suite of the User-Agent Client Hints. Not only are the device make and model, browser and browser version accurately detected, but the OS platform and OS platform version are detected as well! For our purposes, the reported platform information is synonymous with the device’s Operating System (OS) information.
So, you convinced me. Accessing those Client-Hints is really the key…
I’ve worn glasses my entire life. I can see without my glasses – sort of – but everything’s blurry, hazy and undefined. With my glasses on, everything’s crystal clear again. I don’t settle for blurred vision and you probably don’t want to either! We recommend that WURFL users go out of their way to ensure access to the “Full” header quality User-Agent Client Hints, to maintain the clear view of their data that WURFL has allowed over the years. Without proactive action, that level of Device Detection quality will not be preserved.
What do engineers need to do in practice to access full header quality?
Engineers need to configure their application and HTTP servers to request those additional headers through the Accept-CH response header. The server needs to advertise which Client Hints it wants and, of course, you want the good ones: all the “Full” header quality User-Agent Client Hints. We recommend something like this:
Accept-CH: sec-ch-ua-platform-version,sec-ch-ua-full-version,sec-ch-ua-full-version-list,sec-ch-ua-model,sec-ch-ua-arch,sec-ch-ua-bitness,sec-ch-ua-wow64
Additionally, you may have the server indicate that these high entropy Client Hints are critical for an optimal browsing experience on your website. You can do this by setting the Critical-CH response header as follows:
Critical-CH: sec-ch-ua-platform-version,sec-ch-ua-full-version,sec-ch-ua-full-version-list,sec-ch-ua-model,sec-ch-ua-arch,sec-ch-ua-bitness,sec-ch-ua-wow64
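In application code, the two headers above can be attached to every response. The following is a minimal sketch using Python's standard WSGI interface; the names `client_hint_headers` and `app` are illustrative, and a real deployment would more likely set these headers in the web server or framework configuration. Keep in mind that browsers only honor Accept-CH on secure (HTTPS) connections.

```python
# The hint list mirrors the Accept-CH example above.
HIGH_ENTROPY_HINTS = [
    "sec-ch-ua-platform-version",
    "sec-ch-ua-full-version",
    "sec-ch-ua-full-version-list",
    "sec-ch-ua-model",
    "sec-ch-ua-arch",
    "sec-ch-ua-bitness",
    "sec-ch-ua-wow64",
]


def client_hint_headers(hints=HIGH_ENTROPY_HINTS, critical=False):
    """Build the Accept-CH (and optionally Critical-CH) response headers."""
    value = ",".join(hints)
    headers = [("Accept-CH", value)]
    if critical:
        headers.append(("Critical-CH", value))
    return headers


def app(environ, start_response):
    """Tiny WSGI application that asks the browser for high entropy hints."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    headers += client_hint_headers(critical=True)
    start_response("200 OK", headers)
    return [b"hello\n"]


# Call the app directly with a stand-in for the server's start_response,
# just to inspect the headers it would send.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = app({}, fake_start_response)
print(captured["headers"]["Accept-CH"])
```

The high entropy hints typically arrive on requests that follow the first response carrying Accept-CH, so the first page view from a new visitor may still show only the low entropy set unless Critical-CH prompts the browser to retry.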
If you are using a cloud-based WURFL solution (WURFL.js or ImageEngine), then you will also need to delegate access to the User-Agent Client Hints to us. You can do this by using a Permissions-Policy header. Here’s an example for WURFL.js Lite customers:
permissions-policy: ch-ua-platform-version=(self "https://wurfl.io"),ch-ua-full-version=(self "https://wurfl.io"),ch-ua-full-version-list=(self "https://wurfl.io"),ch-ua-model=(self "https://wurfl.io"),ch-ua-arch=(self "https://wurfl.io"),ch-ua-bitness=(self "https://wurfl.io"),ch-ua-wow64=(self "https://wurfl.io")
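Because that header repeats the same origin for every hint, it is easy to mistype by hand. Here is a small Python sketch that generates the value from a list of hints; the function name `permissions_policy` is hypothetical, and the sketch assumes each policy feature name is simply the hint name without its `sec-` prefix, as in the example above.

```python
def permissions_policy(hints, origin):
    """Build a Permissions-Policy value delegating each hint to `origin`."""
    directives = []
    for hint in hints:
        # "sec-ch-ua-model" -> "ch-ua-model"
        feature = hint[4:] if hint.startswith("sec-") else hint
        directives.append(f'{feature}=(self "{origin}")')
    return ",".join(directives)


hints = [
    "sec-ch-ua-platform-version",
    "sec-ch-ua-full-version",
    "sec-ch-ua-full-version-list",
    "sec-ch-ua-model",
    "sec-ch-ua-arch",
    "sec-ch-ua-bitness",
    "sec-ch-ua-wow64",
]

policy = permissions_policy(hints, "https://wurfl.io")
print("permissions-policy: " + policy)
```

The origin to delegate to depends on the product you use, so take it from the relevant documentation rather than from this sketch.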
(Editor’s note: More detailed information on this topic is available in our “Request and Implement User-Agent Client Hints” technical documentation here).
We realize that different web servers have different methods of setting response headers, so we have kept these instructions general enough to be server/platform agnostic. We also have Apache specific instructions on enabling User-Agent Client Hints support (here).
Thank you, Sri. This was hugely informative!
My pleasure!