by Filip Sufitch, the Python Handler
Python programmers frequently deal with huge amounts of Web log data. Many of them want device capabilities to be a part of that data so they can analyze how various device types and capabilities impact Web usage.
Customers have asked for a WURFL API for Python. ScientiaMobile offers Python access to our WURFL Cloud service. This Cloud service approach may work for some people, but for folks with high volume who want more control, we offer a different approach with WURFL InFuze for Python.
The beauty of WURFL InFuze is that it provides a high-performance C++ API which Python can import. It also provides command line utilities that enable filtering for the sake of analytics. With these tools Python programmers can integrate WURFL’s device detection into their Python code base.
WURFL InFuze for Python uses an XML file that contains the device definitions for all the mobile devices on earth. Only two lines are needed to import the library and load the XML file:
from pywurfl.wurfl import Wurfl
WURFL = Wurfl("/home/wurfl.xml")
This first example shows how you can parse a single user agent. Then, it shows how to request single or multiple device capabilities.
def plain_queries():
    dev = WURFL.parse_useragent("Mozilla/5.0 (Linux; Android 4.4; Nexus 5 Build/KRT16M) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36")

    # Get one capability
    print(dev.get_capability("is_mobile"))  # true

    # Get many capabilities
    print(dev.get_capabilities(["brand_name", "model_name"]))  # {'brand_name': 'LG', 'model_name': 'Nexus 5'}

    # Get all capabilities
    dev.get_all_capabilities()  # result is a huge dict

    # Release the device
    dev.release()  # Release back-end pointers and such, to prevent memory leaking
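Because each parsed device holds back-end resources until release() is called, one way to make sure the release still happens when a lookup raises an exception is to wrap the parse/release pair in a context manager. The following is only a minimal sketch and not part of pywurfl: `parsed_device` is a hypothetical helper, and `_FakeWurfl` and `_FakeDevice` are stand-ins so the example runs without the library installed. In real use you would pass in the WURFL object created earlier.

```python
from contextlib import contextmanager


@contextmanager
def parsed_device(wurfl, user_agent):
    """Yield a parsed device and always release it afterwards.

    Hypothetical helper: `wurfl` is any object with parse_useragent(),
    and the returned device is assumed to expose release().
    """
    dev = wurfl.parse_useragent(user_agent)
    try:
        yield dev
    finally:
        dev.release()


class _FakeDevice:
    """Stand-in for a WURFL device so the sketch runs without pywurfl."""

    def __init__(self):
        self.released = False

    def get_capability(self, name):
        return {"is_mobile": "true"}.get(name)

    def release(self):
        self.released = True


class _FakeWurfl:
    """Stand-in for the Wurfl object; remembers the last device it made."""

    def __init__(self):
        self.last_device = None

    def parse_useragent(self, user_agent):
        self.last_device = _FakeDevice()
        return self.last_device


fake = _FakeWurfl()
with parsed_device(fake, "ExampleAgent/1.0") as dev:
    is_mobile = dev.get_capability("is_mobile")
```

With this pattern the body of the `with` block can return early or raise, and the device is still released on the way out.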
In this second example, we process many user agents from the “sample_uas.txt” file. Python processes the user agents in this file serially, one per line. It uses the value of the “ux_full_desktop” capability to compute what percentage of visitors to the site are using a full desktop Web browser.
def serial():
    """ Sample serial bulk processing. """
    num_desktop = 0
    num_total = 0
    with open("sample_uas.txt") as fh:
        for line in fh:
            ua = line.strip()
            dev = WURFL.parse_useragent(ua)
            if dev.get_capability("ux_full_desktop") == 'true':
                num_desktop += 1
            num_total += 1
            dev.release()
    print("%d (%.2f%%) desktop devices" % (num_desktop, float(num_desktop)/num_total*100))
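Real log files often contain blank lines, and an empty file would make the percentage calculation divide by zero. Here is a hedged sketch of the same tally with both cases guarded. The names `capability_share`, `percent`, and `fake_lookup` are hypothetical and introduced only for illustration. `fake_lookup` is a stand-in so the example runs without pywurfl, and it is not how WURFL detects devices.

```python
def capability_share(lines, lookup, expected="true"):
    """Return (matches, total) over the non-blank lines.

    `lookup` is a hypothetical callable that maps a user agent string to a
    capability value (a string such as 'true' or 'false').
    """
    matches = 0
    total = 0
    for line in lines:
        ua = line.strip()
        if not ua:  # skip blank lines instead of parsing them
            continue
        total += 1
        if lookup(ua) == expected:
            matches += 1
    return matches, total


def percent(matches, total):
    """Percentage of matches, or 0.0 when nothing was counted."""
    return 100.0 * matches / total if total else 0.0


def fake_lookup(ua):
    # Stand-in only: treats any user agent containing "Mobile" as non-desktop.
    return "false" if "Mobile" in ua else "true"


sample = ["Desktop Browser/1.0\n", "\n", "Phone Mobile/2.0\n", "Desktop Browser/3.0\n"]
matches, total = capability_share(sample, fake_lookup)
summary = "%d (%.2f%%) desktop devices" % (matches, percent(matches, total))
```

With the real library, `lookup` could be a small function that parses the user agent, reads "ux_full_desktop", releases the device, and returns the value.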
This third example shows how Python can use parallel multi-processing to perform the same analysis. In this case, “process_ua” runs in parallel worker processes to parse user agents and return device capability results. Python splits the lines of the “sample_uas.txt” file into chunks of 10 user agents each, hands the chunks to the pool, and then collects the results. The results are the same as in the serial processing example, but they arrive much faster.
#### Sample multiprocessing parallel bulk analysis
import multiprocessing as mp
import os

def process_ua(ua):
    dev = WURFL.parse_useragent(ua.strip())
    ux_desktop = dev.get_capability("ux_full_desktop")
    dev.release()
    return os.getpid(), int(ux_desktop == 'true')

def line_iter(fname):
    with open(fname) as fh:
        for line in fh:
            yield line

def parallel_mp():
    num_desktop = 0
    num_total = 0
    pool = mp.Pool()
    results = pool.imap(process_ua, line_iter("sample_uas.txt"), chunksize=10)
    pool.close()
    pool.join()
    pids = set()
    for result in results:
        pid, desktop_cnt = result
        num_total += 1
        num_desktop += desktop_cnt
        pids.add(pid)
    print("%d (%.2f%%) desktop devices" % (num_desktop, float(num_desktop)/num_total*100))
    print("PIDs involved: %s" % list(pids))
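The chunksize=10 argument tells the pool to submit user agents to the worker processes in batches of ten rather than one at a time, which cuts down on inter-process overhead. To make the grouping concrete, here is a small sketch of the same batching idea in plain Python. `chunked` is a hypothetical helper written for illustration, not the way multiprocessing is implemented internally, and the user agent strings are placeholders.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


user_agents = ["UA-%d" % i for i in range(25)]  # placeholder strings
chunks = list(chunked(user_agents, 10))
chunk_sizes = [len(c) for c in chunks]
```

For 25 inputs this produces two full batches and one partial batch. Larger chunks mean fewer hand-offs between processes, while smaller chunks spread uneven work more evenly across workers.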
If you want to start crunching device capability statistics, then WURFL InFuze for Python is a great tool. It gives programmers the tools to quickly integrate device detection and perform powerful device analysis at the same time.