A First Look At The Privacy Harms Of The Public Suffix List
Stephen McQuistin (University of St Andrews), Peter Snyder (Brave Software), Colin Perkins (University of Glasgow), Hamed Haddadi (Imperial College London, Brave Software), Gareth Tyson (Hong Kong University of Science & Technology) | Privacy
The public suffix list is a community-maintained list of rules that can be applied to domain names to determine how they should be grouped into logical organizations or companies. We present the first large-scale measurement study of how the public suffix list is used by open-source software on the Web and the privacy harm resulting from projects using outdated versions of the list. We measure how often developers include out-of-date versions of the public suffix list in their projects and how old the included lists are, and we estimate the real-world privacy harm with a model based on a large-scale crawl of the Web. We find that incorrect use of the public suffix list is common in open-source software, and that at least 43 open-source projects use hard-coded, outdated versions of the public suffix list. These include popular, security-focused projects, such as password managers and digital forensics tools. Extrapolating from data gathered by the HTTP Archive project, we also estimate that, because of these out-of-date lists, these projects make incorrect privacy decisions for 1,313 effective top-level domains (eTLDs), affecting 50,750 domains.
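To make the failure mode concrete, the following is a minimal sketch of how suffix rules group hostnames, and how a stale copy of the list can change that grouping. It is an illustration, not the measurement code used in the study. The function names (`registrable_domain`, `same_site`), the two rule sets, and the `pages.example` hosting suffix are all hypothetical. The matching logic is a simplified reading of the public suffix list algorithm (longest matching rule wins, exception rules take priority, unlisted TLDs fall back to a default `*` rule).

```python
def registrable_domain(host, rules):
    """Return the eTLD+1 of `host` under `rules`, or None if `host`
    is itself a public suffix. Simplified sketch of PSL matching."""
    labels = host.lower().rstrip(".").split(".")
    suffix_len = 1  # default rule "*": the last label alone is the suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        name = ".".join(candidate)
        if "!" + name in rules:
            # Exception rule: the suffix is the rule minus its leftmost label.
            suffix_len = len(candidate) - 1
            break
        wildcard = "*." + ".".join(candidate[1:])
        if name in rules or (len(candidate) > 1 and wildcard in rules):
            suffix_len = max(suffix_len, len(candidate))
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])


def same_site(host_a, host_b, rules):
    """Treat two hosts as one site when they share a registrable domain."""
    a = registrable_domain(host_a, rules)
    b = registrable_domain(host_b, rules)
    return a is not None and a == b


# Hypothetical rule sets: the outdated copy predates the addition of a
# (made-up) hosting provider's suffix, "pages.example".
current_rules = {"com", "example", "pages.example"}
outdated_rules = {"com", "example"}

# Two unrelated customers of the hypothetical hosting provider.
alice, bob = "alice.pages.example", "bob.pages.example"

print(same_site(alice, bob, current_rules))   # separate sites
print(same_site(alice, bob, outdated_rules))  # wrongly merged into one site
```

Under the outdated rules, both hosts collapse to `pages.example`, so software relying on that copy would treat two unrelated parties as one site. For a tool such as a password manager, this is the kind of incorrect privacy decision the abstract describes.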