Deep Dive: http.favicon

Favicons are the small icons that you see in the browser tab next to the website title or in your bookmarks. For example, on the Shodan website the favicon is the Shodan logo shown on the left side of the browser tab.

They typically contain the logo of the company, which gives them two functions:

  • An easy way to find the tab of a website when you have multiple open tabs.
  • A sense of authenticity that the website you're visiting belongs to the right company.

Shodan collects the favicons for websites and stores the information in the http.favicon property:

{
        "data": "AAABAAIAEBAAAAAAIABoBAAAJgAAACAgAAAAACAAqBAAAI4EAAAoAAAAEAAAACAAAAABACAAAAAA\nAEAEAAAAAAAAAAAAAAAAAAAAAAAA////Af///wH///8B////Af///wH///8B////ASWX/RUne/MX\n////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////ASan/EEn\nbvThJm/05yai/EP///8B////Af///wH///8B////Af///wH///8B////Af///wH///8BJqP8Ayaj\n/I0mi/v5J1Tt/SdT7Psmi/z5JqL8j////wH///8B////Af///wH///8B////...",
        "hash": 516963061,
        "location": "https://about.gitlab.com:443/ico/favicon.ico"
}
  • data contains the image as a base64-encoded string.

  • hash is the MurmurHash3 of the data property. The Shodan API has a search filter called http.favicon.hash to search based on this value.

  • location lets you know where the favicon was found. Historically, the favicon.ico file was located at the root of the web server but it can be put in any arbitrary location by referencing it in the HTML. For example:

    <link rel="icon" type="image/png" href="/assets/favicon-yellow-018213ceb87b472388095d0264be5b4319ef47471dacea03c83ecc233ced2fd5.png" />
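As a quick illustration of the data property, the sketch below decodes the first few base64 characters of the sample banner above and reads the file header. It assumes the standard ICO layout (a reserved field, an image type and an image count, each a little-endian 16-bit integer); the helper name ico_header is made up for this example.

```python
import base64
import struct


def ico_header(data_b64: str) -> tuple[int, int, int]:
    """Return (reserved, image_type, image_count) from base64-encoded ICO data."""
    # b64decode ignores the newlines embedded in the data property by default.
    raw = base64.b64decode(data_b64)
    # The ICO header starts with three little-endian unsigned 16-bit integers.
    return struct.unpack_from("<HHH", raw)


# First 12 characters of the "data" value in the sample banner above.
prefix = "AAABAAIAEBAA"
print(ico_header(prefix))  # (0, 1, 2): type 1 means icon, containing 2 images
```

Favicons served as PNG or another format would need a different header check, so treat this as a sanity check for .ico data only.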

At Shodan, we developed the technique of hashing the favicon to make it possible to search across the Internet for identical favicon images. We developed it nearly a decade ago to help with 2 use cases:

  • Identify Phishing Websites: bad actors will commonly use the same favicon as the website they're imitating. By searching for the favicon of a company you can identify potential phishing websites.
  • Origin IP Disclosure: websites that are hosted behind a CDN (e.g. Cloudflare) should restrict access to their web server so that it only accepts connections from the CDN. By searching for the favicon of a website you can confirm that it has been correctly configured and isn't responding to requests on its origin IP.
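Both use cases come down to the same step: take the banners that carry your favicon hash and set aside the ones you already expect. The sketch below shows one possible way to do that on banners you have already downloaded. It assumes each banner is a dictionary with ip_str, port, hostnames and http.favicon.hash fields; the function name, the sample banners and the example.com domain are hypothetical.

```python
def unexpected_hosts(banners, favicon_hash, trusted_domains):
    """Yield (ip, port, hostnames) for banners that use the favicon
    but have no hostname under any of the trusted domains."""
    for banner in banners:
        favicon = (banner.get("http") or {}).get("favicon") or {}
        if favicon.get("hash") != favicon_hash:
            continue
        hostnames = banner.get("hostnames") or []
        trusted = any(
            host == domain or host.endswith("." + domain)
            for host in hostnames
            for domain in trusted_domains
        )
        if not trusted:
            yield banner.get("ip_str"), banner.get("port"), hostnames


# Hypothetical banners using documentation-only IP ranges.
banners = [
    {"ip_str": "192.0.2.10", "port": 443, "hostnames": ["www.example.com"],
     "http": {"favicon": {"hash": 516963061}}},
    {"ip_str": "198.51.100.7", "port": 80, "hostnames": ["examp1e-login.test"],
     "http": {"favicon": {"hash": 516963061}}},
    {"ip_str": "203.0.113.5", "port": 443, "hostnames": [],
     "http": {"favicon": {"hash": 12345}}},
]

for ip, port, names in unexpected_hosts(banners, 516963061, ["example.com"]):
    print(ip, port, names)
```

A match is only a lead: many legitimate services reuse a vendor's favicon, so each result still needs manual review.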

The favicon hash is calculated by applying the MurmurHash3 algorithm to the http.favicon.data property on the banner.
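The calculation can be sketched in a few lines of Python. The code below rests on several assumptions that are not spelled out above: that the hash is the 32-bit variant of MurmurHash3 with a seed of 0, that it is reported as a signed integer, and that the data property is base64 text wrapped with a newline every 76 characters (as the "\n" sequences in the sample banner suggest), which is what Python's base64.encodebytes produces. MurmurHash3 is written out in plain Python only to keep the example self-contained; the function names are made up for this example.

```python
import base64

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 of data, returned as a signed integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    body_len = len(data) - len(data) % 4

    # Mix in the input four bytes at a time (little-endian blocks).
    for i in range(0, body_len, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Mix in the remaining 1 to 3 bytes, if any.
    tail = data[body_len:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16

    # Reinterpret the unsigned 32-bit value as a signed integer.
    return h - (1 << 32) if h & 0x80000000 else h


def favicon_hash(raw_icon: bytes) -> int:
    """Hash the newline-wrapped base64 encoding of the raw favicon bytes."""
    return murmur3_32(base64.encodebytes(raw_icon))
```

To try it on a local file, read the file in binary mode and pass the bytes to favicon_hash. The result only matches a hash stored by Shodan if the bytes are identical to what the crawler downloaded.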

Why MMH3? The key considerations when we developed the technique were the speed of the hashing algorithm and the size of the resulting hash. We didn't need the cryptographic guarantees of hashes such as MD5.

favscan

We provide a simple tool called favscan that calculates the favicon hash given a URL, hostname or local file path.

$ favscan -h
Calculate the favicon hash of a local file, hostname or URL

Usage: favscan [OPTIONS] <LOCATION>

Arguments:  
  <LOCATION>  

Options:  
  -v, --verbose  
  -h, --help     Print help
  -V, --version  Print version

favscan will first look for the favicon at the common /favicon.ico path and, if that fails, it will check the front page for a shortcut icon link. The tool is available for download on many platforms.
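The lookup order described above can be sketched in Python as follows. This is not favscan's actual implementation, only an illustration of the same idea: try /favicon.ico first, then fall back to the link tags on the front page. The class and function names are made up, and fetch is a caller-supplied function (an assumption of this sketch) that returns the response body as bytes or None on failure, so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel attribute includes "icon"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. "icon" or "shortcut icon".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "icon" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_favicon(base_url, fetch):
    """Return (url, body) for the first favicon found, or None."""
    # Step 1: the conventional location at the root of the web server.
    default_url = urljoin(base_url, "/favicon.ico")
    body = fetch(default_url)
    if body:
        return default_url, body

    # Step 2: look for icon links in the front page HTML.
    page = fetch(base_url)
    if not page:
        return None
    parser = IconLinkParser()
    parser.feed(page.decode("utf-8", errors="replace"))
    for href in parser.hrefs:
        icon_url = urljoin(base_url, href)  # resolves relative paths
        body = fetch(icon_url)
        if body:
            return icon_url, body
    return None


# Hypothetical in-memory "website" standing in for real HTTP requests.
pages = {
    "https://example.test/": b'<link rel="shortcut icon" href="/assets/fav.png" />',
    "https://example.test/assets/fav.png": b"PNGDATA",
}
print(find_favicon("https://example.test/", pages.get))
```

Passing fetch in as a parameter keeps the lookup logic separate from the HTTP client, so timeouts, redirects and TLS settings can be handled however you prefer.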

For example, to get the favicon hash for google.com you would run:

favscan google.com  

You can also specify ports as part of the URL:

favscan https://test.shodan.io:6993  

Or calculate it for a local file:

favscan favicon.ico  

Example

Let's say we want to find public instances of GitLab using favicons. We start off by grabbing the favicon hash of a known GitLab instance:

$ favscan gitlab.com
1265477436  

We then take that hash and use it in a search query of:

http.favicon.hash:1265477436  

The search query can be used on the website, CLI or API. For now, let's just see how many instances there are based on the favicon:

$ shodan count http.favicon.hash:1265477436
29558  
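The same count can also be requested from the API. The sketch below builds the request by hand with the Python standard library. It assumes the REST endpoint /shodan/host/count on api.shodan.io, with key and query parameters and a total field in the JSON response; check these against the current API documentation before relying on them. The function names and the YOUR_API_KEY placeholder are made up for this example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.shodan.io"


def count_url(query: str, api_key: str) -> str:
    """Build the URL that asks for the number of results for a query."""
    params = urllib.parse.urlencode({"key": api_key, "query": query})
    return f"{API_BASE}/shodan/host/count?{params}"


def count_results(query: str, api_key: str) -> int:
    """Send the request and return the reported total (needs network access)."""
    with urllib.request.urlopen(count_url(query, api_key), timeout=30) as resp:
        return json.load(resp)["total"]


# Only builds the URL; no request is sent here.
print(count_url("http.favicon.hash:1265477436", "YOUR_API_KEY"))
```

Calling count_results with a real API key performs the request. The total will change over time as hosts come and go.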

And this is what it looks like on the website:

https://www.shodan.io/search/report?query=http.favicon.hash%3A1265477436

Note: Shodan already fingerprints GitLab services, so you can search for product:gitlab instead of using favicons.

References