<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[research - Shodan Blog]]></title><description><![CDATA[The latest news and developments for Shodan.]]></description><link>https://blog.shodan.io/</link><generator>Ghost 0.7</generator><lastBuildDate>Sat, 11 Apr 2026 03:30:07 GMT</lastBuildDate><atom:link href="https://blog.shodan.io/tag/research/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Trends in Internet Exposure]]></title><description><![CDATA[<blockquote>
  <p><strong>Edit</strong>: The original data for RDP in March, 2020 included IPv6 results whereas the historical analysis only looked at IPv4. I've changed the numbers to reflect the new counts. Tl;dr: we're still seeing growth but significantly less than before.</p>
</blockquote>

<p>More companies are going remote due to COVID-19 and as</p>]]></description><link>https://blog.shodan.io/trends-in-internet-exposure/</link><guid isPermaLink="false">e751b30d-dce7-488c-881a-0144dd1b486c</guid><category><![CDATA[research]]></category><category><![CDATA[Shodan]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Mon, 30 Mar 2020 00:06:18 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2020/03/ics-map-2020.png" medium="image"/><content:encoded><![CDATA[<blockquote>
  <img src="http://blog.shodan.io/content/images/2020/03/ics-map-2020.png" alt="Trends in Internet Exposure"><p><strong>Edit</strong>: The original data for RDP in March, 2020 included IPv6 results whereas the historical analysis only looked at IPv4. I've changed the numbers to reflect the new counts. Tl;dr: we're still seeing growth but significantly less than before.</p>
</blockquote>

<p>More companies are going remote due to COVID-19 and as a result there's been a lot of speculation around how this impacts the exposure of companies and the Internet as a whole (in terms of publicly-accessible services). I was actually already working on creating trends for various services due to a presentation I gave late last year so let me share with you some updated charts on how the Internet has evolved over the past few years (up to March 29, 2020).</p>

<h4 id="methodology">Methodology</h4>

<p>Just quickly I'll mention a bit about how the data itself is generated:</p>

<ol>
<li>Shodan infrastructure is globally distributed to prevent being geographically biased  </li>
<li>Crawlers run 24/7 and don't do sweeps of IP ranges the same way a network scanner would  </li>
<li>Crawlers attempt full protocol-specific handshakes to validate that a port is responding. Depending on the protocol Shodan also performs additional steps to validate the response. For example, in the case of RDP the crawlers grab a screenshot, perform OCR on that screenshot and do a variety of basic security checks.</li>
</ol>

<h5 id="timeframe">Timeframe</h5>

<p>Shodan keeps a full history of every IP on the Internet that it has ever seen. We store that archive in a variety of formats and for this purpose I reprocessed our data going back to the beginning of 2017. You can also access that historical data via the <a href="https://help.shodan.io/developer-fundamentals/looking-up-ip-info">API, CLI</a>, or the new <a href="https://beta.shodan.io">beta website</a>.</p>

<h5 id="aggregation">Aggregation</h5>

<p>I binned the results by unique IPs per month for each port/tag. This means that the data is not based on point-in-time scans but is rather an aggregate view of the IPs that were active during a month.</p>
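
<p>The binning described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline that produced the charts: the sample records are made up, and the field names (<strong>timestamp</strong>, <strong>ip_str</strong>, <strong>port</strong>) are assumed to follow the usual Shodan banner layout.</p>

<pre><code>from collections import defaultdict


def unique_ips_per_month(banners):
    """Count distinct IPs per (month, port) bucket."""
    seen = defaultdict(set)
    for banner in banners:
        # "2020-03-29T10:00:00" becomes "2020-03"
        month = banner["timestamp"][:7]
        seen[(month, banner["port"])].add(banner["ip_str"])
    return {bucket: len(ips) for bucket, ips in seen.items()}


# Hypothetical records: the same IP seen three times, plus one other IP
sample = [
    {"timestamp": "2020-02-11T08:15:00", "ip_str": "198.51.100.7", "port": 3389},
    {"timestamp": "2020-03-02T10:00:00", "ip_str": "198.51.100.7", "port": 3389},
    {"timestamp": "2020-03-20T23:41:00", "ip_str": "198.51.100.7", "port": 3389},
    {"timestamp": "2020-03-21T01:05:00", "ip_str": "203.0.113.9", "port": 3389},
]

counts = unique_ips_per_month(sample)
print(counts)
</code></pre>

<p>An IP that shows up several times within the same month is only counted once for that month, which is what separates this view from a count of raw scan results.</p>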

<h4 id="remotedesktop">Remote Desktop</h4>

<p>The Remote Desktop Protocol (RDP) is a common way for Windows users to remotely manage their workstation or server. However, it has a history of security issues and generally shouldn't be publicly accessible without any other protections (e.g. firewall whitelist, 2FA).</p>

<p><img src="https://blog.shodan.io/content/images/2020/04/Shodan---Remote-Desktop-Port.png" alt="Trends in Internet Exposure"></p>

<p>The number of devices exposing RDP to the Internet has grown over the past month which makes sense given how many organizations are moving to remote work.</p>

<p>It's surprising that the number of RDP instances actually went up after the initial Microsoft bulletin on BlueKeep in May 2019. It then dropped sharply in August once a series of issues (DejaBlue) was revealed that impacted newer versions of RDP.</p>

<p>A common tactic we've seen in the past by IT departments is to put an insecure service on a non-standard port (aka <a href="https://blog.shodan.io/hiding-in-plain-sight/">security by obscurity</a>). To that point, this is what the exposure for RDP looks like on an alternate port (3388) that we've seen organizations use:</p>

<p><img src="https://blog.shodan.io/content/images/2020/04/Shodan---Remote-Desktop-Port--3388-.png" alt="Trends in Internet Exposure"></p>

<p>It follows a growth pattern very similar to that of the standard port (3389). The last thing I wanted to point out is that 8% of the results remain vulnerable to BlueKeep (CVE-2019-0708).</p>

<h4 id="vpns">VPNs</h4>

<p><img src="https://blog.shodan.io/content/images/2020/04/Shodan---VPN-Exposure.png" alt="Trends in Internet Exposure"></p>

<p>The above chart encompasses a few different VPN protocols and ports (IKE, PPTP etc.). VPNs are a secure way to allow remote workers access to your network and it's not surprising to see that number grow as well over the past month.</p>

<h4 id="industrialcontrolsystems">Industrial Control Systems</h4>

<p><img src="https://blog.shodan.io/content/images/2020/04/Shodan---Industrial-Control-Systems.png" alt="Trends in Internet Exposure"></p>

<p>We've observed significant growth in other protocols (HTTPS) but one of the important areas where we've seen a worrying increase in exposure is for industrial control systems (ICS). The growth is not as large as for other protocols but these are ICS protocols that don't have any authentication or security measures. We had actually seen a stagnation in the ICS exposure up until now. And there have been significant advancements in OT security so there are plenty of secure options to choose from.</p>

<p>We're also keeping our <a href="https://exposure.shodan.io">country-wide exposure dashboards</a> up-to-date if you'd like to see breakdowns by country.</p>

<h4 id="conclusion">Conclusion</h4>

<p>I hope the above data provides a more data-driven view of how the exposure of those ports has changed the past few years. There aren't any earth-shattering surprises in the data but it's good to validate what many already assumed. If you're an organization that is concerned with your Internet exposure and wants to keep track of what you have connected to the Internet then please check out our <a href="https://monitor.shodan.io">Shodan Monitor service</a>.</p>]]></content:encoded></item><item><title><![CDATA[The HDFS Juggernaut]]></title><description><![CDATA[<p>There's been much focus on MongoDB, Elastic and Redis in terms of data exposure on the Internet due to their general popularity in the developer community. However, in terms of data volume it turns out that HDFS is the real juggernaut. To give you a better idea here's a quick</p>]]></description><link>https://blog.shodan.io/the-hdfs-juggernaut/</link><guid isPermaLink="false">c469ddda-3cd3-48db-b4dc-a2d771993b61</guid><category><![CDATA[NoSQL]]></category><category><![CDATA[research]]></category><category><![CDATA[Python]]></category><category><![CDATA[HDFS]]></category><category><![CDATA[CLI]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Wed, 31 May 2017 17:32:11 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2017/05/hdfs-map-1600.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2017/05/hdfs-map-1600.png" alt="The HDFS Juggernaut"><p>There's been much focus on MongoDB, Elastic and Redis in terms of data exposure on the Internet due to their general popularity in the developer community. However, in terms of data volume it turns out that HDFS is the real juggernaut. To give you a better idea here's a quick comparison between MongoDB and HDFS:</p>

<table>  
<thead>  
<tr>  
<th></th>  
<th>MongoDB</th>  
<th>HDFS</th>  
</tr>  
</thead>  
<tbody>  
<tr>  
<td>Number of Servers</td>  
<td>47,820</td>  
<td>4,487</td>  
</tr>  
<tr>  
<td>Data Exposed</td>  
<td>25 TB</td>  
<td>5,120 TB</td>  
</tr>  
</tbody>  
</table>
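
<p>To put the table in perspective, here is a small back-of-the-envelope calculation using only the figures above. The per-server averages are my own derived numbers and not something measured directly; the conversion assumes 1 TB = 1,024 GB.</p>

<pre><code># Figures taken from the comparison table
servers = {"MongoDB": 47820, "HDFS": 4487}
exposed_tb = {"MongoDB": 25, "HDFS": 5120}

# How many times more data HDFS exposes in total
ratio = exposed_tb["HDFS"] / exposed_tb["MongoDB"]

# Average exposure per server, in TB
per_server_tb = {name: exposed_tb[name] / servers[name] for name in servers}

print("HDFS exposes {:.1f}x the data of MongoDB".format(ratio))
for name, tb in per_server_tb.items():
    # Assumption: 1 TB = 1,024 GB
    print("{}: {:.2f} GB per server on average".format(name, tb * 1024))
</code></pre>

<p>The averages hide a lot of variation between individual servers, so treat them as a rough indication of scale only.</p>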

<p>Even though far more MongoDB databases are connected to the Internet without authentication, the data they expose is dwarfed by that of the HDFS clusters (25 TB vs 5 PB). Where are all these instances located?</p>

<script type="text/javascript" src="https://asciinema.org/a/6dzqir2jbssqftvcxwgh63dwp.js" id="asciicast-6dzqir2jbssqftvcxwgh63dwp" async></script>

<p>Most of the HDFS NameNodes are located in the US (1,900) and China (1,426). And nearly all of the HDFS instances are hosted on the cloud with Amazon leading the charge (1,059) followed by Alibaba (507).</p>

<p><img src="https://blog.shodan.io/content/images/2017/05/hdfs-map-600.png" alt="The HDFS Juggernaut"></p>

<p>The ransomware attacks on databases that were <a href="http://www.csoonline.com/article/3154190/security/exposed-mongodb-installs-being-erased-held-for-ransom.html">widely</a> <a href="https://www.fidelissecurity.com/threatgeek/2017/01/revenge-devops-gangster-open-hadoop-installs-wiped-worldwide">publicized</a> earlier in the year are still happening. And they're impacting both MongoDB and HDFS deployments. For HDFS, Shodan has discovered roughly <a href="https://www.shodan.io/search?query=NODATA4U_SECUREYOURSHIT">207 clusters</a> that have a message warning of the public exposure. And a quick glance at search results in Shodan reveals that most of the public MongoDB instances <a href="https://www.shodan.io/search?query=product%3Amongodb">seem to be compromised</a>. I've <a href="https://blog.shodan.io/its-the-data-stupid/">previously written</a> on the reason behind these exposures but note that both products nowadays have extensive documentation on <a href="https://docs.mongodb.com/manual/security/">secure deployment</a>.</p>

<h6 id="technicaldetails">Technical Details</h6>

<p>If you'd like to replicate the above findings or perform your own investigations into data exposure, this is how I measured the above.</p>

<ol>
<li><p>Download data using the <a href="https://cli.shodan.io">Shodan command-line interface</a>:</p>

<pre><code>shodan download --limit -1 hdfs-servers product:namenode
</code></pre></li>
<li><p>Write a Python script to measure the amount of exposed data (<strong>hdfs-exposure.py</strong>):</p>

<pre><code>from shodan.helpers import iterate_files, humanize_bytes
from sys import argv, exit


if len(argv) &lt;= 1:
    print('Usage: {} &lt;file1.json.gz&gt; ...'.format(argv[0]))
    exit(1)


datasize = 0
clusters = {}


# Loop over all the banners in the provided files
for banner in iterate_files(argv[1:]):
    try:
        # Grab the HDFS information that Shodan gathers
        info = banner['opts']['hdfs-namenode']
        cid = info['ClusterId']
        # Skip clusters we've already counted
        if cid in clusters:
            continue
        datasize += info['Used']
        clusters[cid] = True
    except (KeyError, TypeError):
        # Banner doesn't contain the HDFS NameNode details; skip it
        pass


print(humanize_bytes(datasize))
</code></pre></li>
<li><p>Run the Python script to get the amount of data exposed:</p>

<pre><code>$ python hdfs-exposure.py hdfs-servers.json.gz
5.0 PB
</code></pre></li>
</ol>]]></content:encoded></item><item><title><![CDATA[Understanding Security by Country: SSL]]></title><description><![CDATA[<p>With Shodan it's easy to get an overview of the security for a country. Real-world borders don't necessarily translate to the Internet but it can still reveal useful information as shown by <a href="https://books.google.com/books?id=T9IqCgAAQBAJ&amp;pg=PA259&amp;lpg=PA259#v=onepage&amp;q&amp;f=false">OECD</a>. I will show how I use Shodan to get a big picture view of a country; in</p>]]></description><link>https://blog.shodan.io/understanding-security-by-country-ssl/</link><guid isPermaLink="false">9aff5dfd-1696-4732-886d-5a610a9c3c5a</guid><category><![CDATA[research]]></category><category><![CDATA[SSL]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Wed, 03 Aug 2016 21:33:53 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2016/08/Firefox_Screenshot_2016-08-03T21-32-54-260Z.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2016/08/Firefox_Screenshot_2016-08-03T21-32-54-260Z.png" alt="Understanding Security by Country: SSL"><p>With Shodan it's easy to get an overview of the security for a country. Real-world borders don't necessarily translate to the Internet but it can still reveal useful information as shown by <a href="https://books.google.com/books?id=T9IqCgAAQBAJ&amp;pg=PA259&amp;lpg=PA259#v=onepage&amp;q&amp;f=false">OECD</a>. I will show how I use Shodan to get a big picture view of a country; in this case I'm looking at the USA.</p>

<p>First, let's have a look at how SSL is deployed in the USA. I will start off by getting a breakdown of the SSL versions that are supported by web servers:</p>

<pre><code>shodan stats --facets ssl.version country:US has_ssl:true HTTP
</code></pre>

<p>To do this I'm faceting on the <strong>ssl.version</strong> property which contains a list of SSL versions that the web server supports. This is possible because Shodan crawlers explicitly test for SSLv2 through TLSv1.2.</p>
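
<p>If you've never used facets before, the idea can be sketched locally in Python. This is not how Shodan computes facets on the server side; it's a simplified illustration on made-up records where <strong>ssl.version</strong> is assumed to be a flat list of supported versions.</p>

<pre><code>from collections import Counter


def facet_counts(records, prop, top=10):
    """Return the most common values of a property across records."""
    counts = Counter()
    for record in records:
        value = record.get(prop)
        # A list-valued property adds one count for every entry
        values = value if isinstance(value, list) else [value]
        counts.update(v for v in values if v is not None)
    return counts.most_common(top)


# Hypothetical records for three web servers
sample = [
    {"ip_str": "198.51.100.1", "ssl.version": ["tlsv1", "tlsv1.1", "tlsv1.2"]},
    {"ip_str": "198.51.100.2", "ssl.version": ["sslv2", "sslv3", "tlsv1"]},
    {"ip_str": "198.51.100.3", "ssl.version": ["tlsv1.2"]},
]

result = facet_counts(sample, "ssl.version")
for version, count in result:
    print(version, count)
</code></pre>

<p>Because a server can support several versions at once, it is counted in every bucket it belongs to and the totals can add up to more than the number of servers.</p>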

<p><img src="https://blog.shodan.io/content/images/2016/08/ssl-versions-usa.png" alt="Understanding Security by Country: SSL"></p>

<p>Unsurprisingly, the <a href="https://www.shodan.io/report/lIWBfrtT">majority of the HTTPS servers</a> are hosted by Akamai and Amazon. However, there's still a sizable chunk of devices (600,000+) that support SSLv2 so let's look at those briefly:</p>

<pre><code>shodan stats --facets org country:US ssl.version:sslv2 HTTP
</code></pre>

<p>Here I'm faceting on the <strong>org</strong> (organization) property and filtering for web servers that support SSLv2. This doesn't mean that they only accept SSLv2 connections but it is one of the versions the service supports.</p>

<p><img src="https://blog.shodan.io/content/images/2016/08/Firefox_Screenshot_2016-08-03T05-55-53-808Z.png" alt="Understanding Security by Country: SSL"></p>

<p>Around <a href="https://www.shodan.io/report/S4iafkde">25% of the services</a> that support SSLv2 are operating on CenturyLink's network. Just <a href="https://www.shodan.io/search?query=ssl.version%3Asslv2+country%3Aus+http">looking at the results</a> it seems like some of CenturyLink's modems are the reason for their #1 spot on the list. Their numbers are significantly higher than the next provider but I'm hoping these numbers will decline as CenturyLink phases out their older equipment.</p>

<p>The Shodan crawlers also check for various SSL vulnerabilities such as Heartbleed and FREAK so let's see how the US fares for those. For Heartbleed there are at least <a href="https://www.shodan.io/report/SlUlgL38">~30,000 devices in the US</a> still vulnerable to it.</p>

<p><img src="https://blog.shodan.io/content/images/2016/08/Firefox_Screenshot_2016-08-03T20-21-04-193Z.png" alt="Understanding Security by Country: SSL"></p>

<p>Interestingly, Verizon Wireless is the network with the most services vulnerable to Heartbleed. The runner-up, Amazon, is less surprising since it's not unusual for people to deploy old images that haven't yet been patched (<a href="https://blog.shodan.io/its-still-the-data-stupid/">or lack protection</a>). There are 2 types of devices operated by Verizon Wireless that are affected:</p>

<ol>
<li><p><a href="https://www.shodan.io/search?query=vuln%3Acve-2014-0160+country%3Aus+http+org%3A%22Verizon+Wireless%22+admin">Wireless routers</a> that run on the alternate HTTPS port 8443 and are made by CradlePoint Technology.</p></li>
<li><p><a href="https://www.shodan.io/search?query=vuln%3Acve-2014-0160+country%3Aus+http+org%3A%22Verizon+Wireless%22+WatchfireSessionID">Digital billboards</a> made by Watchfire Signs that run a web server on port 9443.</p></li>
</ol>

<p>I have not heard of these products before but this explains why Verizon Wireless has the most devices affected by Heartbleed - I wouldn't have expected many regular web servers to operate on their network. The same analysis can be performed by looking at services that support export ciphers (CVE-2015-0204) which I will leave as an exercise.</p>

<p>Finally, let's look at the distribution of SSL certificates. It usually isn't a good sign if the same SSL certificate is deployed across a large number of devices. To see the usage of duplicate SSL certificates we can facet on the <strong>ssl.cert.fingerprint</strong> property:</p>

<pre><code>shodan stats --facets ssl.cert.fingerprint country:us has_ssl:true http
</code></pre>

<p>The results of the command will give us the 10 most common SSL certificate fingerprints:</p>

<p><img src="https://blog.shodan.io/content/images/2016/08/ssl-us-fingerprints.png" alt="Understanding Security by Country: SSL"></p>

<p>If you want to get more than 10 you can also provide a number to the facet. For example, this is how to get the top 100 SSL fingerprints:</p>

<pre><code>shodan stats --facets ssl.cert.fingerprint:100 country:us has_ssl:true http
</code></pre>

<p>The most common SSL certificate is for what looks like Google's CDN on IPv6. However, the 2nd most often seen SSL certificate is for <a href="https://www.shodan.io/search?query=ssl.cert.fingerprint%3Ae1369c0316542950dbf9bd0c96a9feae43ee41d8">Ecommerce Corporation</a> which is a familiar company if you've read some of my <a href="https://blog.shodan.io/tracking-hacked-websites-2/">earlier articles</a> on defaced websites. While we're looking at duplicate fingerprints, what about SSH fingerprints?</p>

<pre><code>shodan stats --facets ssh.fingerprint country:us
</code></pre>

<p><img src="https://blog.shodan.io/content/images/2016/08/ssl-us-fingerprints-1.png" alt="Understanding Security by Country: SSL"></p>

<p>The most common duplicate SSH fingerprint in the US belongs to <a href="https://www.shodan.io/search?query=62%3A5e%3Ab9%3Afd%3A3a%3A70%3Aeb%3A37%3A99%3Ae9%3A12%3Ae3%3Ad9%3A3f%3A4e%3A6c">GoDaddy</a>. Looking at those results will require another blog post but the above is how I usually get started when trying to identify systemic problems.</p>
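
<p>If you've downloaded the banners and want to do the duplicate check offline, a minimal sketch could look like the following. The records and the flat <strong>fingerprint</strong> field are hypothetical; real banners nest the fingerprint inside the SSL or SSH information.</p>

<pre><code>from collections import defaultdict


def shared_fingerprints(banners, top=10):
    """Rank fingerprints by the number of distinct IPs presenting them."""
    ips_by_fingerprint = defaultdict(set)
    for banner in banners:
        ips_by_fingerprint[banner["fingerprint"]].add(banner["ip_str"])
    ranked = sorted(
        ips_by_fingerprint.items(),
        key=lambda item: len(item[1]),
        reverse=True,
    )
    return [(fingerprint, len(ips)) for fingerprint, ips in ranked[:top]]


# Hypothetical banners: one fingerprint is shared by three different IPs
sample = [
    {"ip_str": "198.51.100.1", "fingerprint": "aa:aa"},
    {"ip_str": "198.51.100.2", "fingerprint": "aa:aa"},
    {"ip_str": "198.51.100.3", "fingerprint": "aa:aa"},
    {"ip_str": "203.0.113.5", "fingerprint": "bb:bb"},
    {"ip_str": "203.0.113.5", "fingerprint": "bb:bb"},
]

ranking = shared_fingerprints(sample, top=2)
print(ranking)
</code></pre>

<p>Counting distinct IPs instead of banners keeps a single host that was crawled several times from looking like a widely shared certificate or key.</p>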

<p>Here is a short video that shows how I've done a similar analysis for Germany:</p>

<script type="text/javascript" src="https://asciinema.org/a/48143.js" id="asciicast-48143" async></script>

<p>SSL is only one of many aspects that should be looked at and I will be discussing some other angles in future posts. I hope I've given you a better idea of how I use Shodan to break down SSL issues on a national level.</p>]]></content:encoded></item><item><title><![CDATA[Tracking Hacked Websites]]></title><description><![CDATA[<p>I wanted to revisit the results of a few posts last year on how to <a href="https://blog.shodan.io/tracking-hacked-websites/">track website defacements</a> and <a href="https://blog.shodan.io/top-website-defacers-june-2015/">see how things have changed</a> since then. In case you're wondering how this data is collected, I've created a video that shows in real-time the commands I used to generate the</p>]]></description><link>https://blog.shodan.io/tracking-hacked-websites-2/</link><guid isPermaLink="false">bcf05327-f9e6-40a5-add3-297280fe74b8</guid><category><![CDATA[research]]></category><category><![CDATA[defacements]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Mon, 18 Jan 2016 10:25:08 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2016/01/Blog-Hacker-Background.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2016/01/Blog-Hacker-Background.jpg" alt="Tracking Hacked Websites"><p>I wanted to revisit the results of a few posts last year on how to <a href="https://blog.shodan.io/tracking-hacked-websites/">track website defacements</a> and <a href="https://blog.shodan.io/top-website-defacers-june-2015/">see how things have changed</a> since then. In case you're wondering how this data is collected, I've created a video that shows in real-time the commands I used to generate the data:</p>

<script type="text/javascript" src="https://asciinema.org/a/21387.js" id="asciicast-21387" async></script>

<p>Here's the Top 10 Website Defacers as of January 2016:</p>

<ol>
<li><strong>GHoST61</strong>: 51  </li>
<li><strong>Kadimoun</strong>: 39  </li>
<li><strong>AnonCoders</strong>: 35  </li>
<li><strong>r00t-x</strong>: 31  </li>
<li><strong>Shor7cut</strong>: 28  </li>
<li><strong>Owner Dzz</strong>: 27  </li>
<li><strong>Toxic Phantom FROM BANGLADESH BLACK HAT HACKERS</strong>: 27  </li>
<li><strong>TechnicaL</strong>: 21  </li>
<li><strong>virus3033</strong>: 21  </li>
<li><strong>Yuba</strong>: 17</li>
</ol>

<p><strong>GHoST61</strong> also topped the ranking last year and remains at the top at the moment. Other familiar names are: <strong>r00t-x</strong> (moved down 1 rank), <strong>TechnicaL</strong> (moved down 2 ranks) and <strong>virus3033</strong> (moved down 2 ranks). This means that 4 out of the previous top 10 are still around, while the other 6 weren't listed before.</p>

<p><img src="https://blog.shodan.io/content/images/2016/01/Firefox_Screenshot_2016-01-18T10-01-00-904Z.png" alt="Tracking Hacked Websites"></p>

<p>In terms of organizations containing defaced websites, the <a href="https://www.shodan.io/report/nIBwjjHw">Ecommerce Corporation remains the most affected by far</a>. At this point it seems a given that Ecommerce will have the worst ranking so let's look at the other organizations on the list. The full ranking is:</p>

<ol>
<li>Ecommerce Corporation  </li>
<li>Unified Layer (+1)  </li>
<li>GoDaddy (-1)  </li>
<li>CyrusOne  </li>
<li>iServer Hosting  </li>
<li>SoftLayer Technologies  </li>
<li>Media Temple (-1)  </li>
<li>Peer1 Dedicated Hosting (-4)  </li>
<li>New Dream Network  </li>
<li>Digital Ocean</li>
</ol>

<p>The top 3 have remained the same, though GoDaddy and Unified Layer switched spots. New entries on the list are: CyrusOne, iServer Hosting, SoftLayer, New Dream Network and Digital Ocean. At this point it's clear that there are a few hosting providers with on-going problems and it doesn't look like they've made any impactful changes to reduce the number of compromised websites.</p>

<p>In terms of products, the vast majority of affected websites were running Apache:</p>

<p><img src="https://blog.shodan.io/content/images/2016/01/Firefox_Screenshot_2016-01-18T09-55-02-979Z.png" alt="Tracking Hacked Websites"></p>

<p>I'm planning on periodically revisiting this subject to see how things change over time, especially with regards to the newly-listed organizations!</p>

<p>PS: Credit to <a href="https://twitter.com/Viss">@Viss</a> for the dramatic hacker background image at the top.</p>]]></content:encoded></item><item><title><![CDATA[Memory As A Service]]></title><description><![CDATA[<p>I've written and presented on the topic of insecure databases for nearly 2 years now. The example I use the most to demonstrate the problem is MongoDB because it's popular and had <a href="https://blog.shodan.io/its-still-the-data-stupid/">terrible defaults</a>. Invariably though the focus of the conversation ends up on MongoDB and not that there are</p>]]></description><link>https://blog.shodan.io/memory-as-a-service/</link><guid isPermaLink="false">7d5429f8-0bd1-4dfc-a025-4e0e32a69d8f</guid><category><![CDATA[research]]></category><category><![CDATA[mongo]]></category><category><![CDATA[Memcached]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Thu, 17 Dec 2015 07:13:47 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/12/artificial-engine_00229391.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/12/artificial-engine_00229391.jpg" alt="Memory As A Service"><p>I've written and presented on the topic of insecure databases for nearly 2 years now. The example I use the most to demonstrate the problem is MongoDB because it's popular and had <a href="https://blog.shodan.io/its-still-the-data-stupid/">terrible defaults</a>. Invariably though the focus of the conversation ends up on MongoDB and not that there are hundreds of thousands of databases on the Internet without any authentication.</p>

<p>So for today I decided to take a look at something else: <a href="http://memcached.org/">Memcached</a>. Their website explains it best:</p>

<blockquote>
  <p>Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.</p>
</blockquote>

<p>Do you operate a website? Does it get a lot of traffic? Then memcached is what you need to speed up response times by caching database lookups, web responses or anything else that takes more than a second to accomplish.</p>

<p>Shodan shows there are <a href="https://www.shodan.io/report/IsKj8RXU">more than 130,000 Memcached servers</a> running on the Internet. And they also return a lot of detailed information about their status:</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-17T06-22-15-983Z.png" alt="Memory As A Service"></p>

<p>Memcached provides its uptime, version, current number of connections, how much is being stored and much more. For now, I just took a look at the amount of data stored and how much memory is made available. Aggregating all the information from the publicly-available Memcached instances here are some stats:</p>

<ul>
<li><strong>8 TB</strong> of data stored</li>
<li><strong>49,153 PB</strong> of memory collectively available</li>
</ul>

<p>Since Memcached is a caching layer we wouldn't expect to see a lot of data stored in it on a permanent basis (records also usually have an expiration attached). And it doesn't offer advanced querying as a regular database would, which makes navigating the 8 TB of data more difficult than with MongoDB. That being said, there is still a lot of sensitive information that is temporarily stored on these instances. However, there is also a ridiculously giant amount of memory available on public Memcached servers. For people not familiar with petabytes, the total amount of memory advertised is <strong>49,153,000 TB</strong>.</p>
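
<p>For those curious how such totals can be computed, here is a rough sketch that parses the text returned by the Memcached <strong>stats</strong> command and sums two of its counters. I'm assuming here that <strong>bytes</strong> (data currently stored) and <strong>limit_maxbytes</strong> (memory the server is allowed to use) are the relevant fields; the sample responses are made up.</p>

<pre><code>def parse_stats(text):
    """Turn the output of a Memcached 'stats' command into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data lines look like: STAT name value
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def aggregate(stat_dumps):
    """Sum the stored bytes and the memory limit across servers."""
    stored = 0
    available = 0
    for dump in stat_dumps:
        stats = parse_stats(dump)
        stored += int(stats.get("bytes", 0))
        available += int(stats.get("limit_maxbytes", 0))
    return stored, available


# Hypothetical responses from two servers
samples = [
    "STAT version 1.4.25\nSTAT bytes 1048576\nSTAT limit_maxbytes 67108864\nEND",
    "STAT version 1.4.13\nSTAT bytes 2097152\nSTAT limit_maxbytes 134217728\nEND",
]

stored, available = aggregate(samples)
print(stored, available)
</code></pre>

<p>Keep in mind that the memory limit is whatever the server advertises, which isn't necessarily backed by that much physical memory.</p>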

<p>The organizations that are hosting the most instances are:</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-17T06-35-04-260Z.png" alt="Memory As A Service"></p>

<ol>
<li><strong>ColoCrossing</strong>  </li>
<li><strong>GoDaddy</strong>  </li>
<li><strong>Enzu</strong>  </li>
<li><strong>Aliyun</strong>  </li>
<li><strong>Alibaba Advertising</strong></li>
</ol>

<p>One of the reasons for all these publicly accessible instances is the same as with MongoDB: the official, default configuration of Memcached listens on all interfaces. Curiously, the Linux distributions I looked at that are offering Memcached packages provided secure defaults; i.e. only listen on <em>localhost</em>. This means that most likely the above organizations installed Memcached from source.</p>

<p>I hope this has provided some evidence that it's not just MongoDB facing insecure-by-default issues when it comes to data storage services. I could've performed the same analysis as above for <a href="https://www.shodan.io/search?query=product%3Aredis">Redis</a>, <a href="https://www.shodan.io/search?query=product%3Acassandra">Cassandra</a>, <a href="https://www.shodan.io/search?query=product%3Acouchdb">CouchDB</a> or <a href="https://www.shodan.io/search?query=port%3A8098+mochiweb">Riak</a>.</p>]]></content:encoded></item><item><title><![CDATA[It's Still the Data, Stupid!]]></title><description><![CDATA[<p>In light of the recent incident of <a href="https://krebsonsecurity.com/2015/12/13-million-mackeeper-users-exposed/">MacKeeper exposing 13 million accounts</a> through a public, unauthenticated MongoDB instance, I wanted to quickly revisit my <a href="https://blog.shodan.io/its-the-data-stupid/">earlier blog post</a> on the subject.</p>

<p>At the moment, there are at least <a href="https://www.shodan.io/report/nlrw9g59">35,000 publicly available, unauthenticated instances of MongoDB</a> running on the Internet. This</p>]]></description><link>https://blog.shodan.io/its-still-the-data-stupid/</link><guid isPermaLink="false">2143defc-cf08-4c17-8532-ec7fdf0b4012</guid><category><![CDATA[research]]></category><category><![CDATA[MongoBD]]></category><category><![CDATA[NoSQL]]></category><category><![CDATA[Cassandra]]></category><category><![CDATA[Riak]]></category><category><![CDATA[Redis]]></category><category><![CDATA[CouchDB]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Tue, 15 Dec 2015 07:30:59 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/12/Library-with-a-book-ladde-014.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/12/Library-with-a-book-ladde-014.jpg" alt="It's Still the Data, Stupid!"><p>In light of the recent incident of <a href="https://krebsonsecurity.com/2015/12/13-million-mackeeper-users-exposed/">MacKeeper exposing 13 million accounts</a> through a public, unauthenticated MongoDB instance, I wanted to quickly revisit my <a href="https://blog.shodan.io/its-the-data-stupid/">earlier blog post</a> on the subject.</p>

<p>At the moment, there are at least <a href="https://www.shodan.io/report/nlrw9g59">35,000 publicly available, unauthenticated instances of MongoDB</a> running on the Internet. This is an increase of more than 5,000 instances since the last article. They're hosted mostly on Amazon, Digital Ocean and Aliyun (cloud computing by Alibaba):</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-15T07-29-58-196Z.png" alt="It's Still the Data, Stupid!"></p>

<p>The most popular versions of MongoDB are:</p>

<ol>
<li><strong>3.0.7</strong>: 3,010  </li>
<li><strong>2.4.9</strong>: 2,624  </li>
<li><strong>2.4.14</strong>: 2,535  </li>
<li><strong>2.4.10</strong>: 1,879  </li>
<li><strong>3.0.6</strong>: 1,256</li>
</ol>

<p>By default, newer versions of MongoDB only listen on localhost. The fact that MongoDB 3.0 is well-represented means that a lot of people are changing the default configuration of MongoDB to something less secure and aren't enabling any firewall to protect their database. In the previous article, it looked like the misconfiguration problem might solve itself due to the new defaults that MongoDB started shipping with; that doesn't appear to be the case based on the new information. It could be that users are upgrading their instances but using their existing, insecure configuration files.</p>

<p>In terms of data volume, all of the exposed databases combined account for <strong>684.8 TB of data</strong>. And the most popular database names are:</p>

<ol>
<li><strong>local</strong>: 33,947  </li>
<li><strong>admin</strong>: 23,970  </li>
<li><strong>db</strong>: 8,638  </li>
<li><strong>test</strong>: 6,761  </li>
<li><strong>config</strong>: 859  </li>
<li><strong>test1</strong>: 612  </li>
<li><strong>mydb</strong>: 549  </li>
<li><strong>DrugSupervise</strong>: 382  </li>
<li><strong>Video</strong>: 376  </li>
<li><strong>mean-dev</strong>: 252</li>
</ol>

<p>The database names are mostly the same as before, with the exception of: DrugSupervise and mean-dev. Notably absent is <strong>hackedDB</strong> which was at #8 last time.</p>

<p>Finally, I can't stress enough that this problem is not unique to MongoDB: <a href="https://www.shodan.io/search?query=product%3Aredis">Redis</a>, <a href="https://www.shodan.io/search?query=product%3Acouchdb">CouchDB</a>, <a href="https://www.shodan.io/search?query=product%3Acassandra">Cassandra</a> and <a href="https://www.shodan.io/search?query=port%3A8098+mochiweb">Riak</a> are equally impacted by these sorts of misconfigurations.</p>]]></content:encoded></item><item><title><![CDATA[Tracking HTTP/2.0 Adoption]]></title><description><![CDATA[<p>HTTP/2.0 is the next version of the protocol powering websites and it promises <a href="https://http2.github.io/faq/#what-are-the-key-differences-to-http1x">many improvements</a> over HTTP/1.x. There are a few different ways that a server can advertise support for HTTP/2.0 but the most common one is during the SSL handshake. I've added support</p>]]></description><link>https://blog.shodan.io/tracking-http2-0-adoption/</link><guid isPermaLink="false">5ec4b674-87e6-4258-b0e3-7f182b93ecb1</guid><category><![CDATA[research]]></category><category><![CDATA[http/2]]></category><category><![CDATA[spdy]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Sun, 13 Dec 2015 05:40:33 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-13T05-40-01-999Z.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-13T05-40-01-999Z.png" alt="Tracking HTTP/2.0 Adoption"><p>HTTP/2.0 is the next version of the protocol powering websites and it promises <a href="https://http2.github.io/faq/#what-are-the-key-differences-to-http1x">many improvements</a> over HTTP/1.x. There are a few different ways that a server can advertise support for HTTP/2.0 but the most common one is during the SSL handshake. 
I've added support in Shodan for tracking the negotiated HTTP versions and the data can be searched using the <a href="https://www.shodan.io/search?query=ssl.alpn%3Ah2"><strong>ssl.alpn</strong></a> filter.</p>
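
<p>To see what that handshake advertisement looks like from the client side, here is a minimal Python sketch using only the standard library: it offers a list of ALPN protocol IDs and reports which one the server selects. The helper names, the label table and the default offer list are illustrative assumptions; this is not how Shodan's crawler is implemented.</p>

<pre><code>import socket
import ssl

# Illustrative labels for a few ALPN protocol IDs (assumed, not exhaustive).
ALPN_LABELS = {"h2": "HTTP/2", "http/1.1": "HTTP/1.1", "spdy/3.1": "SPDY/3.1"}


def negotiated_protocol(host, port=443, offered=("h2", "http/1.1"), timeout=5.0):
    """Connect over TLS and return the ALPN protocol the server selected.

    Returns None when the server does not select any of the offered IDs.
    """
    context = ssl.create_default_context()
    context.set_alpn_protocols(list(offered))
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def label(protocol):
    """Map an ALPN ID to a readable name; unknown IDs pass through unchanged."""
    if protocol is None:
        return "no ALPN"
    return ALPN_LABELS.get(protocol, protocol)

# Example (needs network access): print(label(negotiated_protocol("www.shodan.io")))
</code></pre>

<p>A server that selects <code>h2</code> here is the kind of host the <strong>ssl.alpn</strong> filter is meant to surface.</p>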

<p>Without further ado, here are the <a href="https://www.shodan.io/report/mNs9fa3I">most popular HTTP versions</a> on the Internet for HTTPS servers (port 443):</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-13T02-22-26-532Z.png" alt="Tracking HTTP/2.0 Adoption"></p>

<p>The above chart was generated using Shodan Reports, which now automatically show a summary of the negotiated HTTP versions when possible. Here are the top 10 versions in text:</p>

<ol>
<li><strong>HTTP/1.1</strong>  </li>
<li>SPDY/3.1  </li>
<li>SPDY/2  </li>
<li>SPDY/3  </li>
<li><strong>HTTP/2</strong>  </li>
<li>HTTP/2 Draft 14  </li>
<li><strong>HTTP/2 (Cleartext)</strong>  </li>
<li>HTTP/2 Draft 17  </li>
<li>HTTP/1.0  </li>
<li>SPDY/0.9.4</li>
</ol>

<p>Unsurprisingly, HTTP/1.1 remains the most popular version and I'd expect it to stay that way for a while. <strong>SPDY</strong> also remains a fairly popular choice, mostly due to CloudFlare's support:</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-13T03-02-34-222Z.png" alt="Tracking HTTP/2.0 Adoption"></p>

<p>Taking a <a href="https://www.shodan.io/report/iLzUWHyz">deeper look at the HTTP/2 results</a> we can see that the following organizations are leading the charge in <strong>HTTP/2 support</strong>:</p>

<p><img src="https://blog.shodan.io/content/images/2015/12/Firefox_Screenshot_2015-12-13T02-29-08-132Z.png" alt="Tracking HTTP/2.0 Adoption"></p>

<p><a href="https://www.singlehop.com">Singlehop</a> has the most servers supporting HTTP/2.0 at the moment! The 2nd organization, "GetClouder EOOD", is actually <a href="https://www.siteground.com/">SiteGround</a>, followed by Hostwinds and CloudFlare.</p>

<p>Some general oddities in the data:</p>

<ol>
<li>There are a few servers that support HTTP/2 and <strong>SSLv2</strong>: <a href="https://www.shodan.io/search?query=ssl.alpn%3Ah2+ssl.version%3Asslv2">https://www.shodan.io/search?query=ssl.alpn%3Ah2+ssl.version%3Asslv2</a>  </li>
<li>Google is the only provider of TLS with IMAP, POP3 and SMTP that also supports HTTP/2: <a href="https://www.shodan.io/search?query=ssl.alpn%3Ah2+port%3A%22993%22">https://www.shodan.io/search?query=ssl.alpn%3Ah2+port%3A%22993%22</a></li>
</ol>

<p>You can get a full breakdown of all the organizations and versions using the <a href="https://cli.shodan.io">Shodan command-line interface</a>. Here's a video showing a few commands that can be run automatically every month to keep track of changes in HTTP/2 adoption:</p>

<script type="text/javascript" src="https://asciinema.org/a/31715.js" id="asciicast-31715" async></script>

<p>TL;DR: ~115,000 web servers on the Internet support HTTP/2 as of December 2015.</p>]]></content:encoded></item><item><title><![CDATA[All About Dell]]></title><description><![CDATA[<p>Dell has been hit with 2 security issues the past few days. I wanted to quickly summarize my findings from an external network perspective:</p>

<h6 id="1laptopscomepreinstalledwitharootcertificate">1. Laptops come pre-installed with a root certificate</h6>

<p><a href="https://blog.hboeck.de/archives/876-Superfish-2.0-Dangerous-Certificate-on-Dell-Laptops-breaks-encrypted-HTTPS-Connections.html">https://blog.hboeck.de/archives/876-Superfish-2.0-Dangerous-Certificate-on-Dell-Laptops-breaks-encrypted-HTTPS-Connections.html</a></p>

<p>The root certificate is issued by <strong>eDellRoot</strong>. Initially, the story</p>]]></description><link>https://blog.shodan.io/all-about-dell/</link><guid isPermaLink="false">930580ab-11dd-414d-92c7-006e569db883</guid><category><![CDATA[SSL]]></category><category><![CDATA[research]]></category><category><![CDATA[Dell]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Thu, 26 Nov 2015 05:12:18 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/11/screenshot-maps-shodan-io-2015-11-25-23-01-05.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/11/screenshot-maps-shodan-io-2015-11-25-23-01-05.png" alt="All About Dell"><p>Dell has been hit with 2 security issues the past few days. I wanted to quickly summarize my findings from an external network perspective:</p>

<h6 id="1laptopscomepreinstalledwitharootcertificate">1. Laptops come pre-installed with a root certificate</h6>

<p><a href="https://blog.hboeck.de/archives/876-Superfish-2.0-Dangerous-Certificate-on-Dell-Laptops-breaks-encrypted-HTTPS-Connections.html">https://blog.hboeck.de/archives/876-Superfish-2.0-Dangerous-Certificate-on-Dell-Laptops-breaks-encrypted-HTTPS-Connections.html</a></p>

<p>The root certificate is issued by <strong>eDellRoot</strong>. Initially, the story mentioned just one certificate but it quickly became clear that there was a 2nd certificate that can be found on live web servers using <a href="https://www.shodan.io/search?query=ssl%3Aedellroot">Shodan</a> with the search query:</p>

<pre><code>ssl:eDellRoot
</code></pre>

<p>At the moment, the search returns 28 results that are <a href="https://www.shodan.io/report/JpMAZMji">located mostly in the US</a> with a few in Switzerland, Canada, Singapore and Malaysia:</p>

<p><img src="https://pbs.twimg.com/media/CUi6RGCU8AEirOs.png:large" alt="All About Dell"></p>

<p>Even though there are very few results, at least one of them has turned out to be a control system. This isn't a big surprise since there are <a href="http://www.slideshare.net/BobRadvanovsky/project-shine-findings-report-dated-1oct2014">millions of control systems connected to the Internet</a> but it's a good reminder that the Internet has much more than just web servers.</p>

<p>Dell has <a href="http://en.community.dell.com/dell-blogs/direct2dell/b/direct2dell/archive/2015/11/23/response-to-concerns-regarding-edellroot-certificate">issued a statement</a> explaining the existence of the root certificate and released a tool and instructions for removing it.</p>

<h6 id="2webserverrunsonport7779thatprovidesunauthenticatedaccesstothedellservicetag">2. Webserver runs on port 7779 that provides unauthenticated access to the Dell service tag</h6>

<p><a href="http://www.theregister.co.uk/2015/11/25/dell_backdoor_part_two/">http://www.theregister.co.uk/2015/11/25/dell_backdoor_part_two/</a></p>

<p>There are <a href="https://www.shodan.io/search?query=port%3A7779">~12,800 webservers</a> on the Internet running on port 7779. Out of those, roughly 2,300 are running software that looks like it's from a Dell computer:</p>

<p><img src="https://blog.shodan.io/content/images/2015/11/screenshot-www-shodan-io-2015-11-25-22-23-06.png" alt="All About Dell"></p>

<p>I wrote a quick script to grab the service tags from those IPs and was able to collect ~1,000 service tags. The other ~1,300 devices didn't respond in time or otherwise errored out when I tried to query them. Of course, much of the threat is the ability to execute JavaScript to gather the information from localhost, but I wanted to get a sense of how many are Internet-connected. I've also added port 7779 to Shodan so it will be possible to keep track of how the issue gets resolved over time.</p>]]></content:encoded></item><item><title><![CDATA[Duplicate SSL Serial Numbers]]></title><description><![CDATA[<p>I've made some improvements to the way SSL is indexed and added 2 new filters:</p>

<ol>
<li><strong>ssl</strong> <br>
Search all SSL-related information that Shodan collects. <br>
Example: <a href="https://www.shodan.io/search?query=ssl%3Agoogle">ssl:Google</a>  </li>
<li><strong>has_ssl</strong> <br>
Boolean filter to only show results/ banners that contain SSL information.</li>
</ol>

<p>There was also a bug in how the SSL serial numbers</p>]]></description><link>https://blog.shodan.io/ssl-serial-number-weirdness/</link><guid isPermaLink="false">3ea33a67-fc66-4652-8cbd-e59b8438d72b</guid><category><![CDATA[SSL]]></category><category><![CDATA[research]]></category><category><![CDATA[market research]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Sat, 10 Oct 2015 23:24:32 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/10/screenshot-maps-shodan-io-2015-10-10-18-23-58.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/10/screenshot-maps-shodan-io-2015-10-10-18-23-58.png" alt="Duplicate SSL Serial Numbers"><p>I've made some improvements to the way SSL is indexed and added 2 new filters:</p>

<ol>
<li><strong>ssl</strong> <br>
Search all SSL-related information that Shodan collects. <br>
Example: <a href="https://www.shodan.io/search?query=ssl%3Agoogle">ssl:Google</a>  </li>
<li><strong>has_ssl</strong> <br>
Boolean filter to only show results/ banners that contain SSL information.</li>
</ol>

<p>There was also a bug in how the SSL serial numbers were indexed so after that got patched I kept an eye on the results. To do so I used the <a href="https://cli.shodan.io">command-line interface</a> and faceted on the <strong>ssl.cert.serial</strong> property to get a list of the most popular SSL serial numbers:</p>

<p><a href="https://asciinema.org/a/27675" target="_blank"><img src="https://asciinema.org/a/27675.png" style="width:90%;" alt="Duplicate SSL Serial Numbers"></a></p>
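
<p>The same facet can be approximated offline. The sketch below assumes a results file saved with the CLI's <code>shodan download</code> command (gzip-compressed, one JSON banner per line) and that each banner stores the serial under <code>ssl.cert.serial</code>, mirroring the property name above. The function name and the file name in the usage comment are hypothetical.</p>

<pre><code>import gzip
import json
from collections import Counter


def top_serials(path, n=5):
    """Count certificate serial numbers in a gzip-compressed JSON-lines file."""
    counts = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            banner = json.loads(line)
            # Assumed layout: banner["ssl"]["cert"]["serial"]; banners
            # without SSL information are skipped.
            cert = (banner.get("ssl") or {}).get("cert") or {}
            serial = cert.get("serial")
            if serial is not None:
                counts[str(serial)] += 1
    return counts.most_common(n)

# Example: top_serials("ssl-data.json.gz") returns a list of (serial, count) pairs.
</code></pre>

<p>Serials are kept as strings because some of them are far larger than a 64-bit integer.</p>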

<p>The top 5 SSL serial numbers are:</p>

<ol>
<li><strong>15264109253415148488</strong>  </li>
<li><strong>17803741903183845083</strong>  </li>
<li><strong>0</strong>  </li>
<li><strong>40564819207326832829647457238321</strong>  </li>
<li><strong>295</strong></li>
</ol>

<p>I wasn't sure what to expect, so let's <a href="https://www.shodan.io/search?query=ssl.cert.serial%3A15264109253415148488">take a look</a> at what is using the most popular SSL serial number on the Internet:</p>

<p><img src="https://blog.shodan.io/content/images/2015/10/screenshot-www-shodan-io-2015-10-10-17-18-44.png" alt="Duplicate SSL Serial Numbers"></p>

<p>There are <a href="https://www.shodan.io/report/7a2xT0hs">more than a million devices</a> that use the serial number <strong>15264109253415148488</strong> and none of them return a banner. They're all self-signed certificates that are running a service on port 443 but otherwise aren't responding to HTTP requests. Hmmm, ok well what about the 2nd most popular serial number?</p>

<p><img src="https://blog.shodan.io/content/images/2015/10/screenshot-www-shodan-io-2015-10-10-17-27-54.png" alt="Duplicate SSL Serial Numbers"></p>

<p>Once again a huge number of devices are responding on port 443 and not providing any banners, but this time they're Motorola Mobility devices. In both instances the devices are located on AT&amp;T's network, and based on the netblock ownership the IPs are being used for U-verse. I started searching for more information about these certificates and eventually found an answer:</p>

<p><img src="https://blog.shodan.io/content/images/2015/10/screenshot-discussions-apple-com-2015-10-10-17-24-14.png" alt="Duplicate SSL Serial Numbers"></p>

<p>Apparently, AT&amp;T is running a service on port 443 to manage their wireless set top boxes. I don't have any way to verify those claims but they seem plausible. If nothing else it's now very easy to see how many of AT&amp;T's users purchased their wireless Internet package (~2 million households).</p>]]></content:encoded></item><item><title><![CDATA[ISP Offers BitTorrent Caching]]></title><description><![CDATA[<p>I've recently started <a href="https://www.shodan.io/search?query=bittorrent+tracker+running">crawling for BitTorrent trackers</a> on a few ports (69, 80, 6969) and got the following picture from the results:</p>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-www-shodan-io-2015-09-08-16-13-14.png" alt=""></p>

<p>Why are there so many trackers located in India? I would've expected to see a lot more in the US or Europe. There are probably other ports I</p>]]></description><link>https://blog.shodan.io/isp-offers-bittorrent-caching/</link><guid isPermaLink="false">3902171d-e329-4ebb-90a7-d98fe59974f9</guid><category><![CDATA[research]]></category><category><![CDATA[bittorrent]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Tue, 08 Sep 2015 21:30:48 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/09/532.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/09/532.jpg" alt="ISP Offers BitTorrent Caching"><p>I've recently started <a href="https://www.shodan.io/search?query=bittorrent+tracker+running">crawling for BitTorrent trackers</a> on a few ports (69, 80, 6969) and got the following picture from the results:</p>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-www-shodan-io-2015-09-08-16-13-14.png" alt="ISP Offers BitTorrent Caching"></p>

<p>Why are there so many trackers located in India? I would've expected to see a lot more in the US or Europe. There are probably other ports I should look into (if you have any suggestions please leave a comment) but it was unexpected to find so many results in India. I posed the question to Twitter and <a href="https://twitter.com/jupenur">@jupenur</a> came to the rescue:</p>

<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p lang="en" dir="ltr"><a href="https://twitter.com/achillean">@achillean</a> Most of the IPs are from broadmax.in, an ISP that offers &quot;Transparent Torrent Caching locally, for Super High Speed On Torrents.&quot;</p>&mdash; Juho Nurminen (@jupenur) <a href="https://twitter.com/jupenur/status/639666402283134976">September 4, 2015</a></blockquote>  

<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-broadmax-in-2015-09-08-16-06-10.png" alt="ISP Offers BitTorrent Caching"></p>

<p><a href="http://broadmax.in/">Broadmax</a> is an ISP primarily serving the city of Mumbai and as seen above advertises their ability to cache torrents/ improve download speeds. I've never seen this sort of thing by an ISP in the United States but it's <a href="https://torrentfreak.com/isp-speeds-up-customers-bittorrent-downloads-090418/">actually been around for a while</a>. There's even an official <a href="https://en.wikipedia.org/wiki/Cache_Discovery_Protocol">cache discovery protocol</a> to help ISPs know which torrents are popular and should be cached. From a technical perspective, it makes a lot of sense actually for ISPs to locally cache the data. BitTorrent accounts for <a href="http://researchcenter.paloaltonetworks.com/app-usage-risk-report-visualization/">~3.35% of the worldwide Internet traffic</a> (2013 data), with certain countries being significantly higher such as the <a href="http://thenextweb.com/apps/2014/11/21/netflix-now-accounts-35-overall-us-internet-traffic/">US at ~25%</a>. I don't expect torrent caching to show up anytime soon in my area though. Either way, I'll be adding more ports to track whether more ISPs around the world are starting to use this sort of technology.</p>]]></content:encoded></item><item><title><![CDATA[Don't Be Clever]]></title><description><![CDATA[<p>I've started <a href="https://images.shodan.io">collecting screenshots</a> for a few services, most notably VNC, and something stuck out at me:</p>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-www-shodan-io-2015-09-04-22-18-44.png" alt=""></p>

<p>The top 5 ports where VNC is running with authentication disabled are:</p>

<ol>
<li>5900 (default port): 4,029  </li>
<li><strong>5901</strong>: 3,995  </li>
<li>84: 25  </li>
<li>83: 14  </li>
<li>13579: 7</li>
</ol>

<p>Out of ~8,000 results, 50% of</p>]]></description><link>https://blog.shodan.io/dont-be-clever/</link><guid isPermaLink="false">a13f79f6-4ad2-4bee-bfc4-032d28f86fcc</guid><category><![CDATA[ICS]]></category><category><![CDATA[research]]></category><category><![CDATA[vnc]]></category><category><![CDATA[screenshots]]></category><category><![CDATA[images]]></category><category><![CDATA[modbus]]></category><category><![CDATA[obscurity]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Sat, 05 Sep 2015 03:48:32 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/09/screenshot-images-shodan-io-2015-09-04-22-46-35.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/09/screenshot-images-shodan-io-2015-09-04-22-46-35.png" alt="Don't Be Clever"><p>I've started <a href="https://images.shodan.io">collecting screenshots</a> for a few services, most notably VNC, and something stuck out at me:</p>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-www-shodan-io-2015-09-04-22-18-44.png" alt="Don't Be Clever"></p>

<p>The top 5 ports where VNC is running with authentication disabled are:</p>

<ol>
<li>5900 (default port): 4,029  </li>
<li><strong>5901</strong>: 3,995  </li>
<li>84: 25  </li>
<li>83: 14  </li>
<li>13579: 7</li>
</ol>

<p>Out of ~8,000 results, 50% of the results came from services that were operating VNC on a non-standard port. It's not unusual to see common services running on different ports, but that was a surprising amount. My guess is that a lot of people change the default port thinking that it will hide their service. Because Shodan scans for 250+ different ports however, there's a small chance that Shodan will discover it anyways. And for a lot of the popular protocols, Shodan actually also crawls for one-off ports (thank you to <a href="https://twitter.com/Viss">@Viss</a> for that idea).</p>
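
<p>The 50% figure is easy to check against the top 5 ports listed above. This short Python sketch only uses those five counts, so it ignores the long tail of other ports; the function name is illustrative.</p>

<pre><code># Counts copied from the top-5 list above (port: number of results).
port_counts = {5900: 4029, 5901: 3995, 84: 25, 83: 14, 13579: 7}
DEFAULT_VNC_PORT = 5900


def non_default_breakdown(counts, default_port):
    """Return (share of results off the default port, share of those on default+1)."""
    total = sum(counts.values())
    off_default = total - counts.get(default_port, 0)
    one_off = counts.get(default_port + 1, 0)
    return off_default / total, one_off / off_default


share, one_off_share = non_default_breakdown(port_counts, DEFAULT_VNC_PORT)
print(f"{share:.1%} of the top-5 results are on a non-default port")
print(f"{one_off_share:.1%} of those are on port {DEFAULT_VNC_PORT + 1}")
</code></pre>

<p>In other words, nearly all of the "hidden" services moved exactly one port over.</p>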

<p>I've seen this sort of behavior in other services as well; it isn't limited to VNC. If you've read my previous blog posts this might sound familiar to you. In fact, I observed much of the same when <a href="https://blog.shodan.io/hiding-in-plain-sight/">looking at SSH</a>. For SSH, the choice of ports is a bit wider but in general people don't work well as random number generators.</p>

<p>Furthermore, this sort of behavior can be observed across industries. For example, you might know that Shodan crawls the Internet for industrial control systems (ICS). One of the most popular protocols in ICS is Modbus, which runs on port 502. At the moment, there are about <a href="https://www.shodan.io/search?query=port%3A502">17,000 devices</a> listening for Modbus on the default port. It turns out there are also <a href="https://www.shodan.io/search?query=port%3A503">700 devices</a> listening on port 503, again a one-off sort of situation.</p>

<p><img src="https://blog.shodan.io/content/images/2015/09/screenshot-www-shodan-io-2015-09-04-22-33-29.png" alt="Don't Be Clever"></p>

<p>If you're looking to hide your service, putting it on a different port is a temporary band-aid at best and a false sense of security more than anything.</p>]]></content:encoded></item><item><title><![CDATA[Shining a Light on the Roku]]></title><description><![CDATA[<p>The <a href="https://www.roku.com">Roku</a> is a small computer that enables you to stream videos and music to your TV. Before the rise of smart TVs it was one of the easiest ways to watch Netflix in your living room and it still <a href="http://blog.streamingmedia.com/2015/01/roku-ipo.html">seems to be thriving</a>. I hadn't thought much about them</p>]]></description><link>https://blog.shodan.io/hello-roku/</link><guid isPermaLink="false">aa9c69fc-e7d5-45d8-84a0-1dcf4d1aae72</guid><category><![CDATA[research]]></category><category><![CDATA[Roku]]></category><category><![CDATA[market research]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Mon, 27 Jul 2015 03:19:30 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/07/RokuStickApps.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/07/RokuStickApps.jpg" alt="Shining a Light on the Roku"><p>The <a href="https://www.roku.com">Roku</a> is a small computer that enables you to stream videos and music to your TV. Before the rise of smart TVs it was one of the easiest ways to watch Netflix in your living room and it still <a href="http://blog.streamingmedia.com/2015/01/roku-ipo.html">seems to be thriving</a>. I hadn't thought much about them until I saw a great series of posts on <a href="https://www.reddit.com/r/netsec">Reddit</a> recently on the security of the Roku:</p>

<ul>
<li>Roku API doesn't have authentication and allows remote reboot: <a href="http://x42.obscurechannel.com/2015/07/25/restart-a-roku-via-bash/">http://x42.obscurechannel.com/2015/07/25/restart-a-roku-via-bash/</a></li>
<li>Roku WPS Pin cracked: <a href="http://x42.obscurechannel.com/2015/07/26/cracking-the-roku-v2-wpa2-psk/">http://x42.obscurechannel.com/2015/07/26/cracking-the-roku-v2-wpa2-psk/</a></li>
</ul>

<p>Much of the smart TV world is full of low-hanging fruit in terms of security. For example, this is me running a network scan on my Vizio TV:</p>

<iframe width="560" height="315" src="https://www.youtube.com/embed/RfK_2-khznA" frameborder="0" allowfullscreen></iframe>

<p>In case you can't make it out: scanning the TV with Nmap launches an update and shows the application menu - no authentication required. As such, it isn't a huge surprise to learn that the Roku offers an API to control the device that doesn't have authentication enabled. And to be fair, the use case for the API is to allow local users to control their Roku from their phone. The devices aren't meant to be directly exposed on the Internet. Aside from the security implications, this also provides an opportunity to learn a bit about which Roku devices are most popular and which apps users install the most. First, I scanned the Internet for devices and then downloaded the results. If you have access to the <a href="https://cli.shodan.io">Shodan command-line client</a> you can get the data using:</p>

<pre><code>shodan download --limit -1 roku-data "port:8060 Roku"
</code></pre>

<p>It seems there are around <strong>1,868 Roku devices directly on the Internet</strong> as of July 26, 2015. I expect this number to fluctuate depending on the time of day at which the scan is performed, but it's a good starting point to learn more about Roku's usage. To start off, I wanted to learn which Roku devices sell the most so here is a ranking of the Top 10 Most Popular Roku devices:</p>

<ol>
<li><strong>Roku 3</strong>: 514  </li>
<li><strong>Roku Stick</strong>: 376  </li>
<li><strong>Roku 2</strong>: 169  </li>
<li><strong>Roku 2 XD</strong>: 163  </li>
<li><strong>Roku 2 XS</strong>: 161  </li>
<li><strong>Roku LT</strong>: 121  </li>
<li><strong>Roku 1</strong>: 116  </li>
<li><strong>Roku HD</strong>: 93  </li>
<li><strong>Roku Streaming Player 2050X</strong>: 41  </li>
<li><strong>Roku Streaming Player 2100X</strong>: 28</li>
</ol>

<p>The total number of devices isn't huge but I think it's awesome that we can empirically measure which products sell the most using real data. And it's interesting that the most expensive model, the Roku 3, is also the most popular one. Usually, the low- and mid-range models for a product are most visible on the Internet but that isn't the case this time. In terms of specific model numbers the breakdown is as follows:</p>

<ol>
<li><strong>4200X</strong>: 538  </li>
<li><strong>3500X</strong>: 350  </li>
<li><strong>3050X</strong>: 163  </li>
<li><strong>3100X</strong>: 162  </li>
<li><strong>2720X</strong>: 146  </li>
<li><strong>2500X</strong>: 93  </li>
<li><strong>2400SK</strong>: 61  </li>
<li><strong>2050X</strong>: 41  </li>
<li><strong>2100X</strong>: 28  </li>
<li><strong>2400X</strong>: 28</li>
</ol>

<p><img src="https://blog.shodan.io/content/images/2015/07/roku-channels.png" alt="Shining a Light on the Roku"></p>

<p>Finally, I wanted to see which channels are most commonly installed on Roku devices. The Roku API will happily tell you all the channels that the device has running, so I gathered all the data and am making it accessible via 2 Gists:</p>

<ul>
<li>List of Channels: <a href="https://gist.github.com/achillean/110dd0fdd8d42c6fe08e">https://gist.github.com/achillean/110dd0fdd8d42c6fe08e</a></li>
<li>List of Channels with Versions: <a href="https://gist.github.com/achillean/32b8f31b9072fd98a986">https://gist.github.com/achillean/32b8f31b9072fd98a986</a></li>
</ul>

<p>The Top 10 Channels as determined via Shodan are:</p>

<ol>
<li>Netflix  </li>
<li>Amazon Instant Video  </li>
<li>Hulu Plus  </li>
<li>VUDU  </li>
<li>Pandora  </li>
<li>YouTube  </li>
<li>Crackle  </li>
<li>Blockbuster  </li>
<li>Popcornflix  </li>
<li>Rdio</li>
</ol>

<p>I was really surprised to see Blockbuster on this list, since I thought they were dead but apparently the video streaming is still online. Naturally, I wanted to compare my list to the official <a href="https://www.roku.com/channels#!browse/movies-and-tv/by-popular">most popular channels</a> on the Roku website. Theirs is:</p>

<ol>
<li>Netflix (-)  </li>
<li>Hulu Plus (+1)  </li>
<li>Amazon Instant Video (-1)  </li>
<li>Sling TV (<strong>+22</strong>)  </li>
<li>HBO GO (+11)  </li>
<li>Crackle (+1)  </li>
<li>Time Warner Cable (<strong>+39</strong>)  </li>
<li>PBS (+10)  </li>
<li>VUDU (-5)  </li>
<li>Acorn TV (<strong>+55</strong>)</li>
</ol>

<p>The difference between the Shodan ranking and the Roku ranking is provided in parentheses. For example, Hulu Plus moved up 1 rank in the Roku ranking while VUDU fell 5 compared to Shodan's. The sample size is much smaller than what Roku has and maybe people that put Roku devices on the Internet simply prefer YouTube over PBS or Acorn TV. But <strong>Sling TV</strong>, <strong>Time Warner Cable</strong> and <strong>Acorn TV</strong> aren't anywhere close to the top 10 in the Shodan ranking yet they're very high in Roku's list.</p>
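
<p>The rank differences can be reproduced for the channels that appear in both top 10 lists. Here is a small Python sketch of that arithmetic; channels such as Sling TV are left out because their Shodan rank falls outside the top 10 shown above, and the function name is illustrative.</p>

<pre><code># Both lists are copied from the rankings above, best rank first.
shodan_top = ["Netflix", "Amazon Instant Video", "Hulu Plus", "VUDU", "Pandora",
              "YouTube", "Crackle", "Blockbuster", "Popcornflix", "Rdio"]
roku_top = ["Netflix", "Hulu Plus", "Amazon Instant Video", "Sling TV", "HBO GO",
            "Crackle", "Time Warner Cable", "PBS", "VUDU", "Acorn TV"]


def rank_changes(observed, official):
    """Return observed rank minus official rank for channels present in both lists.

    A positive value means the channel ranks higher in the official list.
    """
    observed_rank = {name: position for position, name in enumerate(observed, start=1)}
    changes = {}
    for position, name in enumerate(official, start=1):
        if name in observed_rank:
            changes[name] = observed_rank[name] - position
    return changes


for channel, change in rank_changes(shodan_top, roku_top).items():
    print(f"{channel}: {change:+d}")
</code></pre>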

<p>It's also possible to determine how often people update/ patch their channels. For example, this is the breakdown for the various versions of the Netflix channel:</p>

<table>  
<tr><th>Application</th><th>Version</th><th>Count</th></tr>  
<tr><td>Netflix</td><td>3.1.6040</td><td>694</td></tr>  
<tr><td>Netflix</td><td>4.2.14</td><td>406</td></tr>  
<tr><td>Netflix</td><td>4.1.214</td><td>292</td></tr>  
<tr><td>Netflix</td><td>2.5.1</td><td>115</td></tr>  
<tr><td>Netflix</td><td>4.2.12</td><td>65</td></tr>  
<tr><td>Netflix</td><td>4.2.6</td><td>9</td></tr>  
<tr><td>Netflix</td><td>3.1.6038</td><td>2</td></tr>  
</table>
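
<p>To put the table above in proportion, the counts can be grouped by the first component of the version string. This Python sketch assumes that component is a meaningful major version, which the data doesn't confirm; the function name is illustrative.</p>

<pre><code>from collections import Counter

# Version counts copied from the Netflix table above.
netflix_versions = {
    "3.1.6040": 694,
    "4.2.14": 406,
    "4.1.214": 292,
    "2.5.1": 115,
    "4.2.12": 65,
    "4.2.6": 9,
    "3.1.6038": 2,
}


def share_by_major(version_counts):
    """Group counts by the leading version component and return each share."""
    totals = Counter()
    for version, count in version_counts.items():
        totals[version.split(".")[0]] += count
    grand_total = sum(totals.values())
    return {major: count / grand_total for major, count in totals.items()}


for major, share in sorted(share_by_major(netflix_versions).items()):
    print(f"Netflix {major}.x: {share:.1%}")
</code></pre>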

<p>Based on these results, it looks like most customers don't update their channels/ apps on the Roku. For a complete breakdown of all versions and apps please <a href="https://gist.github.com/achillean/32b8f31b9072fd98a986">check out the CSV</a>. Let me know if you find anything interesting/ cool/ weird in the data!</p>]]></content:encoded></item><item><title><![CDATA[It's the Data, Stupid!]]></title><description><![CDATA[<p>I would like to take a moment to discuss databases. Most people use Shodan to find devices that have web servers, but for a few years now I've also been crawling the Internet for various database software. I usually mention this during my talks and I've tried to raise awareness</p>]]></description><link>https://blog.shodan.io/its-the-data-stupid/</link><guid isPermaLink="false">a8294ab0-9195-498b-b97f-62dd4b59aec4</guid><category><![CDATA[research]]></category><category><![CDATA[MongoBD]]></category><category><![CDATA[NoSQL]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Sat, 18 Jul 2015 21:51:42 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/07/Library-with-a-book-ladde-014.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/07/Library-with-a-book-ladde-014.jpg" alt="It's the Data, Stupid!"><p>I would like to take a moment to discuss databases. Most people use Shodan to find devices that have web servers, but for a few years now I've also been crawling the Internet for various database software. I usually mention this during my talks and I've tried to raise awareness of it over the years with mixed results. At least with <a href="https://www.shodan.io/search?query=product%3A%22MySQL%22">MySQL</a>, <a href="https://www.shodan.io/search?query=port%3A5432">PostgreSQL</a> and much of the relational database software the defaults are fairly secure: listen on the local interface only and provide some form of authorization by default. 
This isn't the case with some of the newer NoSQL products that started entering the mainstream fairly recently. For the purpose of this article I will talk about one of the more popular NoSQL products called <a href="https://www.mongodb.com">MongoDB</a>, though much of what is being said also applies to other software (I'm looking at you <a href="https://www.shodan.io/search?query=product%3A%22Redis+key-value+store%22">Redis</a>).</p>

<p><em>Note: This article isn't about the way MongoDB scales.</em></p>

<p>Firstly, in an effort to make it a bit easier to understand the results for MongoDB I've updated the way they're <a href="https://www.shodan.io/search?query=product%3A%22MongoDB%22">represented in the search results</a>:</p>

<p><img src="https://blog.shodan.io/content/images/2015/07/mongodb-results.png" alt="It's the Data, Stupid!"></p>

<p>A quick <a href="https://www.shodan.io/report/OID7V1zw">search for MongoDB</a> reveals that there are nearly 30,000 instances on the Internet that don't have any authorization enabled. This was actually a bit surprising since by default MongoDB listens on localhost and has done so for a while based on the <a href="https://github.com/mongodb/mongo/blob/e01dfe96c73e89fb5e20f55faff4fcbfb54de1b5/debian/mongod.conf">oldest GitHub check-in for their mongod.conf</a>. This made my results very confusing: how could there be so many open MongoDB installations if the defaults were to listen on localhost?</p>

<h5 id="configurationhistory">Configuration History</h5>

<p>So I started downloading older versions of MongoDB to figure out when they changed the configuration defaults. It turns out that <a href="https://fastdl.mongodb.org/src/mongodb-src-r2.4.14.tar.gz">MongoDB version 2.4.14</a> seems to be the last version that still listened on 0.0.0.0 by default, which looks like a maintenance release done on April 28, 2015. I'm a bit confused: why was a configuration file that listened on localhost by default checked in to GitHub in September 2013, while they kept distributing versions that didn't include that change?! I dug around some more and eventually found the official issue in Jira that tracked this configuration issue:</p>

<p><a href="https://jira.mongodb.org/browse/SERVER-4216">https://jira.mongodb.org/browse/SERVER-4216</a></p>

<p>Roman Shtylman actually raised this problem back in February of 2012! It ended up taking a bit more than 2 years to change the settings. Based on the distribution of versions I'm seeing, my guess is that early versions of 2.6 might've also lacked binding to localhost:</p>

<p><img src="https://blog.shodan.io/content/images/2015/07/mongodb-versions.png" alt="It's the Data, Stupid!"></p>

<p>The lack of secure defaults explained some of the 30,000 results but just looking at the data made something else obvious.</p>

<h5 id="thecloud">The Cloud</h5>

<p><img src="https://blog.shodan.io/content/images/2015/07/mongodb-orgs.png" alt="It's the Data, Stupid!"></p>

<p>The vast majority of public MongoDB instances are operating in a cloud: Digital Ocean, Amazon, Linode and OVH round out the most popular destinations for hosting MongoDB without authorization enabled. I've actually observed this trend across the board: cloud instances tend to be more vulnerable than traditional datacenter hosting. My guess is that cloud images don't get updated as often, which translates into people deploying old and insecure versions of software.</p>

<h5 id="problemscope">Problem Scope</h5>

<p>There's a total of <strong>595.2 TB of data</strong> exposed on the Internet via publicly accessible MongoDB instances that don't have any form of authentication. To determine the scale of the problem I downloaded the data using the <a href="https://cli.shodan.io">Shodan command-line tool</a>:</p>

<pre><code>shodan download --limit -1 mongodb "product:MongoDB"
</code></pre>

<p>And then I ran a small Python script to aggregate the total size of all exposed databases. I also looked at which database names were most popular:</p>

<ol>
<li><strong>local</strong>: 27,108  </li>
<li><strong>admin</strong>: 22,286  </li>
<li><strong>db</strong>: 9,895  </li>
<li><strong>test</strong>: 6,818  </li>
<li><strong>config</strong>: 1,119  </li>
<li><strong>mydb</strong>: 498  </li>
<li><strong>Video</strong>: 409  </li>
<li><strong>hackedDB</strong>: 319  </li>
<li><strong>storage</strong>: 315  </li>
<li><strong>trash</strong>: 309</li>
</ol>
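<p>The script itself isn't reproduced here. A minimal sketch of that kind of aggregation could look like the following. It assumes the downloaded file holds one JSON banner per line and that each MongoDB banner carries a <code>mongodb.listDatabases</code> object with a <code>totalSize</code> in bytes and a <code>databases</code> list of named entries. Those field names and the helper names are assumptions to verify against your own data.</p>

```python
import gzip
import json
from collections import Counter

def read_banners(path):
    """Yield one banner dict per line of a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)

def summarize(banners):
    """Return (total_bytes, Counter of database names)."""
    total_bytes = 0
    names = Counter()
    for banner in banners:
        # Assumed layout: banner["mongodb"]["listDatabases"]
        listing = banner.get("mongodb", {}).get("listDatabases") or {}
        total_bytes += int(listing.get("totalSize") or 0)
        for database in listing.get("databases") or []:
            names[database.get("name", "")] += 1
    return total_bytes, names

# Made-up sample records, only to show the shape of the input.
sample = [
    {"mongodb": {"listDatabases": {"totalSize": 3000,
        "databases": [{"name": "local"}, {"name": "admin"}]}}},
    {"mongodb": {"listDatabases": {"totalSize": 2000,
        "databases": [{"name": "local"}]}}},
    {"product": "MongoDB"},  # banner without a database listing
]

total, names = summarize(sample)
print(total)                 # 5000
print(names.most_common(2))  # [('local', 2), ('admin', 1)]
```

<p>With a real download, <code>summarize(read_banners("mongodb.json.gz"))</code> would process the file written by the download command, assuming it was saved under that name.</p>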

<p>Faceting on the database name reveals widespread installations that might've been misconfigured or otherwise exposed. There are a lot of instances that have some sort of administrative database, so the app that uses MongoDB probably has authentication but the database itself doesn't... The name that really sticks out is <strong>hackedDB</strong>. It's unclear whether those instances have been compromised or whether it's a large deployment of MongoDB servers from a company that uses "hackedDB" as its database name. Or maybe it's a honeypot? The interesting thing to note when <a href="https://www.shodan.io/search?query=product%3A%22MongoDB%22+hackeddb">looking at the results</a> is that 40% of the instances are running a very old version of MongoDB (1.8.1).</p>

<p>I could go on and on about these sorts of problems because they're everywhere and haven't been resolved for years. Hopefully, more people will start looking at services that are responsible for the actual data and not solely focus on the web interfaces.</p>]]></content:encoded></item><item><title><![CDATA[Presidential Robots and 404s]]></title><description><![CDATA[<p>The field of presidential candidates has started to heat up and the websites are the first stop for a lot of prospective voters. For my purposes though, I was less interested in their political platform and more curious about the technology behind the websites. Others have <a href="https://paulschreiber.com/blog/2015/04/12/presidential-candidate-website-tech-compared/">already compared the SSL</a></p>]]></description><link>https://blog.shodan.io/presidential-robots-and-404s/</link><guid isPermaLink="false">c4aa727d-ac0c-46fa-86ee-17f1905e068e</guid><category><![CDATA[research]]></category><category><![CDATA[presidential candidates]]></category><category><![CDATA[robots.txt]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Sun, 21 Jun 2015 02:08:52 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2015/06/white-house-02.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2015/06/white-house-02.jpg" alt="Presidential Robots and 404s"><p>The field of presidential candidates has started to heat up and the websites are the first stop for a lot of prospective voters. For my purposes though, I was less interested in their political platform and more curious about the technology behind the websites. Others have <a href="https://paulschreiber.com/blog/2015/04/12/presidential-candidate-website-tech-compared/">already compared the SSL security</a> of the candidates, so I wanted to check out what sort of information the presidential hopefuls' <strong>robots.txt</strong> files and <strong>404 responses</strong> return. 
To generate the 404 response I chose a random URL <strong>/test</strong> (turns out I'm really bad at being random).</p>
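<p>For anyone repeating the experiment, a path that is far less likely to collide with a real page can be generated from random bytes. This is only a sketch, and the helper name <code>random_path</code> is made up for illustration:</p>

```python
import secrets

def random_path(num_bytes=16):
    """Return a URL path that is very unlikely to exist on any site."""
    return "/" + secrets.token_hex(num_bytes)

path = random_path()
print(path)  # a slash followed by 32 hex characters, different on every run
```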

<p>Without further ado, let me show the results of the requests:</p>

<h1 id="democrats">Democrats</h1>

<h4 id="hillaryclinton">Hillary Clinton</h4>

<p><a href="https://www.hillaryclinton.com">https://www.hillaryclinton.com</a></p>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /api/
</code></pre>

<p>Looks like there's an API for their website that isn't publicly documented.</p>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/hillary-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="berniesanders">Bernie Sanders</h4>

<p><a href="https://berniesanders.com/">https://berniesanders.com/</a></p>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /wp-admin/
</code></pre>

<p>The website uses WordPress as its framework.</p>

<h6 id="404">404</h6>

<iframe width="560" height="315" src="https://www.youtube.com/embed/Dhot2OJKKZc" frameborder="0" allowfullscreen></iframe>

<h4 id="martinomalley">Martin O'Malley</h4>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /wp-admin/
</code></pre>

<p>The website uses WordPress as its framework.</p>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/malley-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="jimwebb">Jim Webb</h4>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /wp-admin/

Sitemap: http://www.webb2016.com/sitemap.xml
</code></pre>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/webb-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="lincolnchafee">Lincoln Chafee</h4>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /wp-admin/
</code></pre>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/chafee-404.png" alt="Presidential Robots and 404s"></p>

<h1 id="republicans">Republicans</h1>

<h4 id="jebbush">Jeb Bush</h4>

<h6 id="robotstxt">robots.txt</h6>

<p>No robots.txt file available.</p>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/jeb-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="randpaul">Rand Paul</h4>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow:
</code></pre>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/paul-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="tedcruz">Ted Cruz</h4>

<p><a href="https://www.tedcruz.org">https://www.tedcruz.org</a></p>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-agent: *
Disallow: /wp-admin/
</code></pre>

<p>The website uses WordPress as its framework.</p>

<h4 id="ricksantorum">Rick Santorum</h4>

<p><a href="http://www.ricksantorum.com/">http://www.ricksantorum.com/</a></p>

<h6 id="robotstxt">robots.txt</h6>

<pre><code>User-Agent: *
Disallow: /admin/
Disallow: /utils/
Disallow: /forms/
Disallow: /users/
Sitemap: http://www.ricksantorum.com/sitemap_index.xml
</code></pre>

<p>Based on this information the website appears to be a hosted CMS at nationbuilder.com.</p>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/santorum-404.png" alt="Presidential Robots and 404s"></p>

<h4 id="bencarson">Ben Carson</h4>

<p><a href="https://www.bencarson.com/">https://www.bencarson.com/</a></p>

<h6 id="robotstxt">robots.txt</h6>

<p>No robots.txt file available.</p>

<h6 id="404">404</h6>

<p><img src="https://blog.shodan.io/content/images/2015/06/carson-404.png" alt="Presidential Robots and 404s"></p>

<p>Most of them didn't turn out to be very interesting to look at, with the exception of the final candidate I'd like to show:</p>

<h2 id="carlyfiorina">Carly Fiorina</h2>

<p><a href="https://www.carlyfiorina.com">https://www.carlyfiorina.com</a></p>

<h4 id="robotstxt">robots.txt</h4>

<pre><code>User-agent: *
Disallow: /standing-desks2
Disallow: /standing-desks2.html
Disallow: /privacy-policy.html
Disallow: /privacy-policy
Disallow: /terms-of-use.html
Disallow: /terms-of-use
Disallow: /adjustable-height-desk.html
Disallow: /adjustable-height-desk
Disallow: /blank
Disallow: /test
</code></pre>

<h4 id="404">404</h4>

<p><img src="https://blog.shodan.io/content/images/2015/06/carly-auth.png" alt="Presidential Robots and 404s"></p>

<p>It turned out that my <em>random</em> URL of <strong>/test</strong> wasn't random enough and I accidentally stumbled upon a location on Carly Fiorina's website that requires authentication.</p>

<p>I took away 4 lessons from this exercise:</p>

<ol>
<li>WordPress remains incredibly popular  </li>
<li>robots.txt can tell you where the administrative area is  </li>
<li>404s must be generated often enough that it is worth investing time in making them nicer  </li>
<li>I'm bad at generating random URLs</li>
</ol>
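<p>Lesson 2 is easy to sketch in Python. The snippet below pulls the <code>Disallow</code> entries out of a robots.txt body and then uses the standard library's <code>urllib.robotparser</code> to ask whether a given path is blocked. The helper name <code>disallowed_paths</code> is hypothetical, and the sample body is the WordPress-style file shown several times above.</p>

```python
from urllib.robotparser import RobotFileParser

def disallowed_paths(robots_txt):
    """Collect the non-empty Disallow values from a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

robots_txt = """User-agent: *
Disallow: /wp-admin/
"""

print(disallowed_paths(robots_txt))  # ['/wp-admin/']

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("*", "/wp-admin/"))  # False
print(parser.can_fetch("*", "/test"))       # True
```

<p>An empty <code>Disallow:</code> line, as in Rand Paul's file, yields no paths at all because it blocks nothing.</p>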

<p>PS: Did you know that Shodan also grabs the <strong>robots.txt</strong> data for each IP? You can access all the information via the <a href="https://developer.shodan.io">Shodan API</a>.</p>]]></content:encoded></item><item><title><![CDATA[State of Control Systems in the USA]]></title><description><![CDATA[<p>I've recently added the ability to search for devices in Shodan based on the state they're located in. This provides the interesting possibility to start comparing the security posture of US states by looking at what sort of things they expose publicly. To start off, I will be taking a</p>]]></description><link>https://blog.shodan.io/state-of-control-systems-in-the-usa-2015-05/</link><guid isPermaLink="false">fec818c5-253e-4dff-bf12-98e0191714cf</guid><category><![CDATA[ICS]]></category><category><![CDATA[research]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Fri, 15 May 2015 00:54:36 GMT</pubDate><media:content url="https://static.shodan.io/shodan/img/categories/ics/ics.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://static.shodan.io/shodan/img/categories/ics/ics.jpg" alt="State of Control Systems in the USA"><p>I've recently added the ability to search for devices in Shodan based on the state they're located in. This provides the interesting possibility to start comparing the security posture of US states by looking at what sort of things they expose publicly. To start off, I will be taking a look at how prevalent industrial control systems (<strong>ICS</strong>) are in the various states.</p>

<p>Shodan currently crawls for roughly <a href="https://www.shodan.io/explore/category/industrial-control-systems">15 different ICS protocols on more than 20 ports</a> and within the past week or so has discovered ~40,000 devices speaking those protocols that are publicly accessible on the Internet worldwide. I presented on a few of these protocols at <a href="https://www.4sics.se">4SICS</a> in October 2014, and you can <a href="https://icsmap.shodan.io/">download the data</a> and <a href="https://imgur.com/a/v3NU9">view the maps</a> that were generated for the talk.</p>

<p><img src="https://static.shodan.io/4sics/icsmap.png" alt="State of Control Systems in the USA"></p>

<p>Let's start taking a look at the US in particular though: so far in May Shodan has discovered <strong>20,445 ICS devices in the US</strong>. This is definitely a lower bound and more will be discovered before the end of the month (many ICS devices have high latencies - more on that later).</p>

<p><img src="https://blog.shodan.io/content/images/2015/05/usa-ics-ports.png" alt="State of Control Systems in the USA"></p>

<p>The most popular protocol is Tridium's Fox, followed by BACnet and Modbus. Fox and BACnet are commonly used by building management systems (BMS), while Modbus is used across a wide range of products. The full Top 10 Protocols are as follows:</p>

<ol>
<li><strong>Tridium Fox</strong>: 7,706  </li>
<li><strong>BACnet</strong>: 4,525  </li>
<li><strong>Modbus</strong>: 1,625  </li>
<li><strong>EtherNet/IP</strong>: 1,578  </li>
<li><strong>ProConOS</strong>: 1,018  </li>
<li><strong>General Electric</strong>: 956  </li>
<li><strong>OMRON FINS</strong>: 777  </li>
<li><strong>Mitsubishi</strong>: 615  </li>
<li><strong>Red Lion</strong>: 551  </li>
<li><strong>Codesys</strong>: 407</li>
</ol>

<p>By faceting on <strong>state</strong> using the Shodan API we can get a breakdown of ICS devices for each US state:</p>

<iframe style="width:100%;height:400px;border:0;" src="https://docs.google.com/spreadsheets/d/1iI7lEtE33Bkam6CF-RtAzSqVepsPQwGAjchMeLv8uzQ/pubchart?oid=1480265085&amp;format=interactive"></iframe>
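<p>For readers who want to try this themselves, here is a rough sketch using the official Python library's <code>count()</code> method with a <code>state</code> facet. The helper names are made up, and the exact query behind the chart isn't given here, so any query string you pass in (for example a Modbus-only search such as <code>port:502 country:US</code>) is an assumption rather than the one used for this post.</p>

```python
def fetch_state_facets(api_key, query, limit=10):
    """Ask Shodan for the top states matching a query (needs network access)."""
    import shodan  # third-party package: pip install shodan
    api = shodan.Shodan(api_key)
    result = api.count(query, facets=[("state", limit)])
    return result["facets"]["state"]

def format_facets(facets):
    """Turn [{'value': 'CA', 'count': 2328}, ...] into printable lines."""
    return ["{}: {:,}".format(item["value"], item["count"]) for item in facets]

# Offline example using two of the state counts listed in this article.
example = [{"value": "CA", "count": 2328}, {"value": "TX", "count": 1422}]
for line in format_facets(example):
    print(line)
# CA: 2,328
# TX: 1,422
```

<p>Calling <code>format_facets(fetch_state_facets(key, query))</code> with a valid API key would print the live breakdown instead of the offline example.</p>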

<p>The above map chart shows a breakdown of which states have the most ICS devices on the Internet. As you'd probably expect the larger, more populous states also tend to have more devices online:</p>

<ol>
<li><strong>CA</strong>: 2,328  </li>
<li><strong>TX</strong>: 1,422  </li>
<li><strong>NY</strong>: 739  </li>
<li><strong>MA</strong>: 559  </li>
<li><strong>IL</strong>: 552  </li>
<li><strong>PA</strong>: 482  </li>
<li><strong>OH</strong>: 466  </li>
<li><strong>NJ</strong>: 465  </li>
<li><strong>FL</strong>: 416  </li>
<li><strong>MI</strong>: 390</li>
</ol>

<p>Bigger states have more people, more devices and therefore more control systems required to provide services. Larger states would always top any ICS ranking based on absolute numbers, so that comparison isn't entirely fair. Let's normalize the results and look at states based on the % of devices in the state that are control systems:</p>

<p><img src="https://blog.shodan.io/content/images/2015/05/ics-percentage.png" alt="State of Control Systems in the USA"></p>

<p>And now we get a slightly different picture. Maine is at the top of the list with 0.23% of the state's devices being industrial control systems on the Internet, followed by Hawaii (0.19%) and Nebraska (0.17%). The top 10 are:</p>

<ol>
<li><strong>Maine</strong>: 0.23%  </li>
<li><strong>Hawaii</strong>: 0.19%  </li>
<li><strong>Nebraska</strong>: 0.17%  </li>
<li><strong>Vermont</strong>: 0.17%  </li>
<li><strong>West Virginia</strong>: 0.16%  </li>
<li><strong>Montana</strong>: 0.14%  </li>
<li><strong>Rhode Island</strong>: 0.14%  </li>
<li><strong>Iowa</strong>: 0.13%  </li>
<li><strong>Arkansas</strong>: 0.12%  </li>
<li><strong>Washington DC</strong>: 0.11%</li>
</ol>
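<p>The normalization is just the ICS count divided by the total number of devices seen in the state. A small Python sketch of that arithmetic follows. The per-state totals aren't listed in this post, so the counts below are hypothetical and chosen only to show how a small state can outrank a large one.</p>

```python
def ics_share(ics_devices, all_devices):
    """Percentage of a state's Internet-facing devices that are ICS."""
    if all_devices == 0:
        return 0.0
    return round(100 * ics_devices / all_devices, 2)

# Hypothetical counts, chosen only to illustrate the arithmetic.
states = {
    "Large state": (2000, 4000000),
    "Small state": (230, 100000),
}

ranked = sorted(states, key=lambda name: ics_share(*states[name]), reverse=True)
for name in ranked:
    print(name, ics_share(*states[name]))
# Small state 0.23
# Large state 0.05
```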

<p>If you want to analyze the results further or look at the data yourself, check out the <a href="https://developer.shodan.io/api">Shodan API documentation</a> for information on what you can search and facet on. There are also a bunch of libraries available in Python, Ruby, NodeJS and Go to make getting started easy.</p>

<p>I will be keeping track of how these numbers change in the coming months and years, especially as federal policies change and cyber insurance becomes more popular.</p>

<p>PS: If your browser supports WebGL you can also check out the following visualization that was generated for my talk at the Department of Homeland Security ICS Joint Working Group conference in June 2014: <a href="https://ics-radar.shodan.io/">https://ics-radar.shodan.io/</a></p>]]></content:encoded></item></channel></rss>