<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[CLI - Shodan Blog]]></title><description><![CDATA[The latest news and developments for Shodan.]]></description><link>https://blog.shodan.io/</link><generator>Ghost 0.7</generator><lastBuildDate>Sat, 11 Apr 2026 22:25:02 GMT</lastBuildDate><atom:link href="https://blog.shodan.io/tag/cli/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Measuring the Minecraft Playerbase]]></title><description><![CDATA[<p>For fun I decided to see whether I can figure out how many Minecraft players are online at the moment. And it turns out that it's fairly straight-forward so here's how I did it.</p>

<p>As of now June 1st 2017 at 18:55 there are <strong>96,418</strong> players online on</p>]]></description><link>https://blog.shodan.io/measuring-the-minecraft-playerbase/</link><guid isPermaLink="false">4e4c6565-a24f-42bf-80bb-0c3839c3b87b</guid><category><![CDATA[minecraft]]></category><category><![CDATA[Python]]></category><category><![CDATA[CLI]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Fri, 02 Jun 2017 00:20:35 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2017/06/4453115-minecraft-wallpapers.jpg" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2017/06/4453115-minecraft-wallpapers.jpg" alt="Measuring the Minecraft Playerbase"><p>For fun I decided to see whether I can figure out how many Minecraft players are online at the moment. And it turns out that it's fairly straight-forward so here's how I did it.</p>

<p>As of June 1st, 2017 at 18:55, there are <strong>96,418</strong> players online on public servers.</p>

<p>To get started I downloaded the latest list of Minecraft servers from Shodan:</p>

<pre><code>shodan download --limit -1 minecraft-servers product:minecraft port:25565
</code></pre>

<p>The next task is to parse that list of servers and ask each one how many players are currently online. To speed things up, the plan is to perform the requests to the Minecraft servers asynchronously using the <a href="http://www.gevent.org">gevent</a> library in Python. It lets you write code that looks synchronous but actually runs asynchronously, which means you can have many connections in flight at the same time. This is the usual template I use when grabbing a bunch of data using gevent:</p>

<pre><code>#!/usr/bin/env python
#
# Shodan Async Workers

## Configuration
NUM_WORKERS = 100


# Make the stdlib async. This is where the gevent magic happens
import gevent.monkey
gevent.monkey.patch_all(subprocess=True, sys=True)


from gevent.pool import Pool
from shodan.helpers import iterate_files
from socket import setdefaulttimeout, socket, AF_INET, SOCK_STREAM

setdefaulttimeout(2.0)

def worker(banner):
    # Here's where you do the network stuff
    # Example:
    # con = socket(AF_INET, SOCK_STREAM)
    # con.connect((banner['ip_str'], banner['port']))
    # con.send(b'hello world\n')
    # data = con.recv(5120)
    return True

def main(files):
    pool = Pool(NUM_WORKERS)

    # Loop through the banners in the file(s) and launch a worker
    # for each banner. When the pool is full it will cause the loop to
    # block until a worker finishes and opens up a spot in the pool.
    for banner in iterate_files(files):
        pool.spawn(worker, banner)

    # Wait for the workers to finish up
    pool.join()

    return True


if __name__ == '__main__':
    import sys
    # main() returns True on success; map that to exit status 0
    sys.exit(0 if main(sys.argv[1:]) else 1)
</code></pre>

<p>If you're working with Shodan data files I recommend checking out the <strong>shodan.helpers.iterate_files()</strong> method since it'll make it easy for you to access the banners. You can give it either a single file:</p>

<pre><code>for banner in iterate_files('minecraft-data.json.gz'):
    ...
</code></pre>

<p>Or you can provide it a list of files:</p>

<pre><code>for banner in iterate_files(['minecraft-2017-04.json.gz', 'minecraft-2017-05.json.gz']):
    ...
</code></pre>
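<p>Shodan data files are gzip-compressed with one JSON banner per line, so the general idea can be sketched with just the standard library. This is a rough illustration, not the Shodan library's actual code, and <em>iter_banners()</em> is a hypothetical name made up for this example:</p>

```python
import gzip
import json


def iter_banners(paths):
    """Yield banners from one or more gzipped, newline-delimited JSON files.

    Hypothetical stand-in for illustration; not the Shodan library's code.
    """
    # Accept a single filename as well as a list of filenames
    if isinstance(paths, str):
        paths = [paths]

    for path in paths:
        with gzip.open(path, 'rt', encoding='utf-8') as fin:
            for line in fin:
                line = line.strip()
                # Skip blank lines, decode everything else as one banner
                if line:
                    yield json.loads(line)
```

<p>Because it is a generator, only one banner is held in memory at a time, which matters when a download contains many thousands of results.</p>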

<p>To get the player count I filled in <em>worker()</em> with code that asks each server for its status using the <a href="http://wiki.vg/Protocol">current protocol</a> and kicked the script off:</p>

<pre><code>$ python global-player-count.py minecraft-servers.json.gz
96418
</code></pre>
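<p>The lookup code itself isn't shown above, so here is a minimal sketch of what a status query could look like, assuming the Server List Ping exchange described on the protocol page: a handshake packet, an empty status request, and a JSON response that includes the number of players online. All function names are hypothetical, and the protocol version is an arbitrary assumption (47, the value used by Minecraft 1.8) since servers are generally expected to answer status requests regardless of version:</p>

```python
import json
import socket
import struct

PROTOCOL_VERSION = 47  # assumption for this sketch, see the note above


def encode_varint(value):
    """Encode a non-negative integer as a Minecraft protocol VarInt."""
    out = bytearray()
    while True:
        value, low = divmod(value, 128)
        if value:
            out.append(low + 128)  # high bit set: more bytes follow
        else:
            out.append(low)
            return bytes(out)


def read_varint(read):
    """Decode a VarInt by pulling one byte at a time from read(n)."""
    result = 0
    for shift in range(5):  # a 32-bit VarInt is at most 5 bytes long
        byte = read(1)
        if not byte:
            raise ValueError('connection closed while reading VarInt')
        result += (byte[0] % 128) * (128 ** shift)
        if byte[0] // 128 == 0:
            return result
    raise ValueError('VarInt is too long')


def pack(payload):
    """Prefix a packet body with its length."""
    return encode_varint(len(payload)) + payload


def build_status_request(host, port):
    """Return the handshake packet followed by the empty status request."""
    host_bytes = host.encode('utf-8')
    handshake = (
        b'\x00'                                   # packet ID: handshake
        + encode_varint(PROTOCOL_VERSION)
        + encode_varint(len(host_bytes)) + host_bytes
        + struct.pack('!H', port)                 # unsigned short, big-endian
        + encode_varint(1)                        # next state: status
    )
    return pack(handshake) + pack(b'\x00')


def read_status(read):
    """Read a status response and return the decoded JSON document."""
    read_varint(read)               # total packet length (unused here)
    if read_varint(read) != 0:      # packet ID of a status response is 0x00
        raise ValueError('unexpected packet ID')
    length = read_varint(read)      # length of the JSON string
    return json.loads(read(length).decode('utf-8'))


def players_online(ip, port=25565, timeout=2.0):
    """Connect to a server and return the player count it reports."""
    with socket.create_connection((ip, port), timeout=timeout) as con:
        con.sendall(build_status_request(ip, port))
        with con.makefile('rb') as stream:
            status = read_status(stream.read)
    return status.get('players', {}).get('online', 0)
```

<p>In the template, <em>worker()</em> could then call something like <em>players_online(banner['ip_str'], banner['port'])</em> inside a try/except block and add the result to a running total that gets printed once the pool has finished.</p>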

<p>And that's how I now keep track of how many players are online on Minecraft at any given moment!</p>

<p>Note that this method only looks at Minecraft servers running on the default port (25565) and that are publicly-accessible on the Internet.</p>]]></content:encoded></item><item><title><![CDATA[The HDFS Juggernaut]]></title><description><![CDATA[<p>There's been much focus on MongoDB, Elastic and Redis in terms of data exposure on the Internet due to their general popularity in the developer community. However, in terms of data volume it turns out that HDFS is the real juggernaut. To give you a better idea here's a quick</p>]]></description><link>https://blog.shodan.io/the-hdfs-juggernaut/</link><guid isPermaLink="false">c469ddda-3cd3-48db-b4dc-a2d771993b61</guid><category><![CDATA[NoSQL]]></category><category><![CDATA[research]]></category><category><![CDATA[Python]]></category><category><![CDATA[HDFS]]></category><category><![CDATA[CLI]]></category><dc:creator><![CDATA[John Matherly]]></dc:creator><pubDate>Wed, 31 May 2017 17:32:11 GMT</pubDate><media:content url="http://blog.shodan.io/content/images/2017/05/hdfs-map-1600.png" medium="image"/><content:encoded><![CDATA[<img src="http://blog.shodan.io/content/images/2017/05/hdfs-map-1600.png" alt="The HDFS Juggernaut"><p>There's been much focus on MongoDB, Elastic and Redis in terms of data exposure on the Internet due to their general popularity in the developer community. However, in terms of data volume it turns out that HDFS is the real juggernaut. To give you a better idea here's a quick comparison between MongoDB and HDFS:</p>

<table>  
<thead>  
<tr>  
<th></th>  
<th>MongoDB</th>  
<th>HDFS</th>  
</tr>  
</thead>  
<tbody>  
<tr>  
<td>Number of Servers</td>  
<td>47,820</td>  
<td>4,487</td>  
</tr>  
<tr>  
<td>Data Exposed</td>  
<td>25 TB</td>  
<td><strong>5,120 TB</strong></td>  
</tr>  
</tbody>  
</table>
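<p>To put the table in per-server terms, here is a quick back-of-the-envelope calculation. It assumes binary units (1 PB = 1,024 TB, which is what 5,120 TB = 5 PB implies) and takes a simple average over all servers, so treat the results as rough estimates; the function name is made up for this example:</p>

```python
# Figures taken from the table above
servers = {'MongoDB': 47820, 'HDFS': 4487}
exposed_tb = {'MongoDB': 25, 'HDFS': 5120}


def average_gb_per_server(total_tb, server_count):
    """Average amount of exposed data per server in GB (1 TB = 1024 GB)."""
    return total_tb * 1024 / server_count


for name in servers:
    average = average_gb_per_server(exposed_tb[name], servers[name])
    print('{}: {:.1f} GB per server on average'.format(name, average))

# How much more data HDFS exposes in total, and the same total in PB
ratio = exposed_tb['HDFS'] / exposed_tb['MongoDB']
print('HDFS exposes about {:.0f}x as much data'.format(ratio))
print('HDFS total: {:.0f} PB'.format(exposed_tb['HDFS'] / 1024))
```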

<p>Even though far more MongoDB databases are connected to the Internet without authentication, the amount of data they expose is dwarfed by HDFS clusters (25 TB vs 5 PB). Where are all these instances located?</p>

<script type="text/javascript" src="https://asciinema.org/a/6dzqir2jbssqftvcxwgh63dwp.js" id="asciicast-6dzqir2jbssqftvcxwgh63dwp" async></script>

<p>Most of the HDFS NameNodes are located in the US (1,900) and China (1,426). And nearly all of the HDFS instances are hosted in the cloud, with Amazon leading the charge (1,059) followed by Alibaba (507).</p>
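<p>A similar breakdown can be computed from downloaded banners with a few lines of Python. This is only a sketch: it assumes each banner carries the usual <em>location.country_code</em> and <em>org</em> properties, <em>tally()</em> is a hypothetical helper, and the sample banners are made up to keep the example self-contained:</p>

```python
from collections import Counter


def tally(banners):
    """Count banners per country code and per organization."""
    countries = Counter()
    orgs = Counter()
    for banner in banners:
        # Either property may be missing or empty, so fall back to 'unknown'
        location = banner.get('location') or {}
        countries[location.get('country_code') or 'unknown'] += 1
        orgs[banner.get('org') or 'unknown'] += 1
    return countries, orgs


# Made-up sample data for illustration only
sample = [
    {'location': {'country_code': 'US'}, 'org': 'Example Cloud'},
    {'location': {'country_code': 'US'}, 'org': 'Example Hosting'},
    {'location': {'country_code': 'CN'}, 'org': 'Example Cloud'},
]

countries, orgs = tally(sample)
print(countries.most_common(2))
print(orgs.most_common(1))
```

<p>With real data you would pass in the banners from <em>shodan.helpers.iterate_files()</em> instead of the sample list.</p>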

<p><img src="https://blog.shodan.io/content/images/2017/05/hdfs-map-600.png" alt="The HDFS Juggernaut"></p>

<p>The ransomware attacks on databases that were <a href="http://www.csoonline.com/article/3154190/security/exposed-mongodb-installs-being-erased-held-for-ransom.html">widely</a> <a href="https://www.fidelissecurity.com/threatgeek/2017/01/revenge-devops-gangster-open-hadoop-installs-wiped-worldwide">publicized</a> earlier in the year are still happening. And they're impacting both MongoDB and HDFS deployments. For HDFS, Shodan has discovered roughly <a href="https://www.shodan.io/search?query=NODATA4U_SECUREYOURSHIT">207 clusters</a> that have a message warning of the public exposure. And a quick glance at search results in Shodan reveals that most of the public MongoDB instances <a href="https://www.shodan.io/search?query=product%3Amongodb">seem to be compromised</a>. I've <a href="https://blog.shodan.io/its-the-data-stupid/">previously written</a> on the reason behind these exposures but note that both products nowadays have extensive documentation on <a href="https://docs.mongodb.com/manual/security/">secure deployment</a>.</p>

<h6 id="technicaldetails">Technical Details</h6>

<p>If you'd like to replicate the above findings or perform your own investigations into data exposure, here is how I measured them.</p>

<ol>
<li><p>Download data using the <a href="https://cli.shodan.io">Shodan command-line interface</a>:</p>

<pre><code>shodan download --limit -1 hdfs-servers product:namenode
</code></pre></li>
<li><p>Write a Python script to measure the amount of exposed data (<strong>hdfs-exposure.py</strong>):</p>

<pre><code>from shodan.helpers import iterate_files, humanize_bytes
from sys import argv, exit


if len(argv) &lt;= 1:
    print('Usage: {} &lt;file1.json.gz&gt; ...'.format(argv[0]))
    exit(1)


datasize = 0
clusters = {}


# Loop over all the banners in the provided files
for banner in iterate_files(argv[1:]):
    try:
        # Grab the HDFS information that Shodan gathers
        info = banner['opts']['hdfs-namenode']
        cid = info['ClusterId']
        # Skip clusters we've already counted
        if cid in clusters:
            continue
        datasize += info['Used']
        clusters[cid] = True
    except Exception:
        # Skip banners that don't have the expected HDFS information
        pass


print(humanize_bytes(datasize))
</code></pre></li>
<li><p>Run the Python script to get the amount of data exposed:</p>

<pre><code>$ python hdfs-exposure.py hdfs-servers.json.gz
5.0 PB
</code></pre></li>
</ol>]]></content:encoded></item></channel></rss>