The dataset is distributed in Avro format, which can be processed with tools such as avro-tools or,
for Python, fastavro. We are working on a JSON version of the data for applications that do not rely
on Java or Hadoop. Please contact us for more information.
This is an example of a record from the dataset.
{ "date": "20161001", "qname": "gatech.edu.", "qtype": 1, "rdata": "130.207.160.173", "ttl": 300, "authority_ips": "128.61.244.253,168.24.2.35", "count": 80, "hours": 16710647, "source": "gt", "sensor": "active-dns" }
A small sample of the dataset is available for download: active_dns_sample_20161001.json.gz (418 KB).
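As a minimal sketch of how the sample could be read in Python with only the standard library, the helper below iterates over a gzip-compressed JSON file. It assumes the file stores one JSON record per line, which is not stated here and should be checked against the actual download. The name read_records is hypothetical and used for illustration only.

```python
import gzip
import json


def read_records(path):
    """Yield one dict per non-empty line of a gzip-compressed JSON file.

    Assumption: the file holds one JSON object per line (line-delimited JSON).
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Under that assumption, iterating over read_records("active_dns_sample_20161001.json.gz") would yield dictionaries shaped like the example record, without loading the whole file into memory.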
date:
The date of the current Resource Record (RR).
qname:
The query name that our recursive resolvers answered; effectively the domain name.
qtype:
The question type number. A comprehensive list of qtypes can be found on Wikipedia.
rdata:
The data returned by the Authoritative Nameserver(s).
ttl:
The Time To Live for the particular Resource Record (RR).
authority_ips:
The IP addresses of the Authoritative Nameservers (ANS) that replied with this particular Resource Record (RR).
count:
The number of times this Resource Record (RR) was encountered for the specific date.
hours:
A 24-bit integer that encodes the hours of the day at which this Resource Record (RR) was encountered. Decoding is described after this list.
source:
The source for the particular Resource Record (RR).
sensor:
The sensor that recorded this Resource Record (RR).

The hours field encodes, in a 24-bit integer, the hours of the day during which a particular domain
name was queried. The following Python function decodes it and returns the list of hours around the
clock (0 to 23).
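    def parse_hours(hours):
        # Binary digits in reversed order, with the '0b' prefix removed.
        hours = str(bin(int(hours)))[::-1][:-2]
        # Pad to 24 characters.
        hours = '0' * (24 - len(hours)) + hours
        # range replaces the Python 2-only xrange.
        return [i for i in range(0, 24) if int(hours[i - 1])]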
For the previous JSON excerpt, the output would look like:
    In [2]: parse_hours(16710647)
    Out[2]: [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23]
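The other fields of a record can be converted to richer Python types in a similar spirit. The sketch below is illustrative only: it assumes that date always follows the YYYYMMDD layout seen in the example record and that authority_ips is a comma-separated string, as it appears there. The names normalize_record and QTYPE_NAMES are hypothetical, and the qtype table lists only a few common DNS record types.

```python
from datetime import datetime

# A few common DNS question types; this table is deliberately incomplete.
QTYPE_NAMES = {1: "A", 2: "NS", 5: "CNAME", 6: "SOA", 15: "MX", 16: "TXT", 28: "AAAA"}


def normalize_record(record):
    """Return a copy of a record with a few fields converted to richer types."""
    normalized = dict(record)
    # Assumption: "date" is always YYYYMMDD, as in the example record.
    normalized["date"] = datetime.strptime(record["date"], "%Y%m%d").date()
    # Assumption: "authority_ips" is a comma-separated string of addresses.
    normalized["authority_ips"] = [ip for ip in record["authority_ips"].split(",") if ip]
    # Unlisted qtype numbers fall back to the generic "TYPE<n>" notation.
    normalized["qtype_name"] = QTYPE_NAMES.get(record["qtype"], "TYPE%d" % record["qtype"])
    return normalized


example = {
    "date": "20161001", "qname": "gatech.edu.", "qtype": 1,
    "rdata": "130.207.160.173", "ttl": 300,
    "authority_ips": "128.61.244.253,168.24.2.35",
    "count": 80, "hours": 16710647, "source": "gt", "sensor": "active-dns",
}
result = normalize_record(example)
print(result["date"], result["qtype_name"], result["authority_ips"])
# prints: 2016-10-01 A ['128.61.244.253', '168.24.2.35']
```

The input dictionary is copied rather than modified, so the original string values remain available alongside the converted ones.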
To request access to the data, please contact access@activednsproject.org. For more information about the project, please contact info@activednsproject.org.