COVID-19 Blocklists

COVID-19 Initiatives

A lot of good initiatives popped up recently to combat malicious activity related to the Corona pandemic.

COVID-19 Blocklist

I wanted to create a blocklist of COVID-19 activity, based on the threat data received from the MISP instance used for COVID-19, while preventing known good or legitimate sites from ending up in the list. The approach is:



  1. Import the threat data in MISP from synced servers, label it with the pandemic COVID-19 taxonomy;
  2. Exclude the attributes that exist on the warning lists;
  3. Additionally, exclude attributes which are ‘known’ to be good, either related to Corona or because they are company-required. These known domains need to be added to a dynamic warninglist;
  4. Extract all domains, hostnames and URLs;
  5. Produce separate lists for attributes tagged TLP:White and for attributes tagged False Positive:Low risk;
  6. Export the list in an easily accessible format such as CSV.

Approach

MISP Warninglists

First, enable the necessary MISP warninglists. I enabled the ones below. The covid list is especially important, because it removes known good sites.



List of domains that need to be excluded from blocklists
List of known domains to know external IP
List of known URL Shorteners domains
Top 1,000,000 most-used sites from Tranco
List of known Office 365 URLs address ranges
List of known microsoft domains
Top 10K websites from Majestic Million
Valid covid-19 related domains
Common contact e-mail addresses
List of known Cloudflare IP ranges
List of known Amazon AWS IP address ranges
List of known Wikimedia address ranges
Specialized list of IPv6 addresses belonging to common VPN providers and datacenters
Specialized list of IPv4 addresses belonging to common VPN providers and datacenters
University domains
Fingerprint of known intermedicate of trusted certificates
Fingerprint of trusted CA certificates
List of known Office 365 IP address ranges
List of known Googlebot IP ranges
List of known gmail sending IP ranges
List of disposable email domains
List of known domains used by automated malware analysis services & security vendors
TLDs as known by IANA
List of known sinkholes
List of known security providers/vendors blog domain
Second level TLDs as known by Mozilla Foundation
List of RFC 6761 Special-Use Domain Names
List of RFC 6598 CIDR blocks
List of RFC 5735 CIDR blocks
List of RFC 3849 CIDR blocks
List of RFC 1918 CIDR blocks
List of known IPv6 public DNS resolvers
List of known IPv4 public DNS resolvers
List of known public DNS resolvers expressed as hostname
List of known Ovh Cluster IP
List of RFC 5771 multicast CIDR blocks
Top 500 domains and pages from https://moz.com/top500
List of known Windows 10 connection endpoints
List of known Office 365 IP
List of known Office 365 IP address ranges in China
List of known Office 365 URLs
List of known Microsoft Azure Datacenter IP Ranges
List of known Office 365 Attack Simulator used for phishing awareness campaigns
List of IPv6 link local blocks
List of known google domains
List of known hashes for empty files
List of hashes for EICAR test virus
List of known dax30 webpages
CRL Warninglist
List of known hashes with common false-positives (based on Florian Roth input list)
Top 1000 website from Cisco Umbrella
List of known bank domains
Top 1000 website from Alexa
List of known Akamai IP ranges

Besides these warninglists, you also need to enable the dynamic, custom warninglist that the Python script below refers to. By default the script assumes this list is called corp_exclusion. Adding this warninglist to MISP is easy:

  1. Create a directory corp_exclusion in /var/www/MISP/app/files/warninglists/lists/;
  2. Add a list.json file to this location, with contents along the lines of the example below.
{"name": "List of domains that need to be excluded from blocklists", 
"version": 10, 
"description": "Maintained by blocklist_generator", 
"list": ["belgium.be", "google.com", "www.info-coronavirus.be", "info-coronavirus.be"], 
"type": "hostname", 
"matching_attributes": ["domain", "hostname", "url"]}

Once you have added this warninglist, update the warninglists in the MISP interface and enable the new list.
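The directory and list.json from the two steps above can also be created with a short script. This is only a sketch under a few assumptions: `write_warninglist` is a hypothetical helper that is not part of the original tooling, the starting version of 1 is arbitrary, and the base directory is a parameter so you can try it outside a MISP server first.

```python
import json
import os
import tempfile


def write_warninglist(base_dir, list_name, domains):
    """Create <base_dir>/<list_name>/list.json with the fields shown above."""
    target_dir = os.path.join(base_dir, list_name)
    os.makedirs(target_dir, exist_ok=True)
    data = {
        "name": "List of domains that need to be excluded from blocklists",
        "version": 1,  # arbitrary starting version (assumption)
        "description": "Maintained by blocklist_generator",
        "list": sorted(set(domains)),
        "type": "hostname",
        "matching_attributes": ["domain", "hostname", "url"],
    }
    path = os.path.join(target_dir, "list.json")
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)
    return path


# Try it in a temporary directory; on a MISP server base_dir would be
# /var/www/MISP/app/files/warninglists/lists/
demo_dir = tempfile.mkdtemp()
created = write_warninglist(demo_dir, "corp_exclusion",
                            ["info-coronavirus.be", "belgium.be"])
print(created)
```

You still need to update and enable the warninglist in MISP afterwards, as described above.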



Extract and exclude

If you export MISP attributes via the REST API, you can indicate that attributes which match a warninglist should be left out of the result.

"enforceWarninglist": "true"
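To show where this option fits, the sketch below builds a request body for the attributes/restSearch endpoint, which is the endpoint the script later in this post calls. `build_search_body` is a hypothetical helper name; the keys and the tag value mirror those used in the script.

```python
import json


def build_search_body(tags, attribute_types, enforce_warninglist=True):
    """Build a JSON-serialisable body for attributes/restSearch."""
    body = {
        "returnFormat": "json",
        "tags": tags,
        "type": list(attribute_types),
    }
    if enforce_warninglist:
        # Ask MISP to leave out attributes that match a warninglist;
        # the key is simply omitted when filtering is not wanted.
        body["enforceWarninglist"] = "true"
    return body


body = build_search_body('pandemic:covid-19="cyber"',
                         ["url", "domain", "hostname"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed to PyMISP's `direct_call`, as the script at the end of this post does.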

Dynamic warninglist

I wanted the option to download domains known to be ‘good’ (either external or company-required) and have these excluded from the blocklist as well. The easiest way to do this is to add these domains to a dynamic warninglist. This, together with the export to CSV, is done via a Python script.
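The core of that update can be illustrated on a plain dictionary. The sketch below is a simplified stand-in, not the script itself: `merge_into_warninglist` is a hypothetical helper that adds new domains to the contents of a list.json file, skips duplicates, and increments the version the same way the script does.

```python
def merge_into_warninglist(data, new_domains):
    """Return an updated copy of a warninglist dict, or the original if nothing is new."""
    current = set(data["list"])
    # Normalise input so "Example.com " and "example.com" count as the same domain
    cleaned = {d.strip().lower() for d in new_domains if d.strip()}
    additions = cleaned - current
    if not additions:
        return data
    updated = dict(data)
    updated["list"] = sorted(current | additions)
    updated["version"] = data["version"] + 1
    return updated


warninglist = {
    "name": "List of domains that need to be excluded from blocklists",
    "version": 10,
    "list": ["belgium.be", "google.com"],
}
updated = merge_into_warninglist(warninglist, ["Info-Coronavirus.be", "google.com"])
print(updated["version"], updated["list"])
# 11 ['belgium.be', 'google.com', 'info-coronavirus.be']
```

Returning the original dictionary when nothing changed makes it easy to skip an unnecessary file write.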

Export the list

The export of the blocklist is done via a Python script which calls PyMISP.
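As a sketch of the export step on its own, the function below turns a list of attributes into CSV text with Python's csv module, which takes care of quoting when an event title contains commas or quotes. The attribute layout (value, type, Event id and info) follows the script in this post; `attributes_to_csv` and the sample data are made up for illustration.

```python
import csv
import io


def attributes_to_csv(attributes):
    """Serialise restSearch-style attribute dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["value", "value_type", "event_id", "event_info"])
    for attribute in attributes:
        writer.writerow([
            attribute["value"],
            attribute["type"],
            attribute["Event"]["id"],
            attribute["Event"]["info"],
        ])
    return buffer.getvalue()


# Invented sample attribute; the event title contains a comma and quotes
sample = [{
    "value": "bad.example",
    "type": "domain",
    "Event": {"id": "42", "info": 'Phishing, "urgent" lure'},
}]
csv_text = attributes_to_csv(sample)
print(csv_text)
```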

Extensions and caveats

False Positives

MISP supports a false positive taxonomy. This allows feed providers to indicate the risk of false positives. If you are a feed provider, you can help the community by indicating the risk of false positives.

The quality of the dynamic warninglist, and thus also your final blocklist, highly depends on which data you feed it. You can use your proxy ‘top sites’ or use the registration date (whois) of domains to remove known, legitimate sites. Basically you control the quality of the feed by adjusting the domains which end up in the dynamic warninglist.
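One possible way to pick such domains is sketched below. Everything in it is an assumption for illustration: the input format, the `select_trusted_domains` helper, the thresholds, and the choice to require both a long registration history and frequent proxy hits. The data would come from your own proxy logs and whois lookups.

```python
from datetime import date


def select_trusted_domains(candidates, today, min_age_days=365, min_hits=100):
    """Pick domains that are both long-registered and frequently visited."""
    trusted = []
    for domain, info in candidates.items():
        age_days = (today - info["registered"]).days
        if age_days >= min_age_days and info["proxy_hits"] >= min_hits:
            trusted.append(domain)
    return sorted(trusted)


# Invented sample data: registration date and number of proxy hits per domain
candidates = {
    "intranet.example": {"registered": date(2010, 5, 1), "proxy_hits": 5400},
    "covid-relief.example": {"registered": date(2020, 3, 20), "proxy_hits": 900},
    "old-but-unused.example": {"registered": date(2012, 1, 1), "proxy_hits": 3},
}
trusted = select_trusted_domains(candidates, today=date(2020, 4, 15))
print(trusted)
# ['intranet.example']
```

The recently registered domain is left out even though it is popular, which is the cautious choice when many new pandemic-themed domains are malicious.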

Domains, URLs and Hostnames

The script fetches domains, URLs and hostnames. If you are only interested in domains, change the "type" list in the search query.
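A minimal sketch of both options follows: narrowing the query itself, or keeping the broad query and filtering the result afterwards. The body keys are copied from the script in this post; `values_of_type` is a hypothetical helper and the sample result is invented, shaped like the response the script processes.

```python
# Option 1: restrict the search body to domain attributes only
domain_only_body = {
    "returnFormat": "json",
    "enforceWarninglist": "True",
    "tags": 'pandemic:covid-19="cyber"',
    "type": ["domain"],
}


# Option 2: keep the broad query and filter the result client-side
def values_of_type(result, wanted_types):
    """Return the values of all attributes whose type is in wanted_types."""
    return [
        attribute["value"]
        for attribute in result.get("Attribute", [])
        if attribute["type"] in wanted_types
    ]


sample_result = {"Attribute": [
    {"type": "domain", "value": "bad.example"},
    {"type": "url", "value": "http://bad.example/login"},
    {"type": "hostname", "value": "mail.bad.example"},
]}
domains = values_of_type(sample_result, {"domain"})
print(domains)
# ['bad.example']
```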

Python script

You can find the script on GitHub at https://github.com/cudeso/tools/blob/master/covid-19-feed/blocklist_generator.py and also below. Ideally, run it from crontab, scheduled after the pull from the sync server has completed.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

'''
Koen Van Impe

Create block list from MISP data
Put this script in crontab to run every 15 or 60 minutes, for example
    */15 *    * * *   mispuser   /usr/bin/python3 /home/mispuser/PyMISP/examples/blocklist_generator.py


'''

from pymisp import ExpandedPyMISP
from keys import misp_url, misp_key, misp_verifycert

import logging
import os
import sys
import json
import urllib3


if misp_verifycert is False:
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
misp = ExpandedPyMISP(misp_url, misp_key, misp_verifycert, debug=False, tool='blocklist_generator')
exclude_warninglist = "corp_exclusion"
path_to_warninglist = "/var/www/MISP/app/files/warninglists/lists/{}/list.json".format(exclude_warninglist)


def get_valid_domains():
    return ['belgium.be', 'google.com', 'www.info-coronavirus.be', 'info-coronavirus.be']


def fetch_misp_results(misp_tags):
    relative_path = 'attributes/restSearch'
    body = {
        "returnFormat": "json",
        "enforceWarninglist": "True",
        "tags": misp_tags,
        "type": ["url", "domain", "hostname"],
        "includeDecayScore": "True",
        "includeEventTags": "True"
        }
    result = misp.direct_call(relative_path, body)

    # Full lists carry all columns; the domain-only lists carry just the value
    result_csv = result_tlpwhite_csv = result_falsepositive_low = "value,decay_score,value_type,event_id,event_info"
    result_domain_csv = result_domain_tlpwhite_csv = result_domain_falsepositive_csv = "value"
    if "Attribute" in result:

        for attribute in result["Attribute"]:
            value = attribute["value"]
            value_type = attribute["type"]
            decay_score = 0
            if "decay_score" in attribute:
                decay_score = attribute["decay_score"][0]["score"]
            event_info = attribute["Event"]["info"]
            event_id = attribute["Event"]["id"]
            result_csv = result_csv + "\n{},{},{},{},\"{}\"".format(value, decay_score, value_type, event_id, event_info)
            result_domain_csv = result_domain_csv + "\n{}".format(value)

            # Guard against attributes that come back without any tags
            for t in attribute.get("Tag", []):
                if t["name"] == "tlp:white":
                    result_tlpwhite_csv = result_tlpwhite_csv + "\n{},{},{},{},\"{}\"".format(value, decay_score, value_type, event_id, event_info)
                    result_domain_tlpwhite_csv = result_domain_tlpwhite_csv + "\n{}".format(value)
                if t["name"] == "false-positive:risk=\"low\"":
                    result_falsepositive_low = result_falsepositive_low + "\n{},{},{},{},\"{}\"".format(value, decay_score, value_type, event_id, event_info)
                    result_domain_falsepositive_csv = result_domain_falsepositive_csv + "\n{}".format(value)

    return result_csv, result_tlpwhite_csv, result_falsepositive_low, result_domain_csv, result_domain_tlpwhite_csv, result_domain_falsepositive_csv

# Step 0: Print all enabled warninglists
active_warninglists = misp.warninglists()
for w_list in active_warninglists:
    w_list_detail = w_list["Warninglist"]["name"]
    logging.info("Warninglist enabled {}".format(w_list_detail))

# Step 1: Fetch the list of "valid domains"
valid_domains = get_valid_domains()

# Step 2: Extend the exclusion list
domains_for_exclusion = []
for domain in valid_domains:
    # Check if the domain is already in a warninglist
    lookup_warninglist = misp.values_in_warninglist([domain])
    if lookup_warninglist:
        # It's already in the list, ignore
        res = lookup_warninglist[domain][0]
        list_name = res['name']
        list_id = res['id']
        logging.info("Ignore domain '{}' because already in {} (id {})".format(domain, list_name, list_id))
    else:
        # A new domain, add it to the exclusion list
        domains_for_exclusion.append(domain)
        logging.info("Add domain '{}'".format(domain))

# Step 3: Write the exclusion list
if domains_for_exclusion:
    # First read current file
    logging.info("Reading exclusion file")
    with open(path_to_warninglist) as exclusion_file:
        data = json.load(exclusion_file)

    exclusion_file_version = data["version"]
    current_list = data["list"]
    new_list = (current_list + domains_for_exclusion)
    new_list.sort()

    data["version"] = exclusion_file_version + 1
    data["list"] = new_list

    logging.info("Updating exclusion file")
    with open(path_to_warninglist, 'w') as exclusion_file:
        json.dump(data, exclusion_file)

# Step 4: Update the MISP warning lists
update_result = misp.update_warninglists()
logging.info(json.dumps(update_result))

# Step 5: Fetch all the domains that we want on the blocklist
result_full, result_tlpwhite, result_fp, result_domain, result_domain_tlpwhite, result_domain_fp = fetch_misp_results("pandemic:covid-19=\"cyber\"")

# Step 6: Write the blocklist
logging.info("Write CSV files")
output_dir = "/home/misp/blocklist_upload"
blocklists = {
    "blocklist_full.csv": result_full,
    "blocklist_tlpwhite.csv": result_tlpwhite,
    "blocklist_fp_lowrisk.csv": result_fp,
    "blocklist_domain.csv": result_domain,
    "blocklist_domain_fp_lowrisk.csv": result_domain_fp,
    "blocklist_domain_tlpwhite.csv": result_domain_tlpwhite,
}
for filename, content in blocklists.items():
    # Use a context manager so every file is flushed and closed
    with open(os.path.join(output_dir, filename), "w") as f:
        f.write(content)
