Staying in control of MISP correlations

MISP correlations

MISP correlations are a way to find relationships between attributes and indicators from malware or attack campaigns. Correlations help analysts detect clusters of similar activity and pivot from one event to another.

As the volume of data in your MISP instance grows, however, the number of correlations can explode and make your system less responsive. This post covers some approaches you can use to stay in control.

What is correlation?

In essence, correlation is MISP’s way of indicating that a certain value exists in more than one event. Typical examples include the same IP address observed during different attack campaigns, or the same domain used in various phishing attacks. The correlation feature is not limited to basic technical indicators; you can also correlate on, for example, the same YARA rule.
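The idea of "a value that exists in more than one event" can be sketched in a few lines of Python. This is only a toy model of the concept, not MISP’s actual correlation engine; the function name and the sample events are made up for illustration.

```python
from collections import defaultdict


def find_correlations(events):
    """Return the values that appear in more than one event.

    `events` maps an event ID to the list of attribute values it contains.
    The result maps each shared value to the set of event IDs containing it.
    """
    seen = defaultdict(set)
    for event_id, values in events.items():
        for value in values:
            seen[value].add(event_id)
    # A value only "correlates" when at least two events share it.
    return {value: ids for value, ids in seen.items() if len(ids) > 1}


# Hypothetical sample data: event ID -> attribute values
events = {
    1: ["198.51.100.7", "phish.example.com"],
    2: ["198.51.100.7", "malware.example.net"],
    3: ["phish.example.com"],
}
correlations = find_correlations(events)
```

Here the IP address links events 1 and 2, and the phishing domain links events 1 and 3, which is the kind of pivot an analyst would follow.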

Data model

Correlation takes place at the attribute level. Because attributes are enclosed in an event, correlations are also shown at the event level.

MISP informs you of correlations in three places:

  1. On the event index;
  2. In the event detail page;
  3. Next to the attributes that cause the correlation.

Options for disabling correlation

Reasons for disabling correlation

Reasons you might choose to disable correlation include:

  • The correlation is on attributes that have no real value for your organisation;
  • The value is not very specific. An example is correlating on a destination port: correlating on tcp/80 or tcp/443 is rarely useful, whereas correlating on a high network port (above 1024) can be;
  • Your system does not have sufficient resources to cope with all the correlations.

Per attribute

Correlation in MISP is done in the background and doesn’t require additional effort from an analyst.

That said, you can still prevent correlation from taking place. When you add attributes, whether manually, in batch or via the freetext import, you always have the choice to override the automatic correlation by checking Disable Correlation.

Note that disabling correlation at the attribute level only affects the specific attribute you’re adding or editing. It does not disable the MISP correlation engine for other attributes or other events.
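If you add attributes through the REST API instead of the web interface, the same choice is exposed as the disable_correlation field of the attribute. The sketch below only builds such a request and does not send it; the instance URL, the API key, the event ID and the helper name are placeholders, so adapt them to your own setup and check the API documentation of your MISP version.

```python
import json
import urllib.request

MISP_URL = "https://misp.example.org"  # placeholder: your MISP instance
API_KEY = "YOUR_API_KEY"               # placeholder: your automation key


def build_add_attribute_request(event_id, attr_type, value, disable_correlation=True):
    """Build (but do not send) a request that adds one attribute to an event."""
    payload = {
        "type": attr_type,
        "value": value,
        # Affects only this attribute, not the rest of the event.
        "disable_correlation": disable_correlation,
    }
    return urllib.request.Request(
        url=f"{MISP_URL}/attributes/add/{event_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": API_KEY,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical event ID and value, for illustration only.
request = build_add_attribute_request(42, "ip-dst", "198.51.100.7")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```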

Per event

Instead of disabling correlation per attribute, you can also disable correlation at the event level. By enabling MISP.allow_disabling_correlation you give event creators the option to disable all correlations for a specific event.

Exclude values for correlation

You can exclude correlation from taking place on specific values with correlation exclusion entries. Under Input Filters > List Correlation Exclusions you can add a list of attribute values for which you want no correlation to take place.

From the same page you can also clean up the already existing correlations.
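Conceptually, an exclusion list is a filter applied before values are handed to the correlation engine. The following toy model uses exact string matching only; how MISP matches exclusion entries internally may differ, and the function name and sample values are made up for illustration.

```python
def values_to_correlate(values, exclusions):
    """Return the values that remain eligible for correlation."""
    excluded = set(exclusions)
    return [value for value in values if value not in excluded]


# Hypothetical exclusion entries: values too common to be worth correlating.
exclusions = ["8.8.8.8", "127.0.0.1"]
incoming = ["8.8.8.8", "198.51.100.7", "127.0.0.1", "phish.example.com"]
eligible = values_to_correlate(incoming, exclusions)
```

Only the two values that are not on the exclusion list would still produce correlations.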

Completely disable correlation

You can disable the correlation index of MISP completely by enabling MISP.completely_disable_correlation. Enabling this setting will trigger a full recorrelation of all data, which is an extremely long and costly procedure.

Only enable this if you know what you’re doing.

Correlation, performance and resources

Performance tuning

In some cases your system might have sufficient resources to cope with correlation, but you still experience slowness when logging in. This is most likely because the event index page displays the number of correlations. Correlations aren’t cached, which means they are counted every time the event index page is accessed. You can get a huge performance increase on the event index page by disabling MISP.showCorrelationsOnIndex.

This does not disable correlation. It just prevents correlations from being displayed on the event index page. You still have access to all correlations on the event details page and in the list of attributes.
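Settings such as MISP.showCorrelationsOnIndex are normally changed under Server Settings & Maintenance. For scripted changes, the sketch below builds a request against the server settings edit endpoint of the REST API. It assumes that your MISP version exposes this endpoint at the path shown and that your key has site admin rights; the URL, the key and the helper name are placeholders, and the request is not sent.

```python
import json
import urllib.request

MISP_URL = "https://misp.example.org"  # placeholder: your MISP instance
API_KEY = "YOUR_SITE_ADMIN_KEY"        # placeholder: a site admin automation key


def build_setting_request(setting, value):
    """Build (but do not send) a request that changes one server setting."""
    return urllib.request.Request(
        # Assumed endpoint path; verify it against your MISP version.
        url=f"{MISP_URL}/servers/serverSettingsEdit/{setting}",
        data=json.dumps({"value": value}).encode("utf-8"),
        headers={
            "Authorization": API_KEY,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Stop counting correlations on the event index; correlation itself stays on.
request = build_setting_request("MISP.showCorrelationsOnIndex", False)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```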

“overhead” statistics

MISP helps you diagnose some of the resource issues related to correlation. Under Server Settings & Maintenance > Diagnostics you can use the SQL database status to get an overview of the volume of correlations and how much drive space is reclaimable. Reclaiming this drive space can give your server some extra (storage) breathing room.

Reclaimable space builds up because adding, editing and deleting data leaves unused gaps in the storage of the database tables.

MISP does not include tools to reclaim the drive space via its administration interface. You need to use the mysql administration tool for this. Once you are logged in to mysql, run the command optimize table correlations.

MariaDB [misp]> optimize table correlations;
| Table             | Op       | Msg_type | Msg_text                                                          |
| misp.correlations | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| misp.correlations | optimize | status   | OK                                                                |
2 rows in set (3 min 42.36 sec)

Note that reclaiming space itself requires free space: the table is recreated, so you need at least the size of the table available before you start. If there isn’t enough space, you’ll get an error, and the error message isn’t very informative:

MariaDB [misp]> optimize table correlations;
| Table             | Op       | Msg_type | Msg_text                                                          |
| misp.correlations | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| misp.correlations | optimize | error    | Incorrect key file for table 'correlations'; try to repair it     |
| misp.correlations | optimize | status   | Operation failed                                                  |
3 rows in set, 1 warning (1 min 13.59 sec)
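Because the failure only shows up after the rebuild has already been running for a while, it can be worth checking the free space first. The sketch below compares the table size with the free space on a filesystem. The helper names and the 80 GiB figure are illustrative, the table size has to be taken from the diagnostics page or the database yourself, and the path to check depends on where your database stores its data.

```python
import shutil


def has_room_for_rebuild(table_bytes, free_bytes):
    """The rebuild needs at least the size of the table as free space."""
    return free_bytes >= table_bytes


def free_bytes_on(path):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


# Hypothetical table size; replace "." with your database data directory.
table_bytes = 80 * 1024**3
if has_room_for_rebuild(table_bytes, free_bytes_on(".")):
    print("Enough free space to run optimize table correlations.")
else:
    print("Not enough free space; the optimize would most likely fail.")
```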


MISP correlations provide an excellent way to pivot between different threat events. Coping with the system resources they require can be challenging, however. Rather than disabling the correlation engine, it’s highly advisable to exclude certain low-value attribute values from correlation and to run the SQL optimize command on a regular basis.
