
User manual


(with the fixed name of “ml_detect.py”) and the path to the script configuration file (“ml_detect.conf” in this case). In this way, these files can be placed in any preferred directory: the only constraints are that “integration.conf” is present in the current working directory and that “ml_detect.py” exists in the directory specified in the configuration.

# Format: line 0 = this line, line 1 = path to tstat.conf, line 2 = path to directory containing ml_detect.py, line 3 = path to ml_detect.conf
/home/matteo/suricata-6.0.1/TSTAT/tstat.conf
/home/matteo/suricata-6.0.1/TSTAT/
/home/matteo/suricata-6.0.1/TSTAT/ml_detect.conf

Figure A.2: An example of the content of the integration.conf file.
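The layout of “integration.conf” can be checked before starting the tool. The following is a minimal sketch, not the code used by the tool itself: the names IntegrationConfig, read_integration_conf and missing_files are hypothetical, and the sketch assumes one path per line after the header comment, with blank lines ignored.

```python
from pathlib import Path
from typing import NamedTuple


class IntegrationConfig(NamedTuple):
    tstat_conf: Path      # line 1: path to tstat.conf
    script_dir: Path      # line 2: directory containing ml_detect.py
    ml_detect_conf: Path  # line 3: path to ml_detect.conf


def read_integration_conf(path="integration.conf"):
    """Parse the four-line layout described in the header comment of the file."""
    lines = [ln.strip() for ln in Path(path).read_text().splitlines()]
    # Line 0 is the format comment; skipping blank lines is an assumption.
    entries = [ln for ln in lines[1:] if ln]
    if len(entries) < 3:
        raise ValueError(
            f"{path}: expected 3 paths after the header line, found {len(entries)}"
        )
    return IntegrationConfig(*(Path(p) for p in entries[:3]))


def missing_files(cfg):
    """Return the files required by the manual that do not exist on disk."""
    required = [cfg.tstat_conf, cfg.script_dir / "ml_detect.py", cfg.ml_detect_conf]
    return [p for p in required if not p.is_file()]
```

Calling missing_files(read_integration_conf()) from the working directory returns an empty list when all three files are in place.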

The exact contents of the “tstat.conf” file are explained on the Tstat website [9]: it contains the configuration information needed by Tstat, together with the paths of other Tstat files. Tstat must be configured correctly for the new Suricata tool to work. The “ml_detect.conf” file, instead, contains the absolute paths to each of the available classifiers and their related scalers, in the form of a dictionary: the key on each line is the value that must be used inside the Suricata signatures to classify flows. Finally, the user has to configure Suricata before running the tool. The complete guide to customising Suricata can be found on its website [10]: the “suricata.yaml” configuration file must be edited with the user’s information (e.g. the name of the network interface to monitor), and the path to any additional signature file must be added.

Once all these steps have been completed, the user can start Suricata from the command line with the command

sudo suricata -c <suricata.yaml path> -i <interface name> -l <suricata log dir>

If the operation is successful, the user should see a message similar to the one shown in Figure A.3.

Figure A.3: The output of a successful start of Suricata.
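When Suricata is launched from a script rather than typed by hand, the same invocation can be assembled programmatically. This is only a sketch: the function name suricata_command is hypothetical, and the three paths below are placeholder values to be replaced with those of the local setup.

```python
import shlex


def suricata_command(yaml_path, interface, log_dir):
    """Build the argument list for the invocation shown in the manual."""
    return ["sudo", "suricata", "-c", yaml_path, "-i", interface, "-l", log_dir]


# Placeholder values (assumptions), not paths prescribed by the tool.
cmd = suricata_command("/etc/suricata/suricata.yaml", "eth0", "/var/log/suricata")
print(shlex.join(cmd))
```

The resulting list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems when a path contains spaces.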

[9] http://tstat.polito.it/HOWTO.shtml

[10] https://suricata.readthedocs.io/en/latest/quickstart.html#basic-setup


A.3.2 Creation of new signatures

With the new tool installed and ready to use, the user can classify TCP flows with the chosen machine learning classifiers. To do so, the user can either modify an existing Suricata signature or create a new one. The following example covers the creation of a new signature: the complete list of available keywords and the rule format can be found on the Suricata website [11].

For this example, the file “custom.rules” provided in the repository has been used, but the user can create any number of rule files, remembering to add their paths to the “suricata.yaml” configuration file. First of all, it is necessary to write a classic Suricata rule, like the one provided in Listing A.4.

alert tcp 192.168.178.20 any -> any any (msg:"My custom rule!"; flow:established; content:"facebook"; nocase; classtype:policy-violation; sid:666; rev:1;)

Figure A.4: An example of a trivial Suricata rule.

This example rule is very simple: the action performed on a match is alert and the protocol to look for is tcp. The source address is set to a local host (192.168.178.20), while the source port and the destination address and port are set to any. The flow:established option restricts the match to flows whose TCP handshake has been completed, the content:"facebook" option matches packets that contain the word “facebook” in the payload, and the nocase option tells the engine that the content value is not case-sensitive. The sid value can be chosen freely by the user, taking care not to reuse a value already assigned to another signature, since Suricata would overwrite the previous rule in that case. This trivial rule could be used, for example, to check whether the host 192.168.178.20 visits or searches for the Facebook website, which would constitute a policy violation (classtype:policy-violation in the example).
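Since every signature needs its own sid, it can help to scan the rule files for repeated values before loading them. The following is a naive sketch under stated assumptions: the helper duplicate_sids is hypothetical, each rule is assumed to sit on a single line, and the regular expression does not try to ignore a "sid:" that appears inside a quoted string.

```python
import re
from collections import defaultdict

SID_RE = re.compile(r"\bsid\s*:\s*(\d+)\s*;")


def duplicate_sids(rule_lines):
    """Map each sid used more than once to the 1-based line numbers using it."""
    seen = defaultdict(list)
    for lineno, line in enumerate(rule_lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out rules
        match = SID_RE.search(stripped)
        if match:
            seen[int(match.group(1))].append(lineno)
    return {sid: nums for sid, nums in seen.items() if len(nums) > 1}


rules = [
    'alert tcp 192.168.178.20 any -> any any (msg:"My custom rule!"; '
    'flow:established; content:"facebook"; nocase; '
    'classtype:policy-violation; sid:666; rev:1;)',
    'alert tcp any any -> any any (msg:"Another rule"; sid:666; rev:1;)',
]
print(duplicate_sids(rules))  # {666: [1, 2]}
```

To check a whole file, the lines of “custom.rules” (or any other rule file) can be passed in place of the rules list.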

Then, to add the machine learning classification of the flow and look for specific attacks, the user has to add one or more special keywords to the options of the rule. The keywords supported by the pre-made set of classifiers available in the repository are:

ml_detect_bot to match with flows classified as botnet attacks;

ml_detect_brute to match with flows classified as bruteforce attacks;

ml_detect_dos to match with flows classified as DoS attacks;

ml_detect_loic to match with flows classified as DDoS attacks;

[11] https://suricata.readthedocs.io/en/latest/rules/index.html


ml_detect_web to match with flows classified as web attacks;

ml_detect_all to match with flows classified as any of the previous attacks.

The description of the attacks used to train these classifiers can be found in Section 7.1, while Chapter 3 contains the technical explanation of these attacks.

The user can choose a combination of any of the previous keywords to detect multiple attacks with a single rule: if at least one of the chosen classifiers finds a match, the whole machine learning detection engine considers the flow as matching.

Hence, Suricata considers the overall rule as matching only if both the classic options and the machine learning engine consider the flow as matching. The only exception to this syntax is the ml_detect_all keyword: if it is used, none of the other machine learning keywords can be inserted in the rule, because it already includes all the classifiers. In any case, the Suricata initialiser performs a syntactic and semantic check of all the provided rules and informs the user of any error. Finally, the previous example has been extended with some of these keywords in Listing A.5, to show their usage in practice. In this case, the ml_detect_bot and ml_detect_web keywords have been used: they must be inserted as values of the flow option.

alert tcp 192.168.178.20 any -> any any (msg:"My custom rule with machine learning!"; flow:established, ml_detect_bot, ml_detect_web; content:"facebook"; nocase; classtype:policy-violation; sid:666; rev:1;)

Figure A.5: An example of Suricata rule with the new keywords.
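The matching semantics described above (the classic options must match, at least one of the selected classifiers must flag the flow, and ml_detect_all excludes the other keywords) can be summarised in a few lines. This is an illustrative sketch of the logic only, not the code of the modified Suricata engine: the function names and the verdicts dictionary, which maps a keyword to the hypothetical outcome of its classifier, are assumptions.

```python
ML_KEYWORDS = {
    "ml_detect_bot",
    "ml_detect_brute",
    "ml_detect_dos",
    "ml_detect_loic",
    "ml_detect_web",
}
ML_ALL = "ml_detect_all"


def selected_classifiers(flow_values):
    """Return the classifiers selected by a rule, enforcing ml_detect_all exclusivity."""
    chosen = {v for v in flow_values if v in ML_KEYWORDS or v == ML_ALL}
    if ML_ALL in chosen:
        if len(chosen) > 1:
            raise ValueError(
                "ml_detect_all cannot be combined with other ml_detect_* keywords"
            )
        return set(ML_KEYWORDS)  # ml_detect_all stands for every classifier
    return chosen


def rule_matches(classic_match, flow_values, verdicts):
    """Classic options AND (at least one selected classifier flags the flow)."""
    chosen = selected_classifiers(flow_values)
    if not chosen:
        return classic_match  # rule without machine learning keywords
    return classic_match and any(verdicts.get(k, False) for k in chosen)


flow_values = ["established", "ml_detect_bot", "ml_detect_web"]
print(rule_matches(True, flow_values, {"ml_detect_web": True}))   # True
print(rule_matches(True, flow_values, {"ml_detect_dos": True}))   # False
print(rule_matches(False, flow_values, {"ml_detect_web": True}))  # False
```

In the second call the DoS classifier flags the flow, but the rule only selects the botnet and web classifiers, so the rule does not match.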

Appendix B

Developer manual

This appendix focuses on the technical details useful for developers who would like to modify or expand the provided work. While the installation and usage of the tool are covered in Appendix A, the following sections contain the information needed to add more classifiers, modify the training pipeline, use different datasets, use a different set of features from the same dataset, add other keywords to the Suricata rules or improve the existing work in any other way.

B.1 Requirements

The system requirements to modify the provided tool are the same as those needed to use it as-is and can be found in Appendix A. The new Suricata tool, Tstat and the files provided in this work’s repository may all be necessary, depending on the type of action to perform. If the developer is interested in modifying the training pipeline or adding new models, as explained in Section B.3.2, it may be necessary to install additional Python libraries to use the provided scripts. This can be done with the command pip install matplotlib.