
4.3 Photometric AGN Classification

4.3.1 The data

The experiments in this section have been performed using objects belonging to at least one of the catalogues of galaxies and candidate AGNs provided by Sorrentino et al. (2006), and by Kauffmann et al. (2003).

According to the standard unified model of active galactic nuclei (Antonucci, 1993), AGNs can be classified into two categories, depending on whether the central black hole and its associated continuum and broad emission-line region are viewed directly (a “type 1” AGN) or are obscured (a “type 2” AGN) by the dust torus surrounding the black hole. The AGNs have been separated from normal galaxies using the method devised by Baldwin, Phillips and Terlevich (hereafter BPT; Baldwin et al. 1981), i.e. by considering the intensity ratios of two pairs of relatively strong emission lines and classifying objects according to their position in the so-called BPT diagram.

This section is largely extracted from: Cavuoti, S.; Brescia, M.; D’Abrusco, R.; Longo, G.; Photometric AGN Classification in the SDSS with Machine Learning Methods, to be submitted to MNRAS.

Three different experiments have been performed, using three different samples of photometric objects for which a spectroscopic classification is available.

All three samples have been drawn from the SDSS DR4 catalogue: they belong to the PhotoSpecAll table, containing all the objects for which both photometric and spectroscopic observations have been carried out.

Our sample is the join of the three catalogues described in the following.

Catalogue of Sorrentino et al.

The first catalogue is that of Sorrentino et al. (2006), who separate objects in the redshift range 0.05 < z < 0.095 into Seyfert 1, Seyfert 2 and Not AGN. The catalogue contains 24293 objects.

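An object is classified as an AGN if it lies above at least one of the Kewley lines (Kewley et al., 2001):

$$\log\frac{[\mathrm{O\,III}]\lambda5007}{\mathrm{H}\beta} = \frac{0.61}{\log\left([\mathrm{N\,II}]\lambda6583/\mathrm{H}\alpha\right) - 0.47} + 1.19 \tag{4.1}$$

$$\log\frac{[\mathrm{O\,III}]\lambda5007}{\mathrm{H}\beta} = \frac{0.72}{\log\left([\mathrm{S\,II}]\lambda\lambda6717,6731/\mathrm{H}\alpha\right) - 0.32} + 1.30 \tag{4.2}$$

$$\log\frac{[\mathrm{O\,III}]\lambda5007}{\mathrm{H}\beta} = \frac{0.73}{\log\left([\mathrm{O\,I}]\lambda6300/\mathrm{H}\alpha\right) - 0.59} + 1.33 \tag{4.3}$$

The division obtained according to the Kewley lines is shown in figure 4.17. Once an object is declared an AGN, it is classified as a Seyfert 1 if

$$\mathrm{FWHM}(\mathrm{H}\alpha) > 1.5\,\mathrm{FWHM}([\mathrm{O\,III}]\lambda5007) \tag{4.4}$$

or

$$\mathrm{FWHM}(\mathrm{H}\alpha) > 1200\ \mathrm{km\,s^{-1}} \tag{4.5}$$

and

$$\mathrm{FWHM}([\mathrm{O\,III}]\lambda5007) < 800\ \mathrm{km\,s^{-1}} \tag{4.6}$$

The remaining AGNs are classified as Seyfert 2. They found: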

• 1829 AGN:
  – 725 Sy 1
  – 1105 Sy 2
• 22464 Not AGN

Figure 4.17. The division obtained according to the Kewley lines.
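The Seyfert 1 test of criteria (4.4)-(4.6) can be sketched in a few lines of Python. This is only an illustration: the function and argument names are hypothetical, and the grouping of the conditions as (4.4) OR ((4.5) AND (4.6)) is an assumption, since the text does not state the grouping explicitly.

```python
def is_seyfert1(fwhm_halpha, fwhm_oiii):
    """Return True if an AGN satisfies the Seyfert 1 criteria.

    Both widths are FWHM values in km/s. The grouping
    (4.4) or ((4.5) and (4.6)) is an assumed reading of the text.
    """
    broad_relative = fwhm_halpha > 1.5 * fwhm_oiii  # criterion (4.4)
    broad_absolute = fwhm_halpha > 1200.0           # criterion (4.5)
    narrow_oiii = fwhm_oiii < 800.0                 # criterion (4.6)
    return broad_relative or (broad_absolute and narrow_oiii)


def seyfert_type(fwhm_halpha, fwhm_oiii):
    """Label an object already declared an AGN as 'Sy 1' or 'Sy 2'."""
    return "Sy 1" if is_seyfert1(fwhm_halpha, fwhm_oiii) else "Sy 2"
```

A different grouping of the three conditions would classify some objects differently, so the grouping should be checked against Sorrentino et al. (2006) before reuse.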

Catalogue of Kauffmann et al.

The second catalogue (footnote 9) contains spectral lines and line ratios for 88178 galaxies (0.02 < z < 0.3). Following Kauffmann et al. (2003), we define a zone containing only AGNs, located above the Kewley line of eq. 4.1 (Kewley et al., 2001), and a zone containing no AGNs, located below the Kauffmann line (Kauffmann et al., 2003; Kewley et al., 2006):

$$\log\frac{[\mathrm{O\,III}]\lambda5007}{\mathrm{H}\beta} = \frac{0.61}{\log\left([\mathrm{N\,II}]\lambda6583/\mathrm{H}\alpha\right) - 0.05} + 1.3 \tag{4.7}$$

The division obtained according to the Kauffmann line is shown in figure 4.18; between the two lines lies a mixed zone in which both AGNs and non-AGN objects are present.
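The three zones defined by the Kewley line (eq. 4.1) and the Kauffmann line (eq. 4.7) can be illustrated with a minimal Python sketch. The function names and zone labels are hypothetical. The treatment of objects lying to the right of each curve's vertical asymptote (0.47 and 0.05 respectively) is an assumption, since the text does not discuss it.

```python
def kewley_line(log_nii_ha):
    """Eq. 4.1: Kewley line in the [NII] BPT diagram."""
    return 0.61 / (log_nii_ha - 0.47) + 1.19


def kauffmann_line(log_nii_ha):
    """Eq. 4.7: Kauffmann line in the [NII] BPT diagram."""
    return 0.61 / (log_nii_ha - 0.05) + 1.3


def bpt_zone(log_nii_ha, log_oiii_hb):
    """Assign an object to 'not_agn', 'mixed' or 'pure_agn'.

    Inputs are log([NII]6583/Halpha) and log([OIII]5007/Hbeta).
    Assumption: each curve is only used to the left of its asymptote;
    objects to the right of an asymptote are treated as lying above
    the corresponding line.
    """
    if log_nii_ha < 0.05 and log_oiii_hb < kauffmann_line(log_nii_ha):
        return "not_agn"
    if log_nii_ha >= 0.47 or log_oiii_hb > kewley_line(log_nii_ha):
        return "pure_agn"
    return "mixed"
```

For example, an object at (-0.3, 0.2) lies above the Kauffmann line but below the Kewley line, so under these assumptions it falls in the mixed zone.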

The mixed and pure AGN zones have been further divided into Seyfert and LINER zones by following the Heckman’s line (Heckman, 1980; Kewley et al., 2006):

$$\frac{[\mathrm{O\,III}]\lambda5007}{\mathrm{H}\beta} = 2.1445\,\frac{[\mathrm{N\,II}]\lambda6583}{\mathrm{H}\alpha} + 0.465 \tag{4.8}$$

The division obtained according to that line is shown in figure 4.19.

The resulting five areas of the AGN map are shown in figure 4.20.

9 http://www.mpa-garching.mpg.de/SDSS/DR4/

Figure 4.18. The division obtained according to the Kauffmann line.

Figure 4.19. The division obtained according to the Heckman line.

Figure 4.20. A representation of the catalogue in the BPT diagram.

Catalogue of D’Abrusco et al.

The catalogue of D’Abrusco et al. (2007) contains photometric redshifts with an estimated accuracy of σ_rob = 0.02, evaluated on the residuals (z_phot − z_spec).

The final catalogues

The final catalogues for our experiments were built in four phases. The first was to join the catalogues of Kauffmann and Sorrentino: for objects present in both, the type 1 and type 2 classification was taken from Sorrentino and all other classes from Kauffmann. Merging these two catalogues yielded a total of 108162 objects. The second phase was to match the resulting catalogue against that of D’Abrusco in order to add the photometric redshift information, obtaining a final catalogue of 100069 objects.
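The merging rule of the first phase can be sketched as follows. This is a minimal illustration, not the procedure actually used: the object identifiers, the label strings and the two-field record layout are all hypothetical, chosen only to show how the Seyfert type is taken from Sorrentino and every other class from Kauffmann.

```python
SEYFERT_TYPES = ("Type1", "Type2")


def merge_catalogues(sorrentino, kauffmann):
    """Merge two {object_id: label} dictionaries.

    For every object the result stores:
      'type': the Sorrentino label if it is Type1/Type2, else None;
      'bpt':  the Kauffmann label when available, otherwise the
              Sorrentino label if that label is not a Seyfert type.
    """
    merged = {}
    for obj_id in sorrentino.keys() | kauffmann.keys():
        s_label = sorrentino.get(obj_id)
        k_label = kauffmann.get(obj_id)
        type_label = s_label if s_label in SEYFERT_TYPES else None
        if k_label is not None:
            bpt_label = k_label
        elif type_label is None:
            bpt_label = s_label
        else:
            bpt_label = None
        merged[obj_id] = {"type": type_label, "bpt": bpt_label}
    return merged
```

With this layout an object found in both catalogues keeps both pieces of information, which corresponds to the combined "overlap" classes used later in the text.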

The third stage was to extract the corresponding photometric information for the selected objects from the SDSS archive.

The information retrieved from the SDSS DR4 database for each object was:

• fiberMag_u, fiberMag_g, fiberMag_r, fiberMag_i, fiberMag_z

• petroMag_u, petroMag_g, petroMag_r, petroMag_i, petroMag_z

• petroR50_u, petroR50_g, petroR50_r, petroR50_i, petroR50_z

• petroR90_u, petroR90_g, petroR90_r, petroR90_i, petroR90_z

• dered_u, dered_g, dered_r, dered_i, dered_z

After a statistical pruning, only a subsample of these parameters, described in more detail in section 4.3.2, was used for the final experiments.

From the resulting dataset, all objects with undefined values (NaN) for any of their parameters were removed, because empirical machine learning models may be misled by such unknown terms.
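The NaN removal step can be illustrated with a short Python sketch. The row layout (one dictionary per object) and the function name are assumptions made for the example. A missing key is treated here like an undefined value, which is also an assumption.

```python
import math


def drop_incomplete(rows, columns):
    """Keep only the rows whose listed columns all hold defined numbers.

    rows:    list of dicts mapping column name to a float value
    columns: names of the parameters that must be defined
    """
    def is_undefined(row, name):
        value = row.get(name)
        return value is None or math.isnan(value)

    return [
        row for row in rows
        if not any(is_undefined(row, name) for name in columns)
    ]
```

For instance, a row with `dered_u` equal to `float("nan")` is discarded even when all its other parameters are valid.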

The last phase was the creation of three different catalogues for the three different experiments:

1. AGN vs Not AGN

2. Type 1 vs Type 2

3. Seyferts vs LINERs

CLASS                 CATALOGUE    Exp.1    Exp.2    Exp.3
NotAGN                All          Class0
Type1                 Sorrentino   Class1   Class1
Type2                 Sorrentino   Class1   Class0
Mix-LINER             Kauffmann    Class0
Mix-Seyfert           Kauffmann    Class0
Pure-LINER            Kauffmann    Class1            Class0
Pure-Seyfert          Kauffmann    Class1            Class1
Mix-LINER-Type1       overlap      Class0
Mix-Seyfert-Type1     overlap      Class0
Pure-LINER-Type1      overlap      Class1   Class1   Class0
Pure-Seyfert-Type1    overlap      Class1   Class1   Class1
Mix-LINER-Type2       overlap      Class0
Mix-Seyfert-Type2     overlap      Class0
Pure-LINER-Type2      overlap      Class1   Class0   Class0
Pure-Seyfert-Type2    overlap      Class1   Class0   Class1
SIZE: 24293           Sorrentino   84885    1570     30380
SIZE: 88178           Kauffmann

Table 4.9. The dataset composition after merging the original catalogues. Empty fields indicate classes not used in that experiment. The division between class 0 and class 1 refers to the target vector (used during training). The final sizes of the three experiment datasets are those obtained after matching with the D’Abrusco photo-z catalogue and after the whole NaN removal process.

The dataset for the first experiment is the whole dataset, with 84885 objects. The dataset for the second experiment contains only the objects from the catalogue of Sorrentino that are pure AGNs, for a total of 1570 objects. The dataset for the third experiment contains the objects from the catalogue of Kauffmann that are pure AGNs, divided into LINERs and Seyferts, for a total of 30380 objects. The three datasets are summarized in table 4.9.
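The construction of the target vectors can be sketched as a lookup that mirrors table 4.9. The class label strings and the function names are hypothetical, and the assignment of each entry to an experiment column follows our reading of the table. `None` marks a class that is not used in an experiment.

```python
# label -> (Exp.1, Exp.2, Exp.3) targets; None means "not used".
TARGETS = {
    "NotAGN":             (0, None, None),
    "Type1":              (1, 1, None),
    "Type2":              (1, 0, None),
    "Mix-LINER":          (0, None, None),
    "Mix-Seyfert":        (0, None, None),
    "Pure-LINER":         (1, None, 0),
    "Pure-Seyfert":       (1, None, 1),
    "Mix-LINER-Type1":    (0, None, None),
    "Mix-Seyfert-Type1":  (0, None, None),
    "Pure-LINER-Type1":   (1, 1, 0),
    "Pure-Seyfert-Type1": (1, 1, 1),
    "Mix-LINER-Type2":    (0, None, None),
    "Mix-Seyfert-Type2":  (0, None, None),
    "Pure-LINER-Type2":   (1, 0, 0),
    "Pure-Seyfert-Type2": (1, 0, 1),
}


def target_for(label, experiment):
    """Return the 0/1 target of a class in experiment 1, 2 or 3."""
    return TARGETS[label][experiment - 1]


def build_dataset(labels, experiment):
    """Return (index, target) pairs for the objects used in an experiment."""
    pairs = []
    for index, label in enumerate(labels):
        target = target_for(label, experiment)
        if target is not None:
            pairs.append((index, target))
    return pairs
```

Calling `build_dataset(labels, 2)` keeps only the objects that carry a type 1 or type 2 label in the table's Exp.2 column and drops all the others.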