Supplementary Data
BUSCA: an integrative web server to predict subcellular localization of proteins
Castrense Savojardo, Pier Luigi Martelli, Piero Fariselli, Giuseppe Profiti, Rita Casadio
Supplementary data for Gram+ bacteria
We extracted from UniProtKB/Swiss-Prot all protein sequences belonging to the Firmicutes and Actinobacteria phyla that have an experimentally annotated subcellular localization. After homology reduction to 25% sequence identity and removal of protein sequences already included in the BUSCA training sets, we obtained 1,667 non-redundant protein sequences, whose subcellular localizations are distributed as follows: 510 cytoplasmic, 245 extracellular and 912 plasma membrane.
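As a quick consistency check on the figures above, the following minimal sketch sums the three compartment counts and derives the share of each class. The variable names are illustrative only and are not part of BUSCA.

```python
# Compartment counts as reported in the text (Gram+ benchmark dataset).
counts = {
    "Cytoplasm": 510,
    "Extracellular": 245,
    "Plasma membrane": 912,
}

# The three classes should add up to the reported dataset size.
total = sum(counts.values())

# Fraction of the dataset that falls into each compartment.
fractions = {name: n / total for name, n in counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {counts[name]} sequences ({frac:.1%})")
print(f"Total: {total}")
```

The counts sum to the reported 1,667 sequences, and the extracellular class is the smallest of the three (about 15% of the dataset).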
BUSCA was run in Gram+ mode on this dataset and results are reported in Table S1.
Compartment       Precision   Recall   MCC
Cytoplasm         0.70        0.96     0.73
Extracellular     0.51        0.45     0.40
Plasma membrane   0.96        0.78     0.74
Supplementary Table S1. Performance evaluation of BUSCA on a Gram-positive dataset derived from UniProtKB/Swiss-Prot. MCC: Matthews correlation coefficient.
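The text does not spell out how the per-compartment scores in Table S1 were computed. The sketch below assumes the standard one-vs-rest definitions, in which each compartment is in turn treated as the positive class and the other two as negatives. The function name `one_vs_rest_metrics` and the toy labels are hypothetical and serve only as illustration; they are not BUSCA output.

```python
from math import sqrt


def one_vs_rest_metrics(true_labels, predicted_labels, compartment):
    """Return (precision, recall, MCC) for one compartment versus all others.

    Assumes the standard definitions; undefined ratios are reported as 0.0.
    """
    tp = fp = fn = tn = 0
    for true, pred in zip(true_labels, predicted_labels):
        if pred == compartment and true == compartment:
            tp += 1  # correctly assigned to the compartment
        elif pred == compartment:
            fp += 1  # assigned to the compartment, but belongs elsewhere
        elif true == compartment:
            fn += 1  # belongs to the compartment, but assigned elsewhere
        else:
            tn += 1  # correctly kept out of the compartment

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return precision, recall, mcc


# Toy labels for illustration only (C = cytoplasm, E = extracellular,
# M = plasma membrane); these are NOT taken from the benchmark dataset.
true_labels = ["C", "C", "E", "M", "M", "M"]
predicted_labels = ["C", "C", "C", "M", "M", "E"]

precision, recall, mcc = one_vs_rest_metrics(true_labels, predicted_labels, "C")
print(f"precision={precision:.2f} recall={recall:.2f} MCC={mcc:.2f}")
```

Under these definitions, a compartment with high recall but lower precision (as for Cytoplasm in Table S1) is one that captures most of its true members while also attracting proteins from other compartments.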