Virus Research xxx (2012) xxx–xxx
Contents lists available at SciVerse ScienceDirect
Virus Research
journal homepage: www.elsevier.com/locate/virusres
Selected amino acid changes in HIV-1 subtype-C gp41 are associated with specific gp120 V3 signatures in the regulation of co-receptor usage
Salvatore Dimonte a,∗, Muhammed Babakir-Mina b,c, Fabio Mercurio a, Domenico Di Pinto a, Francesca Ceccherini-Silberstein a, Valentina Svicher a, Carlo-Federico Perno a,b,d
a University of Rome Tor Vergata, via Montpellier 1, 00133 Rome, Italy
b Laboratory of Molecular Virology, Foundation Polyclinic Tor Vergata, via Oxford 81, 00133 Rome, Italy
c Foundation of Technical Education in Sulaimaniyah, Iraqi Kurdistan Region, Iraq
d National Institute of Infectious Diseases (INMI) L. Spallanzani, via Portuense 292, 00149 Rome, Italy
article info
Article history: Received 6 April 2012; Received in revised form 13 June 2012; Accepted 15 June 2012; Available online xxx
Keywords: HIV-1; Subtype-C; gp41; gp120 V3 loop; Genotype; Tropism; Mutations; Cluster
abstract
The majority of studies have characterized the tropism of HIV-1 subtype-B isolates, but little is known about the determinants of tropism in other subtypes. The goal of the present study was therefore to genetically characterize the viral envelope proteins in terms of co-receptor usage by analyzing 356 full-length env sequences derived from HIV-1 subtype-C infected individuals. The co-receptor usage of V3 sequences was inferred using the Geno2Pheno and PSSM algorithms and was also assessed with the “11/25 rule”. All env sequences were also analyzed with regard to N-linked glycosylation sites, net charge and hydrophilicity; in addition, the binomial correlation phi coefficient was calculated to assess covariation among gp120 V3 and gp41 signatures, and average linkage hierarchical agglomerative clustering was performed. Among the env sequences present in the Los Alamos Database, 255 and 101 sequences predicted as CCR5- and CXCR4-using, respectively, were selected. The classical V3 signatures at positions 11 and 25, and other specific V3 and gp41 amino acid changes, were found to be statistically associated with different co-receptor usage. Furthermore, several statistically significant associations between V3 and gp41 signatures were also observed. The dendrogram topology showed a cluster associated with CCR5 usage composed of five gp41 mutated positions (A22V, R133M, E136G, N140L, and N166Q) that clustered with T2V and G24T in V3 (bootstrap = 1). Conversely, a heterogeneous cluster associated with CXCR4 usage, involving S11GR, 13–14insIG/LG, P16RQ, Q18KR, F20ILV, D25KRQ, and Q32KR in V3 along with A30T, S107N, D148E, and A189S in gp41, was identified (bootstrap = 0.86).
Our results show that, as observed for HIV-1 subtype-B, also in subtype-C specific and different gp41 and gp120 V3 amino acid changes are associated individually or together with CXCR4 and/or CCR5 usage. These findings strengthen previous observations that determinants of tropism may also reside in the gp41 protein.
© 2012 Published by Elsevier B.V.
1. Introduction
Ninety percent of HIV-1-infected people worldwide harbor non-B-subtype variants, and consequently the vast majority of infections are due to these viruses (Arien et al., 2007). Globally, the C-subtype is the most prevalent circulating viral clade and accounts for nearly half of infections, followed by the A, B, and G subtypes and the recombinant forms CRF02-AG and CRF01-AE (Hemelaar et al., 2011).
Higher rates of non-synonymous mutation tend to occur in regions of the HIV-1 env gene that are subject to strong selective pressure from the immune system (Choisy et al., 2004; Lemey et al., 2006; Mikhail et al., 2005; Zhang et al., 2005). A structure of particular importance in this process is the third variable loop (V3) of the surface glycoprotein gp120, which is essential for HIV-1 co-receptor usage (de Jong et al., 1992; Fouchier et al., 1992; Huang et al., 2005). In most European countries, HIV tropism is identified with phenotypic tropism testing. New data support genotypic analysis of V3 for the identification of HIV-1 tropism (Vandekerckhove et al., 2011).
∗ Corresponding author. Tel.: +39 0672596564; fax: +39 0672596039. E-mail address: salvatore.dimonte@uniroma2.it (S. Dimonte).
HIV-1 enters the host cell through binding of gp120 to the CD4 receptor on a target cell, leading to conformational changes within gp120 that allow the engagement of a second host cell receptor (co-receptor) (Alkhatib et al., 1996; Choe et al., 1996; Deng et al., 1996; Doranz et al., 1996; Dragic et al., 1996; Feng et al., 1996; Trkola et al., 1996; Wu et al., 1996). About 20 G-protein-coupled receptors (GPCRs) have been shown to act in vitro as co-receptors (Neil et al., 2005; Shimizu et al., 2009; Simmons et al., 2000), but only CCR5 and CXCR4 are considered essential and apparently relevant in HIV pathogenesis (Berger et al., 1998; Simmons et al., 2000; Zhang et al., 1998). Moreover, the interaction with the co-receptor induces the arrest of the gp41 transitions at a pre-hairpin intermediate stage that leads to the insertion of the fusion peptide into the target cell membrane and ultimately to virus-cell fusion (Eckert and Kim, 2001; Wyatt and Sodroski, 1998).
0168-1702/$ – see front matter © 2012 Published by Elsevier B.V.
gp41 is a transmembrane glycoprotein that retains gp120 on the viral surface through non-covalent interactions (Helseth et al., 1991), and some studies indicate that several mutations in gp41 are significantly associated with co-receptor usage (Dimonte et al., 2011a; Huang et al., 2008; Stawiski et al., 2009; Thielen et al., 2009, 2010), beyond the primary classical determinants in gp120, particularly positions 11 and 25 in the V3 loop (de Jong et al., 1992; Fouchier et al., 1992; Resch et al., 2001), and secondarily other flanking domains (such as V1, V2, C3, C4 and V5) (Carrillo and Ratner, 1996; Huang et al., 2008, 2011; Koito et al., 1995; Labrosse et al., 2001; Lin et al., 2011; Pastore et al., 2006; Svicher et al., 2011b; Suphaphiphat et al., 2007). Both the CCR5 and CXCR4 co-receptors interact with the same region of the surface gp120 viral protein, which encompasses not only the V3 loop but also specific regions from the V1/V2 and C4 domains (Sierra et al., 2007).
Moreover, a few amino acid substitutions and an increased net charge of the V3 loop were sufficient to confer a change from CCR5 to CXCR4 cellular tropism (de Jong et al., 1992; De Wolf et al., 1994). In addition, previous studies have described the loss of a potential N-linked glycosylation site (PNGS) at V3 positions 6–8 (Pollakis et al., 2001), as well as a close association between the V3-loop N-linked glycosylation motifs (sequons) and CXCR4 usage (Clevestig et al., 2006).
The molecular mechanisms underlying the transition from CCR5 to CXCR4 usage in clade C virus remain poorly understood. With the recent introduction of HIV-1 chemokine receptor antagonists on the market as components of antiretroviral therapy, it is increasingly important to properly screen co-receptor usage for all infected patients prior to therapy (Hunt and Romanelli, 2009; Sayana and Khanlou, 2009). Hence, simple and efficient processes for routinely characterizing and monitoring HIV-1 co-receptor usage are needed to replace slow and resource-intensive phenotypic assays. Existing methods do not consider the other gp120 regions, mainly because of the limited data available, although incorporating the V2 loop is known to improve prediction methods based on V3 sequence information (Prosperi et al., 2009), and key genetic elements in the V1, V2, and C4 domains tightly and differentially modulate HIV-1 dependency on CXCR4 or CCR5, irrespective of the V3 genetic background (Svicher et al., 2011a). Genotypic determinants of co-receptor usage located outside V3 could also explain some of the mispredictions (Raymond et al., 2010).
In this study, large datasets of HIV-1 C-subtype gp120 V3 and gp41 sequences were analyzed to genetically characterize them in terms of co-receptor usage. In addition, according to CCR5 and/or CXCR4 usage, the associations among amino acid signatures, average hydrophilicity, net charge, and number of N-linked glycosylation sites were defined for V3 and gp41.
2. Materials and methods
2.1. Sequence analysis
The analysis included 312 HIV-1 C-subtype full-length env sequences and a further 44 HIV-1 C-subtype V3 sequences, retrieved from the Los Alamos Database (overall from 356 infected individuals at all stages of infection, with one isolate per patient) (http://www.hiv.lanl.gov) (Table S1). The treatment status of the individuals is not available in the Los Alamos Database. Multiple sequence alignments of the V3 and gp41 segments were performed using ClustalX (Thompson et al., 1997) and manually edited with the BioEdit software (Hall, 1999). Published env consensus sequences of pure HIV-1 subtypes (A, B, C, D, F1, F2, G, H, J, and K) were used, and the multi-aligned sequences were subjected to phylogenetic inference through the Neighbor-Joining method and the Kimura two-parameter model implemented in the MEGA 4 package (Tamura et al., 2007). One thousand bootstrap replicates were used to assess the phylogenetic robustness of the clusters.
2.2. Tropism prediction
Within all 356 env sequences, the V3 region was extracted and submitted for tropism prediction to the Geno2Pheno algorithm (http://coreceptor.bioinf.mpi-inf.mpg.de) and to the Position Specific Scoring Matrices (PSSM) algorithm (http://fortinbras.us/cgi-bin/fssm/fssm.pl) (Vandekerckhove et al., 2011).
Geno2Pheno was preferred because it features an adjustable cutoff. Beyond tropism prediction, it assigns to each V3 sequence a score, called the False Positive Rate (FPR), ranging from 0% to 100%, which represents the probability of falsely classifying a CCR5-using virus as CXCR4-using. On the basis of the FPR values, we arbitrarily selected sequences with FPR ≤5% (indicating a strong CXCR4 prediction) and sequences with FPR ≥80% (indicating a strong CCR5 prediction) as CXCR4- and CCR5-tropic viruses, respectively. These sequences, together with the related gp41 sequences, were then used for the rest of the study.
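The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_by_fpr` and the example FPR values are hypothetical, and the FPR itself is assumed to have been obtained beforehand from the Geno2Pheno web service (the sketch does not call it).

```python
from typing import Optional

# Cutoffs taken from the text: FPR <= 5% is read as a strong CXCR4 prediction,
# FPR >= 80% as a strong CCR5 prediction. Intermediate values are not selected.
CXCR4_MAX_FPR = 5.0
CCR5_MIN_FPR = 80.0


def classify_by_fpr(fpr_percent: float) -> Optional[str]:
    """Return 'CXCR4', 'CCR5', or None when the sequence is not selected."""
    if not 0.0 <= fpr_percent <= 100.0:
        raise ValueError("FPR must lie between 0 and 100")
    if fpr_percent <= CXCR4_MAX_FPR:
        return "CXCR4"
    if fpr_percent >= CCR5_MIN_FPR:
        return "CCR5"
    return None


# Hypothetical FPR values, for illustration only (not data from the study).
example_fprs = {"seq_a": 1.7, "seq_b": 92.4, "seq_c": 35.0}
selected = {name: classify_by_fpr(fpr) for name, fpr in example_fprs.items()}
```

Sequences falling between the two cutoffs (such as the hypothetical `seq_c`) map to `None`, mirroring the choice to retain only strong predictions.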
For Fortinbras PSSM, an easy and rapid bioinformatic method for viral tropism estimation written by the original WebPSSM developer, the recently provided subtype-C specific matrix was used (Jensen et al., 2003).
2.3. N-linked glycosylation motifs prediction
We assessed the N-linked glycosylation motifs (sequons) in all 356 HIV-1 C-subtype V3 sequences using the LANL N-GlycoSite program (http://www.hiv.lanl.gov) (Table S1). The sequons were defined by the amino acid order asparagine-X-threonine/serine-Y (N-X-S/T-Y) (Marshall, 1972), where X can be any amino acid except proline (P) in the threonine (T) context (Gavel and von Heijne, 1990; Kasturi et al., 1997; Mellquist et al., 1998) and also not tryptophan (W), aspartic acid (D) or glutamic acid (E) in a serine (S) context (Kasturi et al., 1997). These parameters provided us with a high probability of oligosaccharide addition (Gavel and von Heijne, 1990; Kasturi et al., 1997; Mellquist et al., 1998; Shakin-Eshleman et al., 1996) and the criteria for evaluating each sequon as a possible N-linked glycosylation site. Sequences exhibiting ambiguities in a site were excluded from this calculation.
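As a rough illustration of the sequon criteria above, the following Python sketch scans a V3 amino acid sequence for N-X-S/T motifs. It is not the LANL N-GlycoSite implementation. Two points are assumptions: proline is also excluded at X in the serine context (one reading of "and also not"), and no constraint is placed on the Y position, because the text does not state one. The 35-residue example sequence is an assumed wild-type-like subtype-C V3 loop, consistent with the wild-type residues named in the text.

```python
from typing import List


def find_sequons(seq: str) -> List[int]:
    """Return the 1-based positions of the N in each N-X-S/T motif found."""
    seq = seq.upper()
    positions = []
    for i in range(len(seq) - 2):
        first, x, third = seq[i], seq[i + 1], seq[i + 2]
        if first != "N":
            continue
        if third == "T" and x != "P":
            # Threonine context: X may be anything except proline
            positions.append(i + 1)
        elif third == "S" and x not in "PWDE":
            # Serine context: X must not be P (assumed), W, D or E
            positions.append(i + 1)
    return positions


# Assumed example V3 loop (35 residues) and a hypothetical T8K variant of it.
V3_EXAMPLE = "CTRPNNNTRKSIRIGPGQTFYATGDIIGDIRQAHC"
V3_T8K = "CTRPNNNKRKSIRIGPGQTFYATGDIIGDIRQAHC"

sequons_wild_type = find_sequons(V3_EXAMPLE)  # sequon starting at N6
sequons_t8k = find_sequons(V3_T8K)            # the N6-X-T8 sequon is lost
```

Sequences containing ambiguous residues would need to be filtered out before calling the function, in line with the exclusion stated above.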
2.4. V3-loop amino acid physical and chemical properties
The net charge and the average hydrophilicity of the V3 loop at pH 7.0 were determined for each sequence using a desktop-based bioinformatics system, the Peptide Property Calculator from Innovagen (http://www.innovagen.se). All possible permutations were assessed when amino acid mixtures were found at some codons of V3. To compare the values between the C- and B-subtypes, the values of hydrophobicity and surface probability of the gp120 V3-loop region were also calculated for V3 B-subtype sequences (one per patient) with available phenotypic determination of HIV-1 tropism (114 CXCR4- and 582 CCR5-tropic viruses, respectively; Los Alamos Database).
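For orientation, the net charge of a V3 loop near neutral pH can be approximated by counting basic and acidic residues. The sketch below is a simplification and not the algorithm used by Innovagen's Peptide Property Calculator: it ignores histidine, the peptide termini and pKa-based partial charges. The function name and the example sequences are assumptions introduced for illustration.

```python
def approximate_net_charge(seq: str) -> int:
    """Approximate net charge at pH 7.0 as (K + R) minus (D + E)."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


# Assumed wild-type-like subtype-C V3 loop and a hypothetical S11R + D25K variant.
V3_EXAMPLE = "CTRPNNNTRKSIRIGPGQTFYATGDIIGDIRQAHC"
V3_S11R_D25K = "CTRPNNNTRKRIRIGPGQTFYATGKIIGDIRQAHC"

charge_example = approximate_net_charge(V3_EXAMPLE)    # 5 basic - 2 acidic
charge_variant = approximate_net_charge(V3_S11R_D25K)  # 7 basic - 1 acidic
```

Under this simple count, introducing basic residues at positions 11 and 25 raises the approximate charge of the example loop from +3 to +6.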
2.5. Verification of tropism prediction
To further support the correlation of V3 and gp41 mutations with different co-receptor usage, and the correlation among these Env amino acid signatures, all sequences available from the Los Alamos Database with pure phenotype and/or co-receptor determinations were considered (for V3: 423 CCR5- and 48 CXCR4-using viruses, respectively; for gp41: 106 CCR5- and 19 CXCR4-using viruses, respectively) (Table S1).
2.6. Statistical analysis
To analyze gp41 and V3 mutations, we calculated the frequency of all mutations in the 353 gp41 amino acids and 35 V3 amino acids, using the selected env sequences. Fisher exact tests were used to determine whether the differences in frequency between the 2 groups of patients (isolates with strong CCR5 and strong CXCR4 prediction, respectively) were statistically significant.
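The comparison of mutation frequencies between the two groups can be illustrated with a small, dependency-free implementation of the two-sided Fisher exact test for a 2×2 table. This is a sketch, not the software used in the study. The two-sided P value is computed here by summing the probabilities of all tables no more likely than the observed one, which is one common convention. The counts in the usage example are hypothetical round numbers chosen to resemble a mutation absent from one group.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables at least as extreme (no more probable) than observed;
    # the small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = CCR5- and CXCR4-predicted groups,
# columns = sequences with and without a given mutation.
p_value = fisher_exact_two_sided(0, 255, 13, 88)
```

Each row holds one group and each column the presence or absence of the mutation; swapping rows or columns leaves the two-sided P value unchanged.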
The Benjamini–Hochberg method was used to identify results that were statistically significant in the presence of multiple-hypothesis testing (Benjamini and Hochberg, 1995). A false discovery rate of 0.05 was used to determine statistical significance.
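The Benjamini–Hochberg step-up procedure mentioned above can be sketched as follows. The function name and the example P values are illustrative assumptions; only the procedure itself (rank the P values, find the largest rank k with p(k) ≤ (k/m)·FDR, and declare the k smallest significant) follows the cited method.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each P value, whether it is significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose P value satisfies p_(k) <= (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    significant = [False] * m
    # Every hypothesis ranked at or below max_rank is declared significant
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical P values, for illustration only.
flags = benjamini_hochberg([0.001, 0.5, 0.02, 0.04], fdr=0.05)
```

Because the procedure is step-up, a P value that misses its own rank threshold can still be declared significant when a larger P value passes a later threshold.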
To identify significant patterns of pairwise associations between V3 and gp41 mutations, we calculated the ϕ coefficient and its statistical significance for each pair of mutations. A positive and statistically significant correlation between mutations at two specific positions (0 < ϕ < 1; P < 0.05) indicates that the two positions mutate in a correlated manner, possibly to confer an advantage in terms of co-receptor selection, and that the co-occurrence of these mutations is unlikely to be due to chance. Moreover, to analyze the covariation structure of mutations in more detail, we performed average linkage hierarchical agglomerative clustering (Dimonte et al., 2011b; Svicher et al., 2009). Mann–Whitney U tests were used to assess statistically significant differences among all the associated pairwise mutations. Statistical tests were corrected for multiple-hypothesis testing by using the Benjamini–Hochberg method at a false discovery rate of 0.05 (Benjamini and Hochberg, 1995).
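The ϕ coefficient for a pair of mutations can be computed from the 2×2 table of their joint presence or absence across sequences. In the sketch below the P value is obtained from the chi-square approximation with one degree of freedom (χ² = n·ϕ²). This is an assumption made for illustration: the text does not state how the significance of ϕ was computed. The function name and input format are likewise hypothetical.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def phi_coefficient(x: Sequence[bool], y: Sequence[bool]) -> Tuple[float, float]:
    """Return (phi, P) for two equally long presence/absence vectors."""
    if len(x) != len(y):
        raise ValueError("both vectors must cover the same sequences")
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = n - n11 - n10 - n01
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when a mutation is present in all or no sequences")
    phi = (n11 * n00 - n10 * n01) / denom
    # Chi-square statistic with 1 degree of freedom is n * phi^2;
    # its upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(n * phi * phi / 2))
    return phi, p_value
```

A value of ϕ close to +1 means the two mutations tend to occur in the same sequences, ϕ close to −1 means they tend to exclude each other, and ϕ near 0 indicates no association.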
Using again the nonparametric Mann–Whitney U test, we compared the mean net charge and the mean hydrophilicity of the V3 amino acid sequences of the 255 CCR5- and 101 CXCR4-using viruses.
3. Results and discussion
Whether genotypic algorithms built from B-subtype virus data correctly predict the tropism of non-B viruses has been questioned (Garrido et al., 2008), despite recent observations suggesting that they perform well in predicting the tropism of HIV-1 clade C virus (Raymond et al., 2010). Moreover, a study comparing the predictive performance of Geno2Pheno, PSSM and other methods against the first-generation Trofile® assay (validated for HIV-1 tropism determination) concluded that the concordance was as high as 91% (Raymond et al., 2008). Similarly, another work reported that HIV-1 tropism determination via plasma viral V3 RNA genotyping coupled with Geno2Pheno interpretation may represent a valid alternative to the enhanced sensitivity Trofile® assay (Prosperi et al., 2009).
HIV-1
B-subtype,
gp120
mutations
in
the
V3
and
V1/V2
225
domains
are
required
for
co-receptor
switching,
but
in
C-subtype
226
there
is
a
much
stronger
genetic
barrier
to
co-receptor
switching
227
that
involves
the
requirement
for
more
extensive
changes
outside
228
the
V3
region
(
Coetzer
et
al.,
2011
).
Hence,
the
contribution
of
the
229
other
gp120
regions
in
directing
co-receptor
usage
was
excluded
230in
this
study.
3.1. Physical and chemical V3 properties and prevalence of V3 mutations
A total of 356 HIV-1 C-subtype V3-containing env sequences were collected from the Los Alamos HIV Sequence Database. Among them, 312 also contained the gp41 genome region. The Geno2Pheno algorithm was used to infer HIV-1 co-receptor usage for all 356 V3-containing env sequences. Among them, 255 were CCR5-using (with FPR ≥80%) and 101 CXCR4-using (with FPR ≤5%). The prediction of co-receptor usage was fully confirmed using both the Fortinbras PSSM algorithm and the “net charge” and “11/25” rules (Table 1) (Vandekerckhove et al., 2011). Thus, these 3 interpretation methods for tropism prediction provide superimposable results.
Previous studies have shown that CXCR4-using viruses are found infrequently in HIV-1 C-subtype infection compared to B-subtype (Cecilia et al., 2000; Ndung’u et al., 2006; Pollakis et al., 2004; Zhang et al., 2006); this can explain the low number of CXCR4-related env sequences retrieved and employed for the entire study.
By evaluating the V3-loop sequences, we identified 11 amino acids at specific V3 positions whose prevalence was significantly higher in CCR5-using than in CXCR4-using viruses (P values from 1.40E−30 to 1.66E−2) (Fig. 1). All of them (D25D and S11S, and T2V, N5N, N6N, N7N, K10ET, P16P, G24T, D29N and Q32E) had a prevalence ≥10% (ranging from 12.2% to 100%) in CCR5-using viruses. We also identified 46 amino acids at specific V3 positions whose prevalence was significantly higher in CXCR4-using than in CCR5-using viruses, suggesting their association with CXCR4 usage (P values from 2.34E−38 to 4.49E−2). Among them, 18 (S11R and D25KRQ, and N5G, T8KR, K10R, S11G, 13–14insIL/IG/VG, P16RQ, Q18KR, T19AV, F20ILV, A22TV, T23A, T23HK, G24DE, G24KR, I26V, and Q32KR) had a prevalence ≥10% (ranging from 10.9% to 91.1%) in CXCR4-using viruses, suggesting (and mimicking the trend observed in B-subtype) that within the V3 region many more mutations are associated with CXCR4 usage (Fig. 1). In fact, in a study extended to the regions flanking V3 that used samples with experimentally determined phenotype, mutations at 23 positions within V3 were significantly associated with HIV-1 B-subtype X4 viruses, as were 13 positions in V2 and 2 in C4 (Thielen et al., 2009).
A detailed analysis of the classical V3 positions 11 and 25 showed that the wild-type amino acids at positions 11 and 25 (S11S and D25D) were significantly associated with CCR5 usage (P = 6.77E−10; ϕ = 0.41), while the S11GR and D25KRQ mutations were significantly associated with CXCR4 usage (P = 3.36E−4; ϕ = 0.31) (Fig. 1). Among the other mutations found at V3 position 25 of HIV-1 C-subtype, the prevalence of E (wild-type for B-subtype) was higher in CCR5-using than in CXCR4-using viruses (15.7% and 6.9%, respectively), although the difference did not reach statistical significance (P = 0.071). Conversely, the mutations K, N, P, Q, R, T and V at position 25 were mainly found in CXCR4-using viruses (1.2% in CCR5 versus 56.4% in CXCR4). Only the mutations A, G and S at position 25 had a similar prevalence in CCR5- and CXCR4-using viruses.
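For comparison with the position-specific findings above, the classical "11/25" rule can be written as a short check. In its usual formulation, a positively charged residue (K or R) at V3 position 11 and/or 25 predicts CXCR4 usage. The sketch assumes an ungapped, 35-residue V3 loop so that string indices map directly onto V3 positions. Loops carrying the 13–14 insertion would need to be aligned first, which is outside the scope of this example. The function name and the example sequences are illustrative assumptions.

```python
BASIC_RESIDUES = {"K", "R"}


def rule_11_25(v3: str) -> str:
    """Apply the classical 11/25 rule to an ungapped 35-residue V3 loop."""
    if len(v3) != 35:
        raise ValueError("expected a 35-residue V3 loop without insertions or deletions")
    v3 = v3.upper()
    # Positions 11 and 25 are indices 10 and 24 in 0-based Python indexing
    if v3[10] in BASIC_RESIDUES or v3[24] in BASIC_RESIDUES:
        return "CXCR4"
    return "CCR5"


# Assumed wild-type-like subtype-C V3 loop and two hypothetical variants.
V3_EXAMPLE = "CTRPNNNTRKSIRIGPGQTFYATGDIIGDIRQAHC"
V3_S11R = "CTRPNNNTRKRIRIGPGQTFYATGDIIGDIRQAHC"
V3_D25K = "CTRPNNNTRKSIRIGPGQTFYATGKIIGDIRQAHC"

predictions = [rule_11_25(s) for s in (V3_EXAMPLE, V3_S11R, V3_D25K)]
```

Note that the classical rule only considers K and R, so changes such as S11G or D25Q, which the analysis above associates with CXCR4 usage in subtype C, would not be flagged by this check.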
The analysis of position 11 showed the complete absence of lysine at this position in HIV-1 C-subtype (while S11K is common in HIV-1 B-subtype CXCR4-using viruses) and the presence of glycine. This glycine was completely absent from all V3 sequences from CCR5-using viruses, while it was observed in 12.8% of CXCR4-using viruses (P = 4.74E−8) (Fig. 1). When position 11 was mutated (47.5%), the corresponding virus was always CXCR4-using.
We also analyzed the V3 region encompassing amino acids 5–8, including the N-linked glycosylation site (N6XT8). Mutations at position 7 have been shown to abrogate binding to the CCR5 co-receptor (Huang et al., 2007), while the loss of the glycosylation site has been associated with CXCR4 usage in both B- and C-subtypes (Back et al., 1994; Li et al., 2001; Losman et al., 1999; Malenbaum et al., 2000; McCaffrey et al., 2004). In our dataset, the N7K mutation was found only in CXCR4-using viruses (prevalence 7.9% in CXCR4-using versus 0% in CCR5-using viruses; P = 5.48E−6) (Fig. 1). This suggests that N7K can be a CXCR4-related marker also in C-subtype. In addition, the loss of the N-linked glycosylation site was observed in 1.6% of CCR5- and 20.8% of CXCR4-using viruses (P < 0.001) (Table 1) (Nabatov et al., 2004; Polzer et al., 2002).

Table 1
V3 and gp41 chemico-physical properties of CCR5- and CXCR4-tropic viruses.

Property                                                      CCR5-using viruses, N = 255   CXCR4-using viruses, N = 101   P-value b
Mean average hydrophilicity a                                 0.06                          0.12                           <0.001
Mean net charge at pH 7.0 a                                   2.85                          5.32                           <0.001
Number of V3 sequences without N-linked glycosylation sites   4 (1.6%)                      21 (20.8%)                     <0.001

Property                                                      CCR5-using viruses, N = 255   CXCR4-using viruses, N = 57    P-value b
Number of gp41 N-linked glycosylation sites (=3)              7 (2.7%)                      4 (7.0%)                       >0.05
Number of gp41 N-linked glycosylation sites (=4)              180 (70.6%)                   36 (63.1%)                     >0.05
Number of gp41 N-linked glycosylation sites (≥5)              68 (26.7%)                    17 (29.8%)                     >0.05

a The mean hydrophilicity and the mean net charge were calculated by using Innovagen’s Peptide Property Calculator (http://www.innovagen.se).
b P values were calculated by using the Mann–Whitney U test (for continuous variables) and the χ2 test (for categorical variables).
Considering the physical and chemical properties of CCR5- versus CXCR4-using viruses (Table 1), the net charge of CCR5-using viruses (mean 2.85, median 3.00, IQR 2.00–3.00) was significantly lower than that observed in CXCR4-using viruses (mean 5.32, median 5.10, IQR 4.00–6.09) (P < 0.001, Mann–Whitney U test), as expected and already known for the group M subtypes (Clevestig et al., 2006). This was due to the presence of increased numbers of K and R residues scattered throughout the V3 region of CXCR4-using viruses, including positions 11 and 25. Moreover, we observed an increase in V3 hydrophilicity in CXCR4-using viruses compared to CCR5-using viruses in both the C-subtype (median 0.07 for V3 sequences from CCR5-using viruses and median 0.13 for V3 sequences from CXCR4-using viruses [P < 0.001, Mann–Whitney U test]) and the B-subtype (median 0.03 for V3 sequences from CCR5-using viruses and median 0.13 for V3 sequences from CXCR4-using viruses [P < 0.001, Mann–Whitney U test]). The increased hydrophilicity of V3 sequences from CXCR4-using viruses (for both B- and C-subtypes) can be one of the potential factors affecting the drift from CCR5 to CXCR4 tropism in HIV-1 C-subtype (Choge et al., 2006; Cilliers et al., 2003; McCormack et al., 2002; Ndung’u et al., 2006). This could (at least in part) explain the tropism changes observed in HIV-1 C-subtype infected patients who progressed to AIDS during the pre-highly active antiretroviral therapy (HAART) era (Connor et al., 1997). All these results are consistent with previously published papers showing correlations of increased hydrophilicity and net charge with syncytium-inducing ability and CXCR4 usage (Fouchier et al., 1992; Wang et al., 1998). These two parameters can be markers of tropism changes acting on the secondary structure of the V3 loop.
Another V3 region critical in modulating HIV-1 co-receptor usage is the GPGQ crown (at positions 15–18) (Coetzer et al., 2006; Lin et al., 2011). This motif forms a β-turn, and specific amino acid changes have been shown to be critical determinants of co-receptor usage (Cormier and Dragic, 2002; Hartley et al., 2005; Hu et al., 2000; Pollakis et al., 2004; Shimizu et al., 1999; Suphaphiphat et al., 2003). In this region, the wild-type amino acid at V3 position 18 is an R in B-subtype and a Q in C-subtype. In B-subtype, position 18 was not mutated in 90.4% of the 114 CXCR4-using viruses (one sequence per patient), suggesting and sustaining a functional role for the wild-type arginine (Resch et al., 2001), possibly predisposing B-subtype viruses to use the CXCR4 co-receptor. In C-subtype, Q18R was a frequent mutation in CXCR4-using viruses (24.7%), followed by Q18H (6.9%). Accordingly, the switch to CXCR4 usage may require the acquisition of Q18RH in order to increase the V3 net charge and/or to alter the V3 conformation (Hartley et al., 2005).

Fig. 1. Frequencies of HIV-1 gp120 V3 amino acid changes. Frequencies of V3 signatures in HIV-1 CCR5-tropic isolates with FPR ≥80% by Geno2Pheno-algorithm prediction (dark gray) and HIV-1 CXCR4-tropic isolates with FPR ≤5% by Geno2Pheno-algorithm prediction (light gray). The analysis was performed in sequences derived from 356 patients, 255 reported as CCR5-tropic and 101 reported as CXCR4-tropic at genotypic test. The co-receptor usage of the sequences was confirmed using the Fortinbras PSSM algorithm and the combination of criteria from the net charge and “11/25” rules. Statistically significant differences were assessed by chi-square tests of independence. P values were significant at a false-discovery rate of 0.05 following correction for multiple tests. *P < 0.05, **P ≤ 0.01, ***P ≤ 0.001. The codons with a black dot (22/29) were significant and confirmed also using a dataset of V3 sequences with phenotypic tropism determination (423 CCR5- and 48 CXCR4-using viruses, respectively).
In addition, V3 position 18 (along with position 20) resides in a domain shown to be involved in binding to two specific glycosphingolipids (GSLs): galactosylceramide and sphingomyelin. This binding has been shown to mediate the attachment of HIV-1 to plasma membrane microdomains (rafts) (Fantini et al., 2002; Rawat et al., 2005; Hammache et al., 1998). Several works suggest that GSLs are involved in the entry of a broad range of HIV-1 isolates into cell lines expressing CD4, CCR5 and/or CXCR4, and that GSL depletion blocked subsequent viral fusion and infection (Hug et al., 2000; Puri et al., 1998). Hence, mutations at V3 positions 18 and 20 may have an impact on the ability of HIV-1 to recognize these membrane microdomains.
Additionally, 29.7% of HIV-1 C-subtype CXCR4-using viruses had an insertion of 2 amino acids between V3 positions 13 and 14 (Fig. 1). This signature has been observed in other analyses of C-subtype CXCR4-tropic viruses (Cilliers et al., 2003; Coetzer et al., 2006; Raymond et al., 2010; Singh et al., 2009). Recently, Zhang et al. (2010) showed that removal of this insertion abolished CXCR4 utilization by dual-tropic viruses, indicating its critical role in modulating binding to the CXCR4 co-receptor. By contrast, this insertion was never found in CCR5-using viruses (Fig. 1).
The high variability of the V3 loop is not surprising, since positive selection has been implicated in the maintenance of such diversity (at the individual as well as the population level). It is likely that the principal driving force in the evolution of the HIV-1 gp120 V3 region is cell receptor usage, escape from the host immune response, or a combination of the two (Leal et al., 2007; Lemey et al., 2007; Ross and Rodrigo, 2002; Shankarappa et al., 1999; Williamson, 2003; Yang et al., 2003).
Additionally, we selected from the Los Alamos Database a new set of 471 HIV-1 C-subtype V3-containing sequences (one sequence per patient) with a phenotypic characterization of HIV-1 tropism (423 CCR5- and 48 CXCR4-using viruses, respectively) in order to confirm the correlation of V3 mutations with different co-receptor usage. Using this “phenotypic” dataset, despite the low number of CXCR4 sequences available, the majority (22/29; 76%) of the V3 signatures identified using genotypic tropism prediction were confirmed (Fig. 1). Of note, in order to assess the reliability of genotypic tropism testing in HIV-1 C-subtype, we applied the Geno2Pheno and PSSM algorithms to predict the co-receptor usage of the 471 V3 sequences with phenotypically determined viral tropism. The Geno2Pheno and PSSM algorithms were 96.2% and 87.5% concordant with the phenotypic determination of viral tropism and showed a sensitivity of 87.5% and 87.7%, and a specificity of 95.2% and 93.8%, respectively. These results support genotypic tropism testing as a valuable tool to predict co-receptor usage in HIV-1 C-subtype and are in line with other studies (Raymond et al., 2010).
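The performance measures quoted above follow from a 2×2 comparison of genotypic prediction against phenotypic determination. The sketch below shows how they are defined, assuming CXCR4 usage is treated as the "positive" class, as is customary in tropism testing. The function name is hypothetical and the counts in the example are invented for illustration. They match the group sizes (48 and 423) but are not the study's confusion matrix.

```python
from typing import Dict


def tropism_test_performance(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and concordance of a genotypic tropism test.

    tp / fn: phenotypically CXCR4-using viruses predicted as CXCR4 / CCR5.
    tn / fp: phenotypically CCR5-using viruses predicted as CCR5 / CXCR4.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, for illustration only.
performance = tropism_test_performance(tp=42, fn=6, tn=400, fp=23)
```

With strongly unbalanced groups such as these, concordance is dominated by the larger CCR5 group, so sensitivity and specificity are worth reporting alongside it.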
3.2. Prevalence of gp41 amino acid changes
Among the 312 env sequences containing the V3 and gp41 encoding regions, both the Geno2Pheno and PSSM algorithms predicted 57 CXCR4-using and 255 CCR5-using viruses. By analyzing these C-subtype gp41 sequences, we found 63 out of 353 gp41