Nucleic Acids Research, Vol. 19, No. 20 5661 -5667
Members of the zinc finger protein
gene
family
sharing
a
conserved
N-terminal module
Margherita Rosati, Maria
Marino,
Annamaria
Franze,
Anna
Tramontanol
and
Giovanna Grimaldi*
International Institute of
Genetics
and
Biophysics, Via Marconi 12,
Naples
80125
and
1lstituto
Ricerche
Biologia
Molecolare, Via Pontina Km 30.600, Pomezia,
Rome, Italy
Received July
3,
1991; Revised and AcceptedSeptember
20,
1991ABSTRACT
We report the isolation of human members of a
sub-family
of structurallyrelated
finger protein genes. These potentially encode polypeptides containingfinger
motifs of theKruppel
type at theC-terminus,
anda
conserved
amino acid module at the N-terminus;because of
its invariantlocation
the latter isreferred
to as
finger
precedingbox (FPB).
The FPB, detectedalso
inpreviously described finger proteins
fromhuman,
mouse
and
Xenopus,
extends
over
approximately
65 amino acidsand
appears tobe
composed of
twocontiguous
modules:FPB-A
(residues
1 -42) and FPB-B
(residues
43-65). The latter isabsent
in someofthe members analyzed. Elements
A and B and the zincfinger
domain are encodedby
separate exons in
the
ZNF2 gene, a human member of thissub-family.
The positioning of introns within this gene is remarkable.One intron flanks
and asecond
interrupts the first
codon of the FPB-A and FPB-Bmodules, respectively.
A third intron occurs a fewnucleotides downstream
of FPB-B marking itsseparation from the remainder of the
codingsequences.
This
organization, together
withthe
absence of
FPB-B in somecDNAs,
supportsthe
hypothesis that
mRNAsencoding
polypeptides
thatinclude one, both or none of the FPB-A
and
FPB-Bmodules
maybe assembledthrough alternative
splicing
pathways. Northern analyses showed that members
of thissub-family
areexpressed
asmultiple
transcripts
in severalcell lines. The
sequencesof distinct
cDNAshomologous
tothe ZNF2 gene indicate that alternativesplicing
eventsadjoin
either
coding
or noncoding
exons to the FPB sequences.
INTRODUCTION
Zincfingers,amino acid stretches
featuring regularly
spaced pairs
of cysteines and histidines or
differently
arranged
cysteines
organized around zinc(II) ions
(1-3),
are consideredmajor
structural motifs involved in
protein-nucleic
acid interaction (4-7). Finger motifs of theC2H2type,originally
describedasEMBL
accession
nos X60152-X60156 (incl.)the potential functional units that mediate the binding of the TFIIIA factor (8, 9) to the Xenopus 5S RNA gene promoter, have been subsequentlyfound in a variety of known or putative DNA binding regulatory proteins. Transcription factors from differentspecies, suchasyeastADRI andhuman SpI (10, 11), as well asproteins encoded by genes playing a pivotal role in Drosophiladevelopment (12, 13) contain arrays of these motifs. The evolutionaryconservationof a linker stretch (the H/C link; 14), first identified in the Kruppel segmentation gene from Drosophila (12), whichconnectsadjacentzinc fingershasbeen exploitedtocloneanumber offinger-encodinggenesfrom several species by cross-hybridization (14-26). Additional members of this large familywereisolated because oftheir properties.Murine Krox-20 (27) and Krox-24 (28), as well as their human counterpartsEGR1 (29)and EGR2(30),areresponsivetoearly growth stimuli; GLI (31), Evi-1 (32) and XT33 (33) are rearranged in tumor cells.
Thesequencesofseveralpotential zincfinger polypeptideshave been reportedto date. Additionalcommon elements outsideof the fingerrepeatswere found incDNA clones fromXenopus. Inthat instancecomparativeanalysisofthepotentially encoded ORFs ledtotheidentificationofseveral discrete motifs shared by the products of distinct finger genes (FAX boxes; 34). Conserved modules in regulatory proteins specify properties relevanttotheiraction; theoccurrenceofsimilar motifs indistinct genes maythereforerepresentthefirstindicationoffunctionally relatedprotein subsets.
Here we reportthe characterization of human members of a sub-familyofzincfingergenesencoding polypeptides that feature
a highly conserved module located at their N-terminus.
MATERIALS
ANDMETHODS
Nomenclature of zinc
finger
recombinant clonesClonesfrom ourlaboratory representativeof humanfingergenes which have beenassignedaZNFsymbolfrom the Human Gene MappingCommittee(35)areindicatedaccordingly;thosewhich donotfall in this categoryarenamedwithaHZF
(human
zinc finger) symbol. In either cases the numbers (X.Y)identify
different (Y)
homologous
(X) clones. Theorigin
of clones is indicated (c, cDNA; g, genomic).*Towhom correspondence should be addressed
5662 NucleicAcids Research, Vol. 19, No. 20
LibraryScreenings, SouthernBlotting, DNASequencing and Sequence Analysis
cZNF2.2 wasisolateduponscreeningofaXgtlOcDNAlibrary from undifferentiatedhuman teratocarcinoma NTera2-Dl cells (36) using a DNA segment derived from clone cZNF2.1 (see legendtoFigure 1). The latterwasisolatedfromacDNAlibrary in X T7-T3 E-H (37), prepared from human liver poly(A)+ RNA asdescribedby Huynh etal. (38), probedwitha500bp longEco RIfragment from the murine clone mKrl
(17)
under reduced stringency (hybridization in 6xSSC, lOx Denhardt's solution,0.5%
SDSat58°C;
washes in2xSSC,
0.5% SDSat50°C). A collection of zinc
finger
recombinants was obtained upon isolation ofcDNA clones from theXgtlO
cDNA library mentionedaboveoranequivalent
oneinXgtl
1(both
were akind gift ofJ. Skowronski and M.F.Singer), using
asprobe
a 700 bpEcoRI-HindIllfragment
from the humanfinger
geneZNF41 (26) under the conditions described above. ClonescHZF3.1,
cZNF41.1 and cHZF7.1 were sorted out from this collection using as a probe aDNA segment from clonecZNF2.2
(probe
B, Figure IA). The conditions used were:
hybridization
in 6xSSC, 10x Denhardt'ssolution,
0.5% SDSat58°C; sequential
washes in 6xSSC,0.5% SDS fromroomtemperatureto450C. Recombinants including the ZNF2 gene
(gZNF2
clones)
wereisolated fromahuman
genomic
library
inpcos2EMBL
(39) using
a DNA segment fromcZNF2.2
(probe A, Figure
IA).
All probes werelabelled to a
specific activity
of 2-5x108 cpm/mg byrandom-hexanucleotide-priming
aspreviously
described
(26).
DNApreparations, gel electrophoresis
and Southernblotting
wereperformed
according
to standard procedures(40). DNAfragments of interestweresubcloned into BlueScript (Stratagene). DNA sequences weredetermined,
by
the dideoxy chain-termination
method,
from double stranded plasmids,phage
and cosmid DNAsaccording
to described procedures (41),using
theSequenase
kit from United States Biochemical(Cleveland,
Ohio).
NBRF
protein
sequencedata basesearchwasperformed using
theprogram FASTA of the GCGsequence
analysis
package (42);
thesequenceusedas a
probe
corresponded
toaminoacids 15-79 of the cZNF2.2 ORF. Aminoacid sequence similarities wereidentifiedbypairwise
comparisons
using
the ISSC(Interactive
SensitiveSequence
Comparison;
43)
program. The COLLATE package ofsecondarystructureprediction
methods(44-46)
waskindlyprovided by Chris Sander (EMBL, Heidelberg). Cell
Cultures,
RNAisolation and Northernanalysis
The human cell lines used are: NTera2-Dl (teratocarcinoma); HepG2(hepatoblastoma);HeLa(cervicalcarcinoma); MOLT-4 (T-cell lymphoma); HL-60 (acute promyelocytic leukemia); Jurkat (B-cell lymphoma); HT-1080 (fibrosarcoma); A1251 (kidney carcinoma). HepG2, HeLa, HT-1080, A1251 and NTera2-D1 cellswerecultured inDMEM, 10% fetal calf serum; MOLT-4, HL-60 and Jurkat cells were grown in RPMI 1640, 10% fetal calfserum.
Five
/g
ofpoly(A)+ RNA, prepared from total RNA (26) were electrophoresed on 1% agarose-formaldehyde gels and transferred to Nytran membranes. Filters were hybridized to32P-labeled
DNA fragments derived from clones cZNF2.2, cZNF41.1, cHZF3.1, cHZF7.1 at 42°C in 50% formamide, 5xSSC, 5xDenhardt's,0.5
% SDS and washed in0.1
xSSC,0.2% SDS at 680C.
RESULTS
A second conserved protein module ispresentin several cDNA clones potentially encoding finger polypeptides
InFigure lA part of the nucleotide sequence of a 3.923 bp long cDNAclone (cZNF2.2),homologoustothehuman ZNF2 (Al.5) finger gene (47), is reported. cZNF2.2 was isolated from a cDNA libraryfrom undifferentiatedhuman teratocarcinoma NTera2-D1 cells, using as probe a DNA segment from apreviously isolated, shortercDNA clone (cZNF2. 1; see Materialsand Methods and below). Conceptual translation of the cZNF2.2 sequence predictedalong ORF (428 residues) which includes 9 finger units precededby 170 aminoacids.The finger motifs exhibit features ofKruppel-likegeneproducts, i.e. theyconform tothe general pattern
CX2CX3FX5LX2HX3H
and are precededby a conserved sequence of seven amino acids, the H/C link (consensus TGEKPYe; 14)The firstinframeATG withinthecZNF2.2 ORF is found at nucleotides 523-525.Visual inspection of sequences revealed thatregions similar to amino acids 16-56and48-76 ofthecZNF2.2 ORFwere present in the murine Zfpl (19) and humanHPF9 (23) finger proteins, respectively. This finding suggested thata subset of these proteins may share amino acids similarities outside the finger domain.ADNAfragment spanning theregion thatencodes amino acids 16-76 ofcZNF2.2(probeBinFigure
lA)
wasusedtoprobe theDNAof65cDNAclonespreviously isolated from the same cDNAlibraryonthe basis of cross-hybridizationtoan H/C link containing probe(seeMaterialsandMethods). 15 clones hybridizedto probe B;ofthese, 10 wereanalyzed furtherand 9 shownby Southern analysisto contain differentinserts (data
notshown). The nucleotidesequenceofthe inserts of three clones (cZNF41.1, cHZF3. 1, cHZF7.1) wasdetermined. Conceptual translation revealedthatanapproximately 65 amino acids long stretch, whichspansthe region ofsimilarity ofcZNF2.2 to
Zfpl
andHPF9 (shaded residues in Figure 1), was shared amongst cZNF2.2, cZNF41.1andcHZF7.1;in cHZF3.1 the relatedness
to the others was restrictedto the first 42 amino acids of this block.
A sequence data base search (see Materials and Methods) identified significant homologies alsoin severalother previously describedKruppel-like finger proteins. These includedthe human ZNF7(24), ZNF8 (24), Koxl (25) and ZFP36 (48) genes and Xfin (15) isolated from Xenopus. Sequences that exhibit similarities to the region mentioned above are reported in Figure2. The region sharedby all sequences is within the N-terminal portion ofthepotential polypeptidesand is separated fromthefinger domainbyamino acidstretches ofvariable length.
In some ofthese, a few similarities were found (see legend to Figure 2). The motif shared by all is referred to as finger precedingbox (FPB). Amino acids similarities within the FPB encompassastretch of 65residues in five out of eleven sequences (cZNF2.2throughtoKoxl); homology isrestricted to the first 42residues in three (cHZF3. 1, Zfpl and ZFP36). On this basis, the FPB region may be viewed as composed of two modules which include residues 1-42 (FPB-A) and residues 43-65 (FPB-B). HPF9 and ZNF8 are truncated cDNA moieties (23, 24) which do not extend over the entire FPB-A region; in
Xfin
theputativeinitiator methionine lies within FPB-A. Amino acid identities between each member and the derived consensus sequence range from45% to 69% in both A and B modules.
into account that different amino acids with similar physico-chemical characteristics (49) occur at several positions.
Ithas to be noticed that in the cZNF2.2 ORF the first in frame ATG is in the FPBmodule (nucleotides 523 -525 in Figure lA). Analysis of a distinctcDNA indicates that analternativesplicing eventplaces an additional coding exoncarryingan in frame ATG just upstream of the FPB-A module (see below and Discussion). Expression of FPB-containing sequences in human cell lines In Figure 3A are shown the results of Northern blot analyses with poly(A)+ RNA from undifferentiated human teratocarcinomaNTera2-D1 cellsprobed at high stringency with
A
DNAfragments including the FPB regions of clones cZNF2.2, cZNF41. 1, cHZF7.1 and cHZF3.1. The probes detected either apredominant(cHZF3. 1; lane 4) or multiple transcripts ofsimilar abundance (cZNF2.2, cZNF41.1, cHZF7.1; lanes 1-3). RNA sizesranged from approximately 7.000 (top band with cHZF7. 1; lane 3) to 2.600 nucleotides (bottom band with cZNF2.2; lane 1). Transcripts of similar lengths were detected with the cZNF2.2, cHZF3. 1 and cHZF7. 1 FPB-containing probes also in poly(A)+ RNA preparations from several other cell lines (Figure 3B), indicating that thesegenes are expressed in various cells. Similar results were obtained with the cZNF41.1 FPB-probe (not shown). This gene was previously shown to be expressed in several cell lines (26).
Exona _ GGAGGG -- IRCHONA T-TCAG
361 ,_.. __
R G A V F P G P E H S V P E .5 F..D VA V:8V F H: 4 Exonb _ GTAGCG --INTRONB-- TAGCAG
481
4 onc _ GTAACTG --INIRCN C-- GTlCCAG
601 G S -__ G:::::Dl:::X::::: :i:V:: L H G S E E R E W P E S V S L D W E T K P E I H D A S D K K S E 4- Exond _ 721 G S L R E C L G R Q S P L C P K F E V H T P N G R M G T E K Q S P S G E T R K K 841 CCTCCCICA S L S R D K G L R R R S A L S R E I L T K E R H Q E C S D C G K T F F D H S S L 961 T R H Q R T HI(T G E K P Y D C R E C G K A F S H R S S L S R H L M S H][T G E S P 1081 T AG 1 C Y E C S V C S K A F F D R S S L T V H Q R I HH[T G E K P F Q C N E C G K A F F D 1201 R S S L T R H Q R I HJ(T G E S P Y E C H Q C G K A P S Q K S I L T R H Q L I HIlT 1321 G R K P Y E C N E C G K A F Y G V S S L N R H Q K A HJ[A G D P R Y Q C N E C G K 1441O A FP D R S S L T Q H Q K I H](T G D K P Y E C S E C G K A F S Q R C R L T RH Q 1561 R V HJ[T G E K P F E C T V C G K V F S S K S S V I Q H Q R R YJA K Q G D I B X EE E I II probeA * B 1-lS1o p * C B GTMGG -IHRNCN A -*R G A V F P G P E H S V P E
361 CGCAC_--I---G-
-C--C-G-A-C-A--G---G-G---I---G-C---...I... CrAAGGACAGGAGGAcATaGGclCIT
SAL V H KE s T Q E R MA A V
GTGAMG -GIIRICN A' - TCATTAG 4 a 1TrICAG
430---ICGACCACCAGATAGh4' 'hq... S P T T R C Q I
alGGT -I8NI A - 7TCAG
::VL:: : : :L:: L H G S E E R E
529 GA A C ....vA A C G All yllllll Gn lT lC llh> A F
... ..
Figure 1. Nucleotide and deduced aminoacid sequence ofalternative cDNAsderived from the ZNF2gene and intron-exon organizationof thegene. A) Part of the nucleotide sequence of thecZNF2.2 insertand the amino acid sequence of thehypothetical zinc fingers containingORFareshown. Numbersontheleftrefer
tonucleotidepositions (total length3.923bp). Asterisksdenotein frametranslation terminationcodons. Thefirstin frame ATG codonis underlined.Brackets delimit the 9finger motifs. Thepositionof the threeintrons (A, BandC)andthesequences presentatthejunctionswith the fourexons(a,b,c,d) in thecorresponding
ZNF2 genearereported.Shading highlightsthe amino acid stretch also found in otherpotential fingerproteins (see text).Aschematicrepresentationof the cZNF2.2
cDNAinsertisdrawnatthebottom. The ORF is shown as abox. Finger repeatsaredelimitedbybars and shaded. Restrictionsites: B, BamiHI; E, EcoRI; X,
XbaI.Thepositionofprobes A,Band C used forlibraryscreeningsand Northernanalysesis also shown. ProbeCcorrespondstothe DNAsegment derivedfrom cZNF2. 1 utilized to isolate cZNF2.2(seetextandFigure lB). B)Partof the nucleotide sequence of cZNF2.1(bottomline;totallength824bp)isalignedtothat ofcZNF2.2 (topline). Hyphensdenote nucleotideidentity.Conceptual translationsare reportedabove and beloweachsequence, respectively. Thefirst in frame ATGfoundineachisunderlined. Numbersonthe leftrefertocZNF2.2 nucleotides asinFigure IA;dashesmarkthemissingexona'. Thepositionandsequence termini of intronsA, A', A"arereported. Thetriangledenotes thepositionof intron B. ThecZNF2.2 insert extends uptonucleotide 1147 of cZNF2.2.
5664 Nucleic Acids
Research,
Vol.19,
No. 20 cZNF2.2 cZNF41.1 cHZF7.1 ZNF7 Ko1x cHZF3.1 Zfpl ZFP36 R G A V F P G P E H S VPe MAANGDSP PWS PALAA8 GRGSSCEHPILSVIDLAKTSLHSFGABAEETHSRTDLSLADHLVQQQKMITMLQ GGGALSPQHSAVTQGSI I KNKEGIMDAKSLTAWSR METQADLVSQEPQALLDSALPSKVPAFSDKDSLGDHMLAAALLKAKSQ M4MG S Q IPRERRPGRSVSASVALGPTLAFKBDTRTPGSWBM 1 15 20 30 40 50 60
Cons. saVt Ve a:'Vd1pTx Wx4 dPx rXL'.rdV: L: Y1 XVc XcKPdvIxxLEq teEP.i4xc x x
..Z
N...
.. ......R
.. ...-.L...
S. ...N~~~~~~~~~~R HI.TIERAAN.KC.I-M L
cZN V H SG R Q IA(C LV1AGOSSAtOFSFLEQSRL cHZP71 T L5 KPS F DR KN4 T1 A::": QI K: KCAN PV QPLN1T DV L R K L
cNP DR NLFVSAG ESI KT CL D SMN K
Xfin MSV P N:"K M KG:S V[ L P RO VS M :K:Y;':L:V SK N
CHZ F32. 1 IR K 5- ASDILNTVH--C--P-D-SS-T-J--H-T-EKP-D--KA-S-S-SL
O-I-f-cZNP412K2L KER P E TEKKIP LTKILEKGHD CNLNGNRRL TCSNFLEGVKHITPPK NGNG
OHZNP7.1 NE LP S GYAGKAMEQDQPQVGDIG G I FP FLHC KF TI TE DW ELKCYWDQDKS E QLE R
GO RP LIDTSP KPSVW CKVDGNMMWHIGQKD1 DKSPLKICIKPRIGSHLEVCDH FGKN
31 R HP E T A5 E K S SCH LO(WPYIO S(IQ ESVHE KP INCR SQLD KIYQE IN cZNF R HS P AW S QKA G VYTSGKRHE H FSPHP PIDT ERYVK T LE4PTC5LPQI~KSSHCL 4VAKL
Xfin7 G Q IWLASD4AIElKI NSCLQKR AQMH KGEK VLCASCGKVPSLCAASQKLN3NVEDSGIH A K S H INfp GE PW KA GQ VENR) DCP RD Q TT LI KN QP P KTYKCTIRGKSPNIRHKIFHTL N RTD F V SALRSQLVQPYNERDLIH KT
ZFP36 E M W EDSLSKGDQC QKTRN RSDD(ANLIKTHTC FG DSIV NT PR CDS GEVV
cZNP4 1.1 VDVNPSVYLIKBRGYEHKNIEKII V T L P I R HN D I K T N H H R S
o~~CHF71 K H HdNLD L L LacdKF F K R NR gsoEo
.VG1RandKiDiV E K T R eNZN V eHatN L nGDPH eSFNT mI orI oYBN B( AeS aSas,IseHlsueH
oS SdLsEN C gGwiRKKGhYG C G F eRre R ,IK
IR D T HK CE QE 1Gat tA fH SFQT Pi LRR m HH DGYK
Alternativesplicing and intron-exon junctions ofthe ZNF2
gene
In Figure lB part of the nucleotide and predicted amino acid
sequences of the insert ofcZNF2.1 (824 bp), a cDNA clone
isolatedfromahuman liverlibrary(seeMaterialsand Methods), arealignedtothose of cZNF2.2. Alignmentstartsatnucleotide 397 of thecZNF2.2insert and endsatposition 1147withinthe
fingerregion. Thetwosequencesarecolinearexcept for 72 bp.
This results in distinct ORFs which differ in the absence
(cZNF2.2) orpresence (cZNF2.1)of24 amino acidsupstream
ofFPB-A (Figure IB).The additionalDNAsegmentin cZNF2. 1 notonlyextends the predictedORF butalso brings inaninframe ATG codon (Figure iB). Two bands of hybridization approximately4.200 and 2.600nucleotides in lengtharedetected
by Northern analysis with probes including the homologous regions of the two cDNAs (probe B, Figure 3; probe C, data
notshown). Because of itslength, thecZNF2.2 insert (over 3.9 kbp) likely derives from the longer transcript. The cZNF2.1 cDNA moiety (824 bp) mayhave originatedfrom eitherRNA, as a difference in length of 72 bp may not result in distinct mobilities of RNAs in the range of 4.000 nucleotides.
Clonescontainingthe ZNF2gene wereisolated fromacosmid libraryof human genomicDNA (Materialsand Methods). Two
overlapping cosmids (gZNF2. 1 and gZNF2.2) were shown by Southern analysis to span the entire cZNF2.2 insert (data not shown). Intron-exon junctions were determined by priming sequencing reactions on the genomic DNA contained in gZNF2. 1 and gZNF2.2 with a battery of oligonucleotides complementary to either the sense or anti-sense strand of cZNF2.2. The positions of the introns (A, B, C) as well as the sequences of their termini are reported in Figure 1A; their positions are also marked in Figure 2. The cZNF2.2 cDNA moiety derives from the joining of four exons (a through to d). Exon d contains the 3' untranslated region (nucleotides 1672 -3923; not shown) as a large portion of the potential ORF including all of the finger domain and the preceding 75 amino acid residues (Figure
1A).
Exon c encodes the entire FPB-B module and the following 15 amino acids. The junction of intron B with exon c breaks the codon for the glycine residue which marks the proposedboundary
between FPB-A and FPB-B (FiguresIA
and 2). Exon b includes the FPB-A coding sequences. Exon a extends to the first nucleotide of the DNA insert.The additional 72 bp found in cZNF2.1 may have been introduced through alternative splicing. This segment is indeed flanked in the ZNF2 gene by splice sites (Figure
iB);
the spliced out portions of intron A are marked as A' and A", while the 72 bp long coding exon is referred to as a'.Nucleic Acids Research, Vol. 19, No. 20 5665 2 3 4 B 2 3 4 5 6 7 8 28 5-..-O w .8~~~~. :: I>~-~ DKQRRRGO TSTSTITT
SQOQKKEA
VDDHDYDN W RHLCLROL 28 s-18 28 5-.;..i.. _ , r ^'::: 18 5-B-ACT IN - _ _ _ MTMMMMMMMM RRKRQRRRRK /1KWKRRRKRMR
5 ,LLLLLLTLLQ DRNAI IEDIS EDDENDSEEN L.* ..Figure3.ExpressionofFPBcontaining fingergenesin humancell lines. Five itgofpoly(A)+ RNA fromhumancell lineswereprocessedasdescribed under
Materialsand Methods forNorthern analysis. (A) Poly(A)+from Ntera2-D1
cellshybridizedtoprobes includingtheFPB-regionsof: 1) cZNF2.2; 2) cZNF41. 1; 3)cHZF7.1; 4)cHZF3. 1. (B) Poly(A)+RNAs from: 1)NTera2-D1, 2) HeLa, 3) HepG2, 4) HT-1080, 5) A1251, 6) MOLT-4, 7) Jurkat, 8) HL-60.Fromtop tobottom the filterwassequentially hybridizedtothesameprobesfromcZNF2.2,
cHZF3.1, cHZF7.1asinpanelA.The filterwasrehybridizedtoa,B-actin probe.
Thepositions ofmigrationof 28S and 18SrRNAsareindicated.
DISCUSSION
Fingerproteins relatedtotheDrosophilaKruppelgeneproduct are encodedby large multigene families in higher eukaryotes.
Inhumans, the number of family memberswasestimatedtobe
overthree hundred(23). The results reported here helptodefine
astructurally related subset ofgenes which potentially encode
polypeptides that share, in addition to finger motifs, a highly
conserved amino acidmodule whichencompassesapproximately
65 residues. Thisregion, referredtoasFPB, invariably precedes
thefingerrepeatsfromwhich isseparated by amino acid stretches
that vary in length and do not share extensive sequence
homologies (Figure 2). Additional short similar motifs found within someofthese linkersequences (Figure 2)may underlie
distinct degrees ofrelatedness among family members. FPBs
comeindiscretedifferentlengthsasonly thefirst42 amino acids are present in some instances (Figure 2). This size variation
indicates thatthe FPBiscomposedoftwomodules,FPB-Aand
FPB-B (amino acids 1-42 and 43-65; Figure 2). This
distinction, based on sequence comparisons, finds a physical
correspondanceintheseparationofthe FPB-A and FPB-Bcoding
sequences in two exons within the ZNF2 gene. A notable
scatteringofinterveningsequencesoccursin factwithin theregion spanningthe FPB module in ZNF2. Threeintrons separatethe
four exons found in a cDNA representative of this gene
(cZNF2.2; Figure IA).The A andBinterveningsequencesflank
andinterruptthe first codons of theFPB-AandFPB-Bmodules, respectively; intron C is placed a few nucleotides downstream
of the FPB-B sequences, marking their separation from the
remainder of thecodingregion (Figures 1-2). Thepositionof introns whithin the ZNF2 gene has interesting implications. Paralogousmembersofmultigenefamilieshaveoftenidentically placed introns (50, 51), therefore, it may be expected thatthe
intron-exon boundaries foundin the ZNF2 geneareconserved
B RQOOKOO/ DEDEDEDE
-KKKRRQCM
FFFSLST
KEEEEEEE <_
Figure4. Helical wheelprojectionofthreeputativea-helices within FPB-Aand
FPB-Bmodules. (A) Putative helices withinFPB-A. Atthetophelical wheel analysisof residues 10-20(seenumbering in Figure 2)atthe bottomof residues
25-34.(B)Aputative a-helix withinFPB-Bfrom 50to57. Circled amino acid correspond to invariant residues. Shading highlights position of conserved
hydrophobic character.Aminoacids whichvaryin distinct membersareall reported
atcorresponding wheel positions; lefttoright typing correspondsto top tobottom presentation in Figure 2. Inthetwohelicesin A, onlyresiduesfromsequences
spanning entirely these regionsarereported.
in other FPB-family members. Coding exons often equate to
distinctprotein domains (52, 53); the FPB A and B modulesmay
turn out to followthis pattern and serve distinctive functions.
Also, onthe basis ofthe intron positions within the FPBarea,
itcanbe hypothesized thatalternative splicing pathwaysleadto
theassemblyofmRNAsencoding polypeptides thatcontainone,
both or noneof the FPB-A and FPB-B modules. The absence
ofFPB-B incHZF3.1, Zfpl and ZFP36 (Figure 2)maytherefore
be theresultofsplicingeventswhich skipped thecognateexons.
The utilization of distinct splicing pathways would allow the assemblyof mRNAsencodingsetsofpolypeptides which differ
intheir FPBcontent andlikely in theirproperties. Transcripts
ofdifferentlengthsarehomologoustothreeof theFPB-containing
cDNAs described here (Figure 3), a result which indicates the
existence of alternative RNAs. Previously reportedgeneswhich
belong to this subset (Zfpl and ZNF7, ZNF8; Figure 2) also
give riseto multiple transcripts (19, 24). Moreover, the small sizes of the FPB-A and FPB-Bexons may not leadto a clear
resolution ofRNAs whichcontain bothormiss either of thetwo. Theisolationof distinct ZNF2homologouscDNAs indicates that alternative splicing also contributes to adjoin FPB coding
sequences to variable exons. cZNF2.1 and cZNF2.2 differ in
72bpplaced,within theformer,immediatelyupstreamof
FPB-A(Figure IB).While the first methionine of the cZNF2.2ORF
A 28 5- IB1S-A Ala- W..:: lmmwqw I-ROOW-v
5666 NucleicAcids Research, Vol. 19, No. 20
is within FPB-A, the additional72 bp exon present in cZNF2. 1 brings in an ATG codon which precedes the FPB coding sequences. Splicing events which skip or insert exon a' may therefore leadtothesynthesisofproteinsdifferinginlength and FPB contentdue to alternativesites ofinitiation of translation. Differentialsplicing leadingtoalternative isoformsof ZFX zinc finger proteins, initiatedatdistinctATGs, has been documented (54). On the other hand, it cannot be excluded that codons upstreamofthe first ATG arerecognizedastranslationstartsites incZNF2.2homologousmRNAs. Severalexamples indicatein fact that, though at reduced efficiency, initiation of translation may occur attriplets other than ATG(55andreferencestherein). Both FPB-A and FPB-B are enriched innegatively charged aminoacids(approximately 20%of the residuesareasparticor glutamic acid)and inhydrophobic leucinesandvalines( > 20% ). Invariant residuesarefoundatthirteenpositionswithin the FPBs analyzed (Figure2). Data base searches did not provide indications of significant similarityto protein motifsof known structure orfunction. Itisofinteresttomentionthat the results of differentsecondarystructureprediction methodsonall aligned sequences(Materials andMethods)consistentlysuggestthatthree regions centered aroundpositions 16 and 30 (FPB-A) andposition 54(FPB-B) haveahigh probabilityof a-helicalfolding. Helical wheelprojections(Figure 4)suggestthatputative a-heliceswithin theseregions would showanamphipathiccharacter; inthetwo FPB-Aputative helices the invariant amino acids would all be positioned on one face of the helix.
Proteinsinvolvedinthecontrol of geneexpression ofteninclude conserved moduleswhich confer
specific
functionalproperties. Common structural motifs as finger repeats and homeoboxes contribute to provide theproducts
oftwomajor
gene families with the ability tointeract specifically with nucleic acids. The presenceofadditional conserved modulesasthepairedbox(56) and the POU domain(57)
definesstructurally
related subsetsof homeoproteins with distinctive properties (58). Although it is difficult tospeculate at thisstage onthe role of FPB modules, theirhighlevel of conservationstrongly suggeststhattheymay have functional and/or structural relevance. In the second hypothesis the presence ofputative amphipathic helical motifs pointsatthepossibilityofinterorintra-molecularinteractions. Recently, the characterization of anothersetofhumancDNA clones encoding polypeptides which contain both zinc finger repeatsandaregion similartothe FPB has been reported (59).ACKNOWLEDGMENTS
We thank J.Skowronski and M.F.Singer for the gift of NTera2-D1 cDNAlibraries, H.Jackle and P.Gruss for the gift
of finger probes, Chris Sander for making the COLLATE package available as well as his own secondary structure prediction program SEGMENT83 and R. Cortese for support and encouragement in the initial phases of this work. We are indebted to P.P.Di Nocera, E.Boncinelli and R.Driscoll for critical reading of the manuscript. This work was supported in partbygrants from P.F.Ingegneria Genetica, C.N.R and from IV Progetto AIDS, Ministero della Sanita.
REFERENCES
1. Diakun,G.P., Fairall, L. andKlug, A. (1986)Nature, 324, 698.
2. Frankel, A.D.,Berg,J.M.,andPabo,C.O.(1987)Proc.Natl.Acad. Sci. USA, 84,4841-4845.
3. Parraga,G., Horvath, S.J., Eisen,A., Taylor, W.E., Hood,L.,Young, E.T., andKievit, R.E. (1988) Science, 241, 1489-1492.
4. Berg,J. (1990)J. Biol. Chem., 265, 6513-6516. 5. Rhodes, D. andKlug, A. (1986) Cell, 46, 123-132. 6. Brown, R.S., andArgos, P. (1986) Nature, 324,215. 7. Evans, R.M., andHollenberg, S.M. (1988) Cell, 52, 1-3.
8. Miller, J.,McLahan, A.D.,andKlug, A. (1985) EMBO J., 4, 1609-1614. 9. Brown, R.S., Sander, C.,andArgos,P.(1985) FEBSLett., 186,271-274. 10. Hartshome, T.A., Blumberg, H., and Young, E.T. (1986)Nature, 320,
283-287.
11. Kadonaga,J.T.,Carner,K.R., Masiarz, F.R.,andTijan,R. (1987)Cell,
51, 1079-1090.
12. Rosenberg, U.B., Schroder, C., Preiss, A., Kienlin, A., Cote, S., Riede, I., andJackle, H. (1986) Nature, 319, 336-339.
13. Tautz, D., Lehman, R., Schnurch, H., Schuh, R., Seifert, E., Kienlin,A.,
Jones,K. andJackle,H. (1987) Nature, 327, 383-389.
14. Schuh, R., Aicher, W., Gaul, U., Cote, S., Preiss, A., Maier, D.,Seifert,
E., Nauber, U., Schroder, C., Kemler, R.,andJackle,H.(1986) Cell,47, 1025-1032.
15. Ruiz iAltaba,A.,Perry-O'Keefe,H.,andMelton,D.A.(1987)EMBOJ., 6,3065-3070.
16. Koster, M., Pieler, T., Poting, A.,andKnochel, W.(1988)EMBOJ.,7, 1735-1741.
17. Chowdhury,K., Deutsch, U.,andGruss, P. (1987) Cell, 48, 771-778. 18. Chavrier, P., Lemaire, P., Relevant, O., Bravo, R.,andChamay,P.(1988)
Mol. Cell. Biol., 8, 1319-1326.
19. Chowdhury, K., Dietrich, S., Balling,R.,Guenet,J.L.andGruss,P.(1989)
Nucleic AcidsRes., 17, 10427-10438.
20. Cunliffe,V.,Koopman, P.,McLaren,A.andTrowsdale,J.(1990)EMBO
J., 9, 197-205.
21. Ruppert, J.M., Kinzler, K.W.,Wong, A.J.,Bigner,S.H., Kao, F.T., Law,
M.L.,Seunaez,H.N., O'Brien,S.J.,andVogelstein,B.(1988)MolCell. Biol., 8, 3104-3113.
22. Pannuti, A.,Lanfrancone,L.,Pascucci,A.,LaMantia, G., Pellicci,P.-G. and Lania, L.(1988)Nucleic AcidsRes.,16,4227-4237.
23. Bellefroid, E.J.,Lecocq, P.J., Benhida, A., Poncelet, D.A., Belayew,A., and Martial,J. A. (1989) DNA, 8, 377-387.
24. Lania,L.Donti, E., Pannuti, A., Pascucci,A.Pengue, G.,Feliciello, I.,
LaMantia, G., Lanfrancone, L.,andPelicci, P-G. (1990) Genomics,6, 333-340.
25. Thiesen, H.J. (1990)NewBiol., 2,363-374.
26. Franzd,A. M.,Archidiacono,N.,Rocchi,M.,Marino,M.andGrimaldi,
G. (1991)Genomics, 7, 728-736.
27. Chavrier, P., Zerial, M., Lemaire, P., Almendral,J., Bravo,R.andChamay,
P. (1988)EMBOJ., 7, 29-35.
28. Lemaire, P., Relevant,O.,Bravo,R.andCharnay,P.(1988) Proc.Natl. Acad.,Sci. USA, 85,4691-4695.
29. Sukhatme, V.P., Cao, X., Chang,L.C.,Tsai-Morris, C.H., Stamenkovic,
D., Ferreira,P.,Cohen, D.R., Edwards,S.A.,Shows, T.B., Curran,T., LeBeau, M.M., andAdamson, E.D. (1988) Cell, 53, 37-43. 30. Joseph, L.J.,LeBeau, M.M., Jamieson,G.A., Acharya, S., Shows,T.B.,
Rowley, J.D.,andSukhatme,V.P.(1988)Proc.Natl.Acad. Sci. USA,85,
7164-7168.
31. Kinzler,K.W., Ruppert,J. M., Bigner,S.H. andVogelstein,B. (1988)
Nature, 332, 371-374.
32. Morishita, K., Parker,D.S., Mucenski, M.L.,Jenkins, N.A., Copeland,
N. G. and Ihle,J. N. (1988) Cell, 54, 831-840.
33. Call,K.M., Glaser, T., Caryn, Y.I., Buckler,A.J., Pellettier, J., Haber,
D.A., Rose, E.A., Kral, A., Yeger, H., Lewis, W.H., Jones, C., and Housman, D.E. (1990) Cell, 60,509-520.
34. Knochel, W., Poting, A., Koster, M., El Baradi, T., Nietfeld, W., Bouwmeester,T.,andPieler,T. (1989)Proc. Natl. Acad. Sci. USA, 86, 6097-6100.
35. McAlpine,P.J.,Shows,T.B., Boucheix, C.,Stranc, L.C., Berent, T.G., Pakstis,A.J.andDoute,R.C.(1989) Cytogeneticsand Cell Genetics, 51, 13-66.
36. Skowronski,J.,Fanning,T.G.andSinger,M.F. (1985)Mol.Cell. Biol., 8, 1385-1397.
37. Grimaldi, G., Manfioletti,G. andSchneider,C.(1987)Nucleic Acids Res., 15,9608.
38. Huynh,T.V., Young,R.A.andDavis,R.W.(1988)InGlover, D. M. (ed.) DNACloning-Apractical approach.IRLPress,Oxford, Vol.I,pp. 49-78. 39. Poustka, A.,Rackwitz, H.R., Frischauf,A.M., Hohn, B.andLehrach,
H. (1984)Proc. Natl. Acad. Sci. USA, 81,4129-4133.
40. Maniatis,T., Fritsch, E.F. ,andSambrook,J. (1982)Molecular Cloning:
Alaboratorymanual. ColdSpringHarborUniversityPress, Cold Spring Harbor.
42. Devereux, J.,Haeberli,P.andSmithies,0. (1984) Nucleic AcidsRes.,12, 387-395.
43. Argos, P. (1987)J. Mol. Biol., 193, 385-396.
44. Gamier, J.,Osguthorpe, and Robson,B.(1978) J. Mol. Biol.,120,97- 120.
45. Gibrat, J.,Gamier,J.andRobson, B.,(1987)J.Mol.Biol., 198, 425-443. 46. Ptitsyn, O.B. and Finkelstein, A.V. (1983)Biopolymers, 22, 15-25.
47. Archidiacono, N., Grimaldi, G., Rocchi, M. and Romeo, G. (1989) Cvtogenet. Cell Genet., 51, 952.
48. Cunliffe, V. andTrowsdale,J. (unpublished). S08686.
49. Mc Keon, F.D., Kirschner, M.W. and Caput, D. (1986) Nature, 319, 463-468.
50. Gilbert, W. (1985)Science,228, 823-824.
51. Blake, C. C. F., (1985) Int. RevCvtol., 93, 149-185.
52. Collins,F., S.andWeissmann, S. M.(1984)Progr. Nucl. AcidRes.Mol. Biol.,31, 317-438.
53. Wilde, C.D., Chow, L.T.,Wefald, F.C. andCowan, N.J. (1982)Proc. Nati. Acad. Sci. USA,79, 96-100.
54. Schneider-Gedicke,A.,Beer-Romero, P.,Brown,L.G.,Mardon, G., Luoh, S.-W., and Page, D.C. (1989) Nature, 342, 708-710.
55. Saris,C., J.M., Domen,J.andBerns,A.(1991) EMBO J., 10, 655-664. 56. Bopp, D.,Burri, M.,Baumgarter, S.,Frigerio, G., and Noll,M. (1986)
Cell, 47, 1033-1040.
57. HerrW.,Sturm,R.A.,Clerc, R.G.,Corcoran, L.M.,Baltimore,D.,Sharp, P.A., Ingraham, H.A., Rosenfeld, M.G., Finney, M., Ruvkun, G., and
Horvitz, H.R. (1988) Genes Dev., 2, 1513-1516.
58. Ruvkun, G. and Finney, M. (1991)Cell, 64, 475-478.
59. Bellefroid,E.J.,Poncelet,D. A.,Lecocq, P.J., Relevant,0.andMartial, J.A. (1991) Proc. Natl. Acad. Sci. USA, 88, 3608-3612.