• Non ci sono risultati.

From CIL to Java bytecode: Semantics-based translation for static analysis leveraging

N/A
N/A
Protected

Academic year: 2021

Condividi "From CIL to Java bytecode: Semantics-based translation for static analysis leveraging"

Copied!
28
0
0

Testo completo

(1)

Contents lists available atScienceDirect

Science

of

Computer

Programming

www.elsevier.com/locate/scico

From

CIL

to

Java

bytecode:

Semantics-based

translation

for

static

analysis

leveraging

Pietro Ferrara

a

,

b

,

,

Agostino Cortesi

b

,

Fausto Spoto

c aJuliaSoft,Verona,Italy

bUniversitàCa’FoscaridiVenezia,Italy cUniversitàdiVerona,Italy

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory:

Received28June2019

Receivedinrevisedform20December2019 Accepted8January2020

Availableonline31January2020

Keywords:

Staticanalysis Abstractinterpretation Javabytecode CIL

AformaltranslationofCIL(i.e.,.Net)bytecodeintoJavabytecodeisintroducedandproved soundwith respecttothelanguagesemantics. Theresultingcodeisthenanalyzedwith Julia,anindustrialstaticanalyzerofJavabytecode.Theoverallprocessoftranslationand analysisisfast,scalestoindustrialprograms,andintroducesanegligiblenumberoffalse alarms. The main contribution ofthis work is to leverage existing,mature, and sound analyzers for Java bytecode by applying them alsoto the widerange of.Net software systems.Experimentalresultsshowtheactualeffectivenessofthisapproachwhenapplied to all the system libraries of the Microsoft .Net framework version 4.0.30319 (about 5 MLOCs).

©2020TheAuthor(s).PublishedbyElsevierB.V.Thisisanopenaccessarticleunderthe CCBY-NC-NDlicense(http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Static analysis infers, at compile-time, properties about the run-time behavior of computer programs. It allows one to verify, forinstance,the absence ofrun-time errors orsecurity breaches. Static analysis applies alsoto compiled code inassemblyorbytecodeformat.ThisisparticularlyinterestingforapplicationsdistributedontheInternet,ordownloaded frompublic(andpossiblyunsafe)applicationrepositories(e.g.,theGooglePlaystore),whenthesourcecodeisnotavailable, buttheuserwouldliketostaticallychecksomesafetyorsecurityproperties.

The analysisofJava bytecode(from now on,JB) forthe Java VirtualMachine hasa long research tradition andmany analyzersexist [1].Some analyses buildonformal mathematicalroots,such asabstractinterpretation [2–6].Moreover,JB makes the design ofstaticanalysis easierby requiringbytecode to be type-checkable [7] andwithout unsafe operations such asfree pointer operations.On thecontrary, CILbytecode,that is,the compiledbytecode usedforthe .Netplatform (from nowon, justCIL), hasnot received muchattention fromthestaticanalysis communityyet.Therefore, whilethere areseveralsyntactic(thatis,notbasedonformal methods)staticanalyzersforCIL,thereareveryfewindustrialsemantic analyzersbasedonformalmethods.Moreover,CILcanbeusedinanunsafe way,thatis,byallowingfreepointeroperations, whichmakesitsstaticanalysisharder.However,theseoperationsareveryoftenusedinverycontrolledcontexts.Hence,in mostcases,astaticanalyzercouldpossiblycapturetheiractualbehavioranyway.

Despitecleardifferences,JBandCILsharestrongsimilarities:botharelow-levelobject-orientedlanguageswhereobjects are stored andshared in theheap. Hence, it is tempting to leverage mature existing staticanalyses and toolsfor JB by translating CILintoequivalent JB andbyrunning thetoolsonthe latter.Obviously,thisintroduces issuesabouttheexact

*

Correspondingauthor.

E-mailaddresses:pietro.ferrara@unive.it(P. Ferrara),cortesi@unive.it(A. Cortesi).

https://doi.org/10.1016/j.scico.2020.102392

0167-6423/©2020TheAuthor(s).PublishedbyElsevierB.V.ThisisanopenaccessarticleundertheCCBY-NC-NDlicense (http://creativecommons.org/licenses/by-nc-nd/4.0/).

(2)

meaningofequivalence betweenCILanditstranslationintoJB.Moreover,thetranslationshouldnotintroducecodeartifacts thatconfusetheanalyzerandshouldworkonindustrial-sizeCILapplications,supportingasmanyunsafepointeroperations aspossible.

1.1. Contribution

Themaincontributionofthiswork1 istheintroductionofatranslationofCILtoJBthatisboththeoreticallysoundand effectiveinpractice,sothatanindustrialstaticanalyzerforJBcanbeappliedto.Net(andinparticularC#)programs.More languagescompileintoJB(e.g.,JavaandScala)andCIL(e.g.,VB.NetandF#),withdistinctfeaturesandcodestructures.Here, wefocusonJavaandC#,thathavesimilarstructureandcompileintocomparablebytecode.

We startbyformalizingtheconcretesemantics ofarepresentative subsetofCILandJB,andthetranslationofCILinto JB.Then,weprovethistranslationsound,inthesensethattheconcretesemanticsoftheinitialCILprogramisequivalentto thatofthetranslatedJBprogram.Thisguaranteesthat,ifweproveapropertyoftheJBprogram,thensuchpropertyholds also forthe original CIL program. Ourimplementation supports thefull set of CILsafe statements.Fig. 7in Appendix B

reportsalltheCILstatementsasdefinedbyECMA335standard [9],andifandhowwetranslatethem.Thenwepresenta deepexperimentalevaluationoverindustrial-sizeopen-sourcepopularprograms,byapplyingtheJuliastaticanalyzer [4] to thetranslatedJB.

Ourexperimentsarefocusedonthreemainresearchhypothesestoprovethescalability,precision, andcoverageofour approach.Inparticular,thisarticleanswersthefollowingresearchquestions:

ResearchQuestion1(Scalability).DoestheCILtoJBtranslationscale,thatis,(i)canitdealwithlibrariesofindustrialsize(100 KLOCs) inafewminutes,and(ii)isitscomputationaltimenegligiblew.r.t.thatrequiredbytheoverallanalysis?

ResearchQuestion2(Precision).DoestheCILtoJBtranslationintroducelessthan10%offalsealarmsw.r.t.theoverallnumberof alarmsproducedbytheanalyzerwhenconsideringhighseveritywarnings?

ResearchQuestion3(Libraries).DespitesupportingonlyasubsetofCIL,doestheCILtoJBtranslationsucceedonatleast95%ofthe systemlibraries?

About scalability,theoverallgoalistoobtainatool thatcan beappliedduring thenormalsoftwaredevelopment life-cycle, thatis,itcanperformtheanalysisateachbuildofthesystem.Therefore,weneedatoolthatanalyzeshundreds of thousandsofLOCsinfewminutes.

About precision,the10%bound waschosen assuchlimitisperceived asstandard bythecommunitywhenevaluating theeffectivenessofstaticanalyzers.Falsealarmsarenotrealissuesinthecode,buttheyareduetosomeformsof approx-imationperformedbytheanalysis.Forinstance,arecentpaper [10] statedthatGoogleadoptsastaticanalyzerduringcode reviewifit“produceslessthan10%effectivefalsepositives”(aka,alarms),sincesuchamountofnoise doesnotcompromise thepracticaleffectivenessofthetoolwhenusedbysoftwaredevelopers.Notethatourresearchquestionreferstothefalse alarms duetothetranslation,andthatwouldnotbeproducedonthesamecodewritteninJavabytheJavaanalyzer.For instance,letusassumethatananalyzerproduces10falsealarmsonaJavaprogram.ResearchQuestion2issatisfiedifthe analysison thesameprogramwrittenin.NET (andtranslatedto JBbeforebeinganalyzedbythe Javaanalyzer)produces atmost11false alarms(the 10produced ontheJava program–that aredueto theimprecisionoftheanalysis– and1 introduced bythetranslation).Inourexperience,thesefalsealarmsusually happenbecausethe.NETcompilerintroduces some patterns(e.g.,whencheckingthenullnessofavariable)thataredifferentfromthoseproducedbytheJavacompiler, andtheJavastaticanalyzerapproximatesthemtoocoarsely.

Withrespecttoquestion 3,itiscrucialthatastaticanalyzerunderstandsthebehavior ofsystemlibraries(e.g.,semantics ofmethodcallsandassumptionsmadeontheirparameters)orotherwiseitcouldonlyrelyonmanualannotationsor (pos-siblyunsound) assumptions ontheirexecution. However,systemlibraries needtoaccessmemorythroughunsafepointer. Javaallowssuchbehaviors throughnativemethods(writteninlanguagesotherthanJavaandboundthroughtheJavaNative Interface),while.Net allowsunsafepointersdirectlyinits samecode.Inthesecases,ourtranslationproduces Javanative methods.ThroughResearchQuestion3,weensurethattheeffortofmanuallyannotating.Netlibrariesiscomparabletothat neededforJava.

Ourtranslationcouldalsobeapplied toletJavaandC# codeinteroperate,bycompilingthembothintoJavabytecode. Thisrequires,however,thetranslationofthelibraries calledby theapplications,includingthestandardlibraryofC#that containsrelevantportionsofunsafeandnativecode,whichisoutofthescopeofthiswork.

1 Thispaperisarevisedandextendedversionof [8],distinguishedpaperawardatICSE-FormaliSE(2018).Inparticular,thispaperaddsdetailedformal

(3)

1.2. Relatedwork

Few attemptshavebeenmadeinthepasttotranslate CILtoJB.Grasshopper isprobablythemostpopularone. How-ever, it is not available anymore.2 As far as we can see, it was abandoned about a decade ago, and we cannot make any comparisonwith our translation. A similar tool is CLR2JVM [11]: it translatesCIL to an intermediate X M LC L R rep-resentation, that can be then translated into X M LJ V M, and finally to JB. As far as we can see,3 the tool should read .NET executables, but it failed to parse all the executable files of our experiments (see Section 5 for the complete list). This probably happened because CLR2JVM is not maintained any more (the last commit to the repository https:// sourceforge.net/p/xmlvm/code/HEAD/tree/trunk/xmlvm/src/clr2jvm/occurredmorethansixyearsago),anditdoesnot sup-portthelastCILversions.NeitherGrasshoppernorCLR2JVMhasanydocumentationordiscussionabouthowthetranslation isperformed(in particular,howtheyhandleinstructionsthat aredifferentbetweenCILandJB,such asdirectreferences). Therefore,asfaraswecansee,ourtranslationfromCILtoJBistheonlyonethatworksonrecentreleasesofCILandJB, andisformalizedandprovedsound.

Other translationsbetweenlow-level languagesexist,justified bythe needofapplying verificationtoolsthatwork on aspecificlanguage only.Forinstance,[12] definesa translationfromBoogieintoWhyMLandproves itssoundness,aswe havedonefromCILtoJB.Similartranslationsworkalsoatruntime,inparticularinsideajust-in-timecompiler,asin [13]. However,wedidnotfindanyliteratureonthetranslationofCILintoJBforindustrial-sizesoftware.

Many other static analyzers for .Net exist, in particular forC# code. There are tools that verify compliance to some guideline,suchasFxcop [14] andCoverityPrevent [15].Others,suchasNDepend [16] andCodeMetrics [17],provide met-ricsaboutthe codeunderanalysis. ReSharper [18] applies syntacticalcodeinspections, finds codesmells andguarantees complianceagainstcodingstandards.Asfaraswecansee,thereexistonlytwomainfundamentaltoolswithscientificbase: Spec# [19],an extension ofC#withstaticcheckingofvariouskindsofmanual specifications,andCodeContracts [20],an abstractinterpretation-based staticanalyzerfor CIL.Inthe Javaworld, the numberofstaticanalyzers basedon syntactic reasoning(thatis,not basedonformal methods),such asCheckstyle [21],FindBugs [22] andPMD [23],iscomparableto that for.Net. However, Java attractedmuch more attentionfromthe scientific community, andmoresemantic analyzers havebeenintroducedduringthelastdecade,suchasCodeSonar [24],ThreadSafe [25] andJulia [4].Instead,therearequite fewer semantic static analyzers for.NET. In addition,few semantic analyzers, such asWALA [26], havebeen applied to variouslanguages(e.g.,JavaandJavaScript),butwithad-hoctranslationofthesourcetotheanalyzedlanguage.

Ourapproach lets usapply allthe Javaanalyzers on.Net programs (almost)forfree, that is,by translatingCIL intoJB andby using theanalyzers asthey are (weexpect that few manual annotations are neededtoimprove theprecision of theanalysis,inparticularwhendealingwithlibrarycalls).WehavealsostudiedperformanceandresultswithJulia.Asfar asweknow, ourworkisthe firsttranslationofCILintoJBforstaticanalysisthat isprovento besound andcomes with evidencethatthistranslationapplies toindustrial-sizesoftware, withresultsthatarecomparableintermsofprecision and efficiencytothoseobtainedonJB.

2. Background

Thissection providessome backgroundon CILandJB,arunningexample,anda discussiononthe architectureof our approach.ForanexhaustivedefinitionofJBandCIL,see [7] and [9],respectively.

2.1. CILandJB

Bytecode is a machine-independent low-level programming language, used as target of the compilation of high-level languages, that hence becomes machine-independent. Bytecode languages are interpreted by their corresponding virtual machine, specific to each execution architecture. Both .Net and Java compile into bytecode. However, they use distinct instructionsandvirtual machines..Net compilesintoCIL,whileJavacompiles intoJB.Thesehavestrongsimilarities:both use an operand stack for temporary values andan array of local variables standing for source code variables; both are object-oriented, withinstructions forobject creation, field accessand virtual methoddispatch. Despitethese undeniable similarities, CIL andJB differforthe wayof performing parameter passing (CIL uses aspecific array ofvariables forthe formal parameters, while JB merges them into the array of local variables); they handle object creation differently (CIL createsandinitializes theobjectatthe sametime, whiletheseare distinctoperationsinJB);they allocatememoryslots differently(inCILeach valueusesaslot,whileJBuses1or2slotsper32- or64-bitvalues,respectively);finally,CILuses pointersexplicitly,alsoin type-unsafeways, while JBhasnonotion ofpointer.We focusourformalization ona minimal representativesubsetofJBandCIL,asdefinedinFig.1.Thatfigurepresentsbytecodeinstructionsfor:

arithmetic: JBhastype-specificoperations,suchas

iadd

and

ladd

toaddtwointegerorlongvalues,respectively.Instead, CILhasgenericoperations,suchas

add

toaddtwonumericalvaluesofthesametype;

2 Wewereunabletoaccessthewebsitehttp://dev.mainsoft.comthat,frompastforumdiscussions(http://stackoverflow.com/questions/95163/differences -between-msil-and-java-bytecode),seemstobethewebsiteofthetool.

(4)

JB CIL iadd

add (arith. op.)

ladd iload i ldloc i (local vars) lload i stloc i aload i ldarg i istore i lstore i astore i invokevirtual call (meth. call) invokestatic new T newobj T(· · · ) (objects) getfield f ldfld f putfield f stfld f

if_icmpgt bgt (cond. branch)

dup dup (stack) dup2 ldloca i (pointers) stind ldind

Fig. 1. JB and CIL minimal bytecode languages.

localvariablesaccess: JBhasasinglearrayofvariablesforbothlocalvariablesandmethodarguments,andreadsandwrites valuesfromthisarraythroughx

load

andx

store

,wherexis

i

forintegervalues,

l

forlongvaluesand

a

forreferences, respectively.Inthisarray,64bitsvaluesusetwosubsequentslots.CIL,instead,usestwoarrays:oneformethod’sarguments (

ldarg

i

loadsthevalueofthei-thargument)andoneforlocalvariables(

ldloc

i

and

stloc

i

readandwrite the i-thlocalvariable,respectively).Inaddition,itusesoneslotbothfor32- and64-bitvariables;

methodcall: JBhasseveralkindsofmethodcallinstructions,suchas

invokevirtual

and

invokestatic

.Instead,CIL hasaunique

call

instruction;

objectmanipulation: inJB,instructions

new

,

getfield

,and

putfield

allocateanewobject,read,andwriteitsfields, respectively.CILhassimilarinstructions

newobj

,thatalsocallstheconstructor,

ldfld

and

stfld

;

conditionalbranch: JBhastype-specificconditionalbranchinstructionssuchas

if_icmpgt

(tobranchifthegreaterthan operatorreturnstrueonthetopmosttwointegervaluesoftheoperandstack);CILhasgenericinstructionssuchas

bgt

; stack: CILduplicates thetopvalueofthestackthroughthe

dup

instruction.JBdoesthesamewith

dup

for32bitsvalues and

dup2

for64bitsvalues;

pointers: CIL contains some instructions to load the address of a local variable (

ldloca

i

), and to store and load a value into the memory cell pointedby a reference (

stind

and

ldind

, respectively). Instead, JB has no direct pointer manipulation.

Intherestofthisarticle,StCILandStJBdenoteCILandJBinstructionsorstatements,respectively.Amethod(bothinCIL

andJB)isrepresentedby(i) asequenceof(possiblyconditionalandbranching)statements,and(ii)thenumberandstatic typesofargumentsandlocalvariables.

2.2. Runningexample

Fig.2showstherunningexamplethat Sections3and4usetoclarifytheformalization.TheC#codeinFig.2adefines a class

Wrap

that wrapsan integer value, and a staticmethod

WrapsCollection

that, given an integer

n

, returns a collectionofn wrappers containingvaluesfrom0 to n

1.Fig.2bpresentsthe(simplified)CILobtainedfromits compi-lation:asusualwithCIL,codeisunstructured(e.g.,therearebranchesatlines 7and 20),andeach sourcecodestatement could be translatedinto manybytecodestatements (e.g.,line 7 inFig. 2a compiles into lines 3and 4in Fig.2b). Fig.2c presentstheresultsofourtranslationofCILintoJB.Thenextsectionsexplainthestepsofthetranslation.Firstofall,notice some ofthedifferenceshighlightedinSection 2.1.Namely,thetype-genericCILstatement

stloc.1

atline 6ofFig.2bis translatedintothe type-specific

istore_2

atline 7ofFig. 2c.Similarly,

newobj

(line 11)istranslatedinto multipleJB statements(line 12-16).Therunningexamplecontainssomeinstructionsthatarenotpartoftheminimallanguagedefined inSection 2.1,andinparticular(i)

ldc

and

iconst_*

thatloadconstant(integer)valuesinCILandJB,respectively,(ii)

blt

and

if_icmplt

forconditionalbranchingwhenan(integer)valueisstrictlylessthananother,(iii)

invokespecial

(5)

1 public class Wrap

2 {

3 int f ;

4 Wrap(int f) { this . f = f ; }

5 static ICollection <Wrap>WrapsCollection(int n)

6 {

7 ICollection <Wrap> result = new List<Wrap>();

8 for ( int i = 0; i < n; i ++)

9 result .Add(new Wrap(i));

10 return result ;

11 }

12 }

(a)TheC#sourceoftherunningexample.

1 static ICollection <Wrap>

2 WrapsCollection(int n) { 3 newobj List<Wrap>::.ctor() 4 stloc.0 5 ldc.0 6 stloc.1 7 br #18 8 9 ldloc.0 10 ldloc.1 11 newobj Wrap::.ctor(int32)

12 call ICollection <Wrap>::Add

13 ldloc.1 14 ldc.1 15 add 16 stloc.1 17 18 ldloc.1 19 ldarg.0 20 blt #9 21 22 ldloc.0 23 ret 24 }

(b)Compiling(a)intoCIL.

1 static WrapsCollection(I)LICollection {

2 new List

3 dup

4 invokespecial List.< init >

5 astore_1 6 iconst_0 7 istore_2 8 goto 23 9 10 aload_1 11 iload_2 12 istore_3 13 new Wrap 14 dup 15 iload_3 16 invokespecial Wrap.<init>

17 invokevirtual <ICollection .Add>

18 iload_2 19 iconst_1 20 iadd 21 istore_2 22 23 iload_2 24 iload_0 25 if_icmplt 10 26 27 aload_1 28 areturn 29 } (c)Translating(b)intoJB.

Fig. 2. The C# code, CIL, and JB of the running example.

1 void init ( ref int i )

2 { 3 i ++; 4 } 5 6 void run() 7 { 8 int i =0; 9 init ( ref i ); 10 } (a)C#code

1 void init (WrapRef i) {

2 i .value = i .value + 1;

3 }

4

5 int run() {

6 int i = 0;

7 WrapRef wrap = new WrapRef();

8 wrap.value = i ;

9 init (wrap);

10 i = wrap.value;

11 }

(b)Javacode

1 void init ( ref A& A)

2 { 3 ldarg.0 4 ldarg.0 5 ldind.i4 6 ldc. i4 .1 7 add 8 stind.i4 9 } 10 11 int run() 12 { 13 ldc. i4 .0 14 stloc.0 15 ldloca.s 0

16 call void Temporary.Foo

17 ::’ init ’( int32&)

18 }

(c)CIL

Fig. 3. CIL code usingrefparameters.

2.2.1. Examplewithdirectreferences

SafeC#codeadoptsdirectpointersonlyfor

out

and

ref

methodparameters.Theseparameters canbeassigned(and readaswell incaseof

ref

)insidethemethod,and“anychangetotheparameterinthecalledmethodisreflectedinthe callingmethod”.4 Inourtranslation,webuild wrapperobjectstosoundlyrepresenttheir semantics.Consider forinstance theC#codeinFig.3a.Method

init

receivesa

ref

parameteranditincrementsitbyone.Thisiscompiled(Fig.3c)into amethodthat readsthevalue pointedby thedirectreference(line 5),andwritesit(line 8). Ourgoalistotranslate this code intothe Javacode inFig. 3b: we simulatethedirect referenceby constructing awrapper object(line 7), assigning thevalueofthelocalvariabletofield

value

ofthewrapper(line 8),andthenpropagatingbacktheresultsofthecallto

init

(line 9)byassigningthevalueofthelocalvariablewiththeonestoredinthewrapper(line 10).Inaddition,the

ref

parameterisreplacedbythetypeofthewrapperobject.

(6)

Fig. 4. Architecture of the analysis.

2.3. Julia

Julia [4] isa static analyzerforJB, based onabstract interpretation [2]. It transforms JBto basic blocksof code,that it analyzesthroughafixpointalgorithm.Analysesare constraint-basedordenotational.Currently, Juliafeaturesaround 70 checkers,includingnullness,termination,synchronizationandtaintanalysis.SinceJuliaworksonJB,inprincipleitanalyzes anyprogramminglanguage thatcompilesintothatbytecode.Inparticular,inthisarticleweapplyittotheanalysisofthe compilation of CILinto JB.Note thatJulia verifies that the analyzedbytecode iswell-formed, that is, itmainly performs sanity checks to avoid the static analyzerto fail while analyzing the bytecode. Therefore, all the translated bytecode is verified by Julia before beinganalyzed. Julia relies onJava annotations in order to allow a userto specify the semantic modelofsomecomponents(e.g.,ifamethodisanentrypointinaparticularruntimeenvironment,oranAPIcallreturns user input). https://static.juliasoft.com/docs/latest/annotations.html reportsthe full list of Julia’s annotations. While these annotationsaddsemanticinformationabouttheapplicationandthelibraries,theydonotaffectthetranslationfromCILto JB.

Fig.4isahighlevelview ofouranalysis,wherethegraycomponentshavebeenimplementedandmodifiedtosupport .NETanalyses.Javacodeiscompiledby

javac

intoa

jar

file,thenparsedthroughtheByteCodeEngineeringLibrary [27] (BCEL). Juliareceives thislatterformat, appliesits analysis(byusingmanycomponents such asthe checkers,thatdefine the analysesrelying on variousabstractdomains, afixpoint engine,andtheframework specifyingthe semanticsofsome specific componentsoftheprogramminglanguage),andoutputsalistofwarnings.Theaddedcomponentinourapproach isthetranslationofCILintoBCEL format(greyarrowintheupperpartofFig.4).SinceaprograminBCELformatcanbe dumpedintoajarfile,wecandumpa.dllfileinthisformat.

We instructed Julia by adding manual annotations to the main components of the .NET run-time environments. All together, this amounts to about 1.500 annotations on the main library APIs. For instance, we annotated field System

.

Decimal

.

One to specify that it is externally initialized with Julia’s annotation Injected, and method System

.

Web

.

HttpRequest

.

Params withUntrustedUserInputto specifythisreturns a userinput (thatis,it isa source forJulia’s In-jectionanalysis).

3. Concretesemantics 3.1. Notation

Webrieflyrecallsomestandardnotationadoptedbythefurtherformalizationinthissectionandthenextone. Commonmathematicalelements. Firstofall,weadoptedstandardsetnotations,thatis:(i)a

A denotesthatelementa is containedinset A,(ii) A

×

B denotes theCartesianproduct ofsets A and B (thatis,theset ofall orderedpairs

(

a

,

b

)

suchthata

A andb

V ,(iii)

and

denotesetunionandintersection,respectively,and(iv)

|

A

|

denotesthecardinality (thatis,thenumberofelements)ofset A.

Aboutfunctions, wedenoteby (i) F

:

D

C thesetofallfunctionsrelatingelementsinadomain D toacodomain C , (ii)

[

d

→

c

]

thefunctionthatrelateselementd toc,(iii) f

[

d

→

c

]

thefunctionthatbehaveslike f exceptforelementd that isrelatedtoc,and(iv) f

(

d

)

theapplicationoffunction f totheelementofthedomaind (e.g., f

(

d

)

=

c if f

= [

d

→

c

]

).

We represent lists and stacks as functions mapping natural indexes to elements. In particular, a stack s (or list) of elements in T isa functionin

N

T suchthat

i

∈ N : ∀

i1

i

:

i1

dom

(

s

)

∧ ∀

i2

>

i

:

i2

/

dom

(

s

)

; dom

(

s

)

denotes the

domainofthegivenfunction(thatis,theindexesonwhichstacks isdefinedinthisparticularcase).Wewillrefertoi as theheightofs (height

(

s

)

).Givenastacks andanelemente,s

::

e denotesastackwhosetopelement(thatis,theonewith

(7)

typeOf(v1)=typeOf(v2)

add, (s::v1::v2,l,a,h) →CIL(s:: (v1+v2),l,a,h) (add)

ldloc i, (s,l,a,h) →CIL(s::l(i),l,a,h) (ldloc)

stloc i, (s::v,l,a,h) →CIL(s,l[i →v],a,h) (stloc)

ldarg i, (s,l,a,h) →CIL(s::a(i),l,a,h) (ldarg)

isStatic(m(arg0,· · · , argi))=false∧t= null∧

body(m(arg0,· · · , argi), (t,v1,· · · ,vi)), ([], ∅, [0→t,j→vj:j∈ [1..i]],h) →CIL(s,l,a,h)

call m(arg1,· · · , argi), (s::t::v1:: · · · ::vi,l,a,h) →CIL(s,l,a,h)

(call)

isStatic(m(arg0,· · · , argi))=true∧

body(m(arg0,· · · , argi), (v1,· · · ,vi)), ([], ∅, [j−1→vj:j∈ [1..i]],h) →CIL(s,l,a,h)

call m(arg1,· · · , argi), (s::v1:: · · · ::vi,l,a,h) →CIL(s,l,a,h)

(call static)

fresh(T,h)= (r,h1)body(ctor(arg1,· · · , argi), (v1,· · · ,vi)), ([], ∅, [0→r,j→vj:j∈ [1..i]],h1) →CIL(s,l,a,h)

newobj T(a1,· · · , ai), (s::v1:: · · · ::vi,l,a,h) →CIL(s::r,l,a,h)

(newobj) o= null ldfld f, (s::o,l,a,h) →CIL(s::h(o)(f),l,a,h) (ldfld) o= null s =h(o)[f →v] stfld f, (s::o::v,l,a,h) →CIL(s,l,a,h[o→s]) (stfld) typeOf(v1)=typeOf(v2)v1>v2 bgt l, (s::v1::v2,l,a,h) →CIL l, (s,l,a,h)

(bgt true) typeOf(v1)=typeOf(v2)v1≤v2

bgt l, (s::v1::v2,l,a,h) →CIL(s,l,a,h) (bgt false) ldloca i, (s,l,a,h) →CIL(s::ri,l,a,h) (ldloca) stind, (s::ri::v,l,a,h) →CIL(s,l,a,h[ri→v]) (stind) dup, (s::v,l,a,h) →CIL(s::v::v,l,a,h) (dup) ldind, (s::ri,l,a,h) →CIL(s::h(ri),l,a,h) (ldind)

Fig. 5. Concrete CIL semantics.

thehighestindex)ise followedbythestacks.Inaddition,

< ...

>

denotesalistofelements(e.g.,

<

a

,

b

>

denotesthelist representedbythefunction

[

0

→

a

,

1

→

b

]

).

Semantics. Whenformalizingconcretestatesandsemantics,wedenotedby



thesetofconcretestatesofexecutionforJB (



JB)andCIL(



CIL).ThesmallstepsemanticsofJBandCILisdenotedby

JBand

CIL,respectively,where,forinstance,

st

,

σ

1

JB

σ

2 representsthattheexecutionofJBstatementstwithanentrystate

σ

1 resultsintheexitstate

σ

2.Thesmall

stepsemanticsisdefinedthroughinferencerules.Forinstance,

n

=

n

+

1

increment

,n

n

formalizesthatwhenweexecutestatementincrementonavaluen weobtainavaluendefinedasn

=

n

+

1.

Object-orientedcomponents. We denote the sets of referenceand numerical valuesby Ref andNum, respectively, and valuesby Val

=

Ref

Num.As usual forobject-oriented programminglanguages,an objectis amap from fieldnames to values,andaheapisamapfromreferencestoobjects.Formally,Heap

:

Ref

Field

Val,whereFieldisthesetcontaining all field names. fresh

(

T

,

h

)

= (

r

,

h

)

allocates an objectof type T in heap handreturns (i) thereference r ofthe freshly allocatedobject,and(ii)theheaphresultingfromtheallocationofmemoryonh.

Forsimplicity,we consideronlyinteger(Int) andlong(Long)numericaltypes (NumTypes

= {

Int

,

Long

}

), andreferences (Ref).Givenavaluev,typeOf

(

v

)

returnsitstype(Int,Long,orRef).SinceJBinstructionsoftenprependaprefixtodistinguish instructionsdealingwithdifferenttypes(e.g.,

iadd

and

ladd

),wedefineasupportfunctionJVMprefix that,givenatypet, returnstheprefixofthegiventype(i.e.,

i

ift

=

Int,

l

ift

=

Long,and

a

ift

=

Ref).WedefinebyWRefanobjecttypewith auniquefieldvalue.

Given a method signature

m

and a list of arguments L (with the receiver in the first argument if

m

is not static), body

(m,

L

)

: N →

St returns the body of the method resolving the call, that is, a sequence of statements (represented byafunctionmappingindexestostatements).Similarly,eachstatementbelongstoamethod;hencegetBody

(st)

=

b isthe bodyofthemethodwhere

st

occurs.Finally,isStatic(

m

)meansthat

m

isstatic.

3.2. CIL

WedefinetheconcretesemanticsoftheCILfragmentofSection2.1.

ConcreteState. AlocalstateinCILiscomposedbyastackofvaluesorreferencetolocalvariablesStack

: N →

Val

RefLoc

(where ri

RefLoc representsthecell’s referenceof thei-thlocalvariable), anarray oflocalvariables Loc

: N →

Val,and

an array of method arguments Arg

: N →

Val. A concrete CIL state consistsof a local state anda heap, that is,



CIL

=

Stack

×

Loc

×

Arg

×

Heap.

ConcreteSemantics. Fig.5showstheconcreteCILsemantics

st,

σ

CIL

σ

.Forastatement

st

andanentrystate

σ

,it

(8)

typeOf(v)=Long dup, (s::v,l,h) →JB(s::v::v,l,h)

(dup) typeOf(v1)=Long

dup2, (s::v1::v2,l,h) →JB(s::v1::v2::v1::v2,l,h)

(dup2 32) typeOf(v)=Long

dup2, (s::v,l,h) →JB(s::v::v,l,h)

(dup2 64) typeOf(v1)=Int∧typeOf(v2)=Int

iadd, (s::v1::v2,l,h) →JB(s:: (v1+v2),l,h) (iadd)

typeOf(v1)=Long∧typeOf(v2)=Long

ladd, (s::v1::v2,l,h) →JB(s:: (v1+v2),l,h) (ladd) x=JVMprefix(typeOf(l(i))) xload i, (s,l,h) →JB(s::l(i),l,h) (xload) x=JVMprefix(typeOf(v)) xstore i, (s::v,l,h) →JB(s,l[i →v],h) (xstore)

isStatic(m(arg0,· · · , argi))=false∧t= null∧

body(m(arg0,· · · , argi), (t,v1,· · · ,vi)), ([], [0→t,j→vj:j∈ [1..i]],h) →JB(s,l,h)

invokevirtual m(arg1,· · · , argi), (s::t::v1:: · · · ::vi,l,h) →JB(s,l,h)

(invokevirtual)

isStatic(m(arg0,· · · , argi))=true∧

body(m(arg0,· · · , argi), (v1,· · · ,vi)), ([], [j−1→vj:j∈ [1..i]],h) →JB(s,l,h)

invokestatic m(arg1,· · · , argi), (s::v1:: · · · ::vi,l,h) →JB(s,l,h)

(invokestatic) fresh(T,h)= (r,h) new T, (s,l,h) →JB(s::r,l,h) (new) o= null getfield f, (s::o,l,h) →JB(s::h(o)(f),l,h) (getfield) o= null s=h(o)[f →v] putfield f, (s::o::v,l,h) →JB(s,l,h[o→s])

(putfield) typeOf(v1)=Int∧typeOf(v2)=Int∧v1>v2

if_icmpgt l, (s::v1::v2,l,h) →JB l, (s,l,h)



if_icmpgt true



typeOf(v1)=Int∧typeOf(v2)=Int∧v1≤v2

if_icmpgt l, (s::v1::v2,l,h) →JB(s,l,h)



if_icmpgt false



Fig. 6. Concrete JB semantics.

executeisthatat

l

.Otherwise,thenextinstructiontoexecuteisimplicitlyassumedtobethesubsequentone,sequentially (ifany).

Forthemostpart,theconcretesemanticsisastraightforwardformalizationoftherun-timesemanticsdefinedbytheCIL ECMAStandard [9].Forinstance,rule

add

popsthetwotopmostvaluesoftheoperandstackandreplacesthemwiththeir addition.However,itssemanticsisdefinediffthetwovalueshavethesametype.Instead,

ldloc

i

pushestotheoperand stackthevalue ofthei-thlocalvariables,while

stloc

i

storesthetopoftheoperandstackintothe i-thlocalvariable. Statements workingwithobjects,such as

ldfld

and

stfld

,read fromandwrite intothe heap,iftheir receiveris not

null

.Rules

call

and

call

static

createaframe(i.e.,anarrayofarguments,anemptyarrayoflocalvariablesandan empty operandstack),executethecalleeandleaveitsreturnedvalue onthestack,ifany.Forsimplicity,theformalization assumes that there isno returned value.5 Finally,

ldloca

i

loadsthe reference to the i-th local variable to the stack (represented byri),

stind

storesthegivenvalue tothegivenreference,and

ldind

loadsthevalue pointedbythegiven reference.

3.2.1. Runningexample

Consider the runningexamplein Fig.2bandapply theconcrete semanticswhen

n

is 1,thatis, whentheentrystate consistsofanemptyoperandstackandofanarrayoflocalvariables,whilethevalueoftheargumentsis

[

0

→

1

]

.Assume that

newobj

allocatestheobjectataddress#1.Then afterthefirstblock(lines 3-7)theaddressoftheobjectisstoredin localvariable 0,localvariable 1(representingvariable

i

ofthesourceprogram)holds0,andaddress#1 intheheapholds an objectoftype

List

Collection

.Formally,theconcretestateatline 7is

(

∅,

[

0

→

#1

,

1

→

0

],

[

0

→

1

],

[

#1

→<>])

, where

<>

standsfortheemptylist.Thenthebodyofthe

for

loop(lines 9-16)isexecutedonceandcreatesanew

Wrap

object(assumeataddress#2)wrapping0,addsittothelistataddress#1 andincrementscounter

i

(i.e.,localvariable 0) by1.Hencetheexecutionoftheconcretesemanticsofthebodyoftheloopleadstotheconcretestate

σ

= (∅,

[

0

→

#1

,

1

→

1

],

[

0

→

1

],

[

#1

→<

#2

>,

#2

→ [

f

→

0

]])

atline 16.Theconditionatline 20willthenroutetheprogramtoline 22(since argument 0andlocalvariable 1hold1).Inconclusion,theconcretesemanticswillreachstatement

ret

atline 23witha state

σ

CIL equalto

σ

,butwheretheoperandstackcontainsreference#1.

3.3. JB

WedefinetheconcretesemanticsoftheJBfragmentofSection2.1.

ConcreteState. A localstateinJBconsistsofastackofvaluesStack

: N →

Val andanarrayoflocalvariablesLoc

: N →

Val.AconcreteJBstateconsistsofalocalstateandaheap,thatis,



JB

=

Stack

×

Loc

×

Heap.

5 Thiswouldhaverequiredtodefinetwodistinctrules(theonewithreturnedvalueswherethefinalstackcontainsthereturnedvaluefollowedby

thepreviousstackinadditiontotheonealreadyformalized).However,theimplementationfullysupportsthiscaseaswell,andweomitithereonlyto improvethereadabilityoftheformalizationandformalproofs.

(9)

ConcreteSemantics. Fig.6reportstheconcreteJBsemantics

JB.Forastatement

st

andanentrystate

σ

,ityieldsthe

state

σ

resultingfromtheexecutionof

st

over

σ

.Forthemostpart,thebehaviorofthissemanticsisidenticaltothatfor CIL.ThemaindifferencesarethatJBinstructionsworkonspecifictypes(e.g.,whileCIL

add

statementaddstwovaluesof thesametype,JB

iadd

and

ladd

statementsadd thevaluesiffthey areboth

int

or

long

,respectively),and

new

only allocatesanewobject,whileCIL

newobj

statementsalsocallsaconstructor.

3.3.1. Runningexample

TheapplicationoftheJBconcretesemanticsissimilar tothatforCIL,butthere aretwominordifferences:(i)thereis onlyonearrayoflocalvariablesrepresentingbothCILargumentsandlocalvariables(e.g.,CILlocalvariable 0isrepresented byJBlocalvariable 1,sincethefirstlocalvariableholdstheargumentofthemethod);and(ii)thereisaninstrumentation localvariable6 atindex 3.Therefore,afterweapplytheJBconcretesemanticsfromtheentrystatethatmapstheargument to1,weobtaintheconcretestate

(

[

0

→

#1

],

[

0

→

1

,

1

→

#1

,

2

→

1

,

3

→

0

],

[

#1

→<

#2

>,

#2

→ [

f

→

0

]])

.

4. FromCILtoJB 4.1. Concretestates

Function

T

σ

J K

: 

CIL

→ 

JBtranslates CILconcretestates intoJBconcrete states(where l representsthenumber of

elementsinthestackofs):

T

σ

J(

s

,

l

,

a

,

h

)

K

= (

s

,

cnvrtLoc

(

l

,

a

),

hl

)

.

Thisfunction(i) replacesdirectreferenceswithwrapperobjects,and(ii)mergesthearray oflocalvariablesand argu-ments,adjustingvariableindexesfor64bitsvalues.Formally:

i

dom(s

),l

=

height(s

),

s

=



i

→



s

(i)

if s

(i)

Val

ri if s

(i)

RefLoc



where h1

=

h

,

and

(

hi

,

ri

)

=

allocWrp(hi−1

,i),

allocWrp(h

,

j)

=

(

h

, null)

if s

(

j)

Val

(

h

[

r

→

h

(r)

[value →

l

(

j)

]])

where

(r,

h

)

=

fresh(WRef

,

h

)

if s

(i)

RefLoc

Intuitively,eachdirectreferenceintheoperandstack(thatis,s

(

i

)

RefLoc)isreplacedbyanotherreferencepointingto

awrapperobjectfreshlyallocatedandcontaininginitsfieldthevaluepointedbytheoriginaldirectreference. Then,foranarrayofvaluesb andanindexi,thefollowingfunctioncountsthe64bitstypesamongthefirsti:

64bi

= |{

j

:

0

j

<

i and b

[

j

]

is a 64 bit value

}|

ThenthearraycnvrtLoc

(

l

,

a

)

isdefinedasfollows:

0

i

<

|

a

| :

cnvrtLoc(l

,

a

)

[

i

+

64ia

] =

a

[

i

]

0

i

<

|

l

| :

cnvrtLoc(l

,

a

)

[|

a

| +

64|aa|

+

i

+

64il

] =

l

[

i

]

Definition1.Considerafunction f thatmapselementsofa pairofarrays

(

a

,

b

)

intoelements ofan arrayc.Let f

(

a

[

i

])

=

c

[

i

]

and f

(

b

[

i

])

=

c

[

i

]

.Thefunction f isanidentityembedding ifthefollowingconditionshold:

1

)

0

i

<

|

a

| :

a

[

i

] =

c

[

i

]

and

j

<

|

b

| :

b

[

j

] =

c

[

j

]

2

)

0

i,j

<

|

a

| :

i

j

i

j and

0

h,k

<

|

b

| :

h

k

h

k

3

)

{

i

:

0

i

≤ |

a

|} ∩ {

j

:

0

j

≤ |

b

|} = ∅

Lemma1.ThefunctioncnvrtLoc isanidentityembedding.

Proof. Itissufficienttoobservethat,byconstruction,thefunctionconcatenatesthetwoarraysbyshiftingindexeswhena 64bitsvalueoccurs,hencepreservingthevaluesandtheorderingoftheelements’indexes.

2

4.1.1. Runningexample

Consider theCILexitstatecomputedinSection3.2,thatis,

σ

CIL

= ([

0

→

#1

],

[

0

→

#1

,

1

→

1

],

[

0

→

1

],

[

#1

→<

#2

>,

#2

→ [

f

→

0

]])

. TheCIL toJB translationcomputes

T

σ

J

σ

CIL

K

= ([

0

→

#1

],

[

0

→

1

,

1

→

#1

,

2

→

1

],

[

#1

→<

#2

>,

#2

→

[

f

→

0

]])

(it merges CILargumentsandlocalvariables). Thisstate isalmost identicaltotheexit stateofthe JBconcrete semanticsappliedtotherunningexample(Section3.3)exceptfortheinstrumentationlocalvariableatindex 3.

(10)

TJdup,s:: t,l,a,wK =  dup if t=Long dup2 if t=Long TJadd,s:: t1:: t2,l,a,wK =  iadd if t1=t2=Int ladd if t1=t2=Long

TJldloc i,s,l,a,wK = xloadj where j= |a| +64a|a|+i+64ilx=JVMprefix(typeOf(l(i)))

TJstloc i,s:: t,l,a,wK = xstorej where j= |a| +64|aa+i+64i

l∧x=JVMprefix(typeOf(l(i)))

TJldarg i,s,l,a,wK = xloadj where j=i+64i

ax=JVMprefix(typeOf(a(i)))

TJcall m(arg1,· · · , argi), = invoke; aloadp1idx1; getfield value ;xidx1storep 2 idx1 ; · · · s::t1:: · · · ::ti,l,a,w::p1:: · · · ::piK · · · aloadp1idx

j; getfield value ;xidxjstorep

2 idxj;

whereinvoke= 

invokestatic m(arg1,· · · , argi) if isStatic(m(arg1,· · ·, argi))

invokevirtual m(arg1,· · · , argi) otherwise

{idx1,· · · ,idxj} = {k: argk∈RefLoc}

k∈ [1..j] :xidxk=JVMprefix(typeOf(l(p 2 idxk)))∧ ∀r∈ [1..i] :pi= (p 1 i,p 2 j)

TJnewobj T(a1,· · · , ai), = xistoreidxi; · · · ;x1storeidx1; new T ; dup ;

s::t1:: · · · ::ti,l,a,wK x1loadidx1; · · · ;xiloadidxi; invokevirtual < init > (arg1,· · · , argi)

where∀j∈ [1..i] :xj=JVMprefix(aj)idxj=freshIdx(newobj T(a1,· · · , ai),j)

TJldfld f,s:: to,l,a,wK = getfield f

TJstfld f,s:: to:: tv,l,a,wK = putfield f

TJbgt k,s:: t1:: t2,l,a,wK = if_icmpgt kwhere k=statementIdx(getBody(bgt k)(k))if t1=t2=Int

TJldloca i,s,l,a,wK = TJnewobj WrapRef() ; dup2 ; stloc j; ldloc i; stfld value,s,l,a,wK where j=freshIdx(ldloca i,0)

TJstind,s,l,a,wK = TJstfld value,s,l,a,wK TJldind,s,l,a,wK = TJldfld value,s,l,a,wK

Fig. 7. Translation of CIL statements into JB.

4.2. Statements

Fig. 7 formalizesthe translation

T

Jst

CIL

, K

K

= st

JB of a single CIL statement into a sequence ofJB statements.The

componentswithalinearaccent(e.g.,s)denotesstaticinformationaboutcomputationalvalues.Inparticular,ourtranslation relieson statictype informationaboutlocals (denotedby l),arguments(a)andstackelements (s),computedat

st

CIL by

a standard algorithm [9]. Inparticular,types andheight ofthestack arefixed andknownforeach bytecode.Inaddition, thefourthcomponentwisastackofelementin

∪ (N × N)

that,foreachelementinthestack,tells(i)

ifitisnot a directreference,or(ii)

(

i

,

j

)

wherei istheindexofthelocalvariablespointedbythedirectreference,7 and j istheindex of thelocal instrumentationvariable containing apointer to thewrapper simulatingthereference. Instrumentation local variablesareappendedattheendofthearrayofJBlocalvariables(thatis,aftertheargumentsandthelocalvariablesused intheprogram).TheyareneededinordertocachesomevaluesthatarelaterneededtorepresentthesemanticsoftheCIL programinitsJBtranslation.

Few CILstatements (namely,

ldfld

and

stfld

) have a one-to-one translationinto a JB statement (

getfield

and

putfield

).Thestatementsreadingandwritinglocalvariablesandarguments(

ldarg

,

ldloc

,and

stloc

)aretranslated intotheir JBcounterpart(x

load

,x

store

,respectively),takingintoaccountthetype ofthevalueatthetopofthestack, andadjustingtheindexofthevariable,takingintoaccountargumentsand64bitvariables.SomeCILstatements(

dup

)get translatedintodifferentJBstatementsonthebasisofcontextualinformationsuchasthetypeofvaluesintheoperandstack (

dup

and

dup2

).Other CILstatementscanbetranslatedonlyifthetypeofthevaluesintheoperandstackisnumeric:(i)

add

can betranslatedinto

ladd

and

iadd

,and(ii)

bgt

to

is_icmpgt

;iftheyare appliedtoreferences(asingeneric CILcode),thenthecodeisconsideredunsafeandisnottranslated.

call

requiresto (i)translate themethod callto thecorresponding staticordynamic invocationstatement inJB,and (ii)to propagatethesideeffectsondirectpointerspassed tothemethodas

out

/

ref

parameterstothelocalvariablesof the callee. The translationof

newobj

is trickybecause ofthe differentpatterns usedin CILandJB forobject creation.8 While CIL createsandinitializes the object (i.e.,calls its constructor)witha single instruction, JB splits theseoperations andrequiresthenewlycreatedobjecttooccurbelowtheargumentsonthestack,beforecallingtheconstructor.Hence,the translationreliesonafunctionfreshIdx tostoreandloadthevaluesoftheconstructorargumentsthroughinstrumentation localvariables.Inparticular,givenaCILmethodm, thenumberandtypesofargumentsandlocalvariablesofthemethod are known(SectionII.15.4of [9]).Therefore,functioncnvrtLoc tellswhichlocalvariablesthe translatedJBmethodalready uses.Then,foreachargumentofeach

newobj

statementinm, itispossibletoallocateafreshlocalvariabletostoreand loaditsvalue.Inthisway,thetranslationallocatesanewobjectandputsitsaddressbelowtheconstructorarguments.

7 SincethelanguageweintroducedinFig.1supportsonlyldlocatogetadirectpointer,weneedtotrackonlythisinformationintheformalization. 8 For sakeofsimplicity,weassumethe constructordoes nothaveout/refparameters.Intheimplementation,they aretreatedas forthecall

(11)

Instructionsdealingwithdirectpointers(namely,

ldloca

,

stind

,and

ldind

inourminimallanguage)aretranslated through equivalent CILinstructions dealing withwrapperobjects (and their field

value

). Therefore,

stind

and

ldind

aresimplytranslatedthroughequivalent writeandreadoffield

value

,respectively.

ldloca

insteadrequirestoallocate a wrapper object

newobj

,stores a reference to the wrapper (

stloc

) inan instrumentation variable obtainedthrough freshIdx,storesthevaluepointedbythedirectreferenceinthelocalvariabletoitsfield

value

(

ldloc

and

stfld

),and leavesareferencetothewrapperinthestack(

dup

).

Each CIL statement is translated into one or more JB statements, hence offsets are not preserved. Thus, function statementIdx

:

St

→ N

yields the JB offset of the first statement in the translation of the given CIL statement. In addi-tion,sincedirectreferencesarereplacedbywrapperobjects,whenamethodparameterhasadirectreferencetype

&T

(and thishappenswhenitisa

ref

or

out

parameterinsafeC#),thisisreplacedbyawrapperobjectWRef.

4.2.1. Runningexample

Considerthe runningexampleinFig.2.MostCILstatementsaretranslatedintoa singleJBstatement (e.g.,lines 18-20 and 22-23ofFig.2baretranslatedintolines 23-25and 27-28ofFig.2c),withthenoticeableexceptionoftheCIL

newobj

statements at lines 3 and 11, translated into lines 2-4 and 12-16, respectively. The former passes no argument to the constructor;thelatter(thatinstantiatesa

Wrap

)callsaconstructorwithanargument,hencerequiringaninstrumentation variableatindex 3.

4.2.2. Directreferences

AssketchedinSection2.2.1,wemodelthesemanticsofpointersinsafeC#codethroughwrapperobjects.Inparticular,

ldind

(line 5ofFig.3c)istranslatedintothefieldaccess

i.value

(rightsideoftheassignmentatline 2ofFig.3b),while

stind

(line 8 ofFig.3c)istranslatedintotheassignment of

i.value

(leftsideofthe assignmentatline 2ofFig.3b). In addition,

ldloca

(line 15of Fig.3c) leads tothe construction andassignment ofa wrapperobject(lines 7 and 8 of Fig.3b),whileafterthemethodcallthevaluecontainedinthewrapperobjectiswrittenintothelocalvariable(line 10of Fig.3b).

4.3. Correctness

ThissectionprovesthatthetranslationfromCILtoJBstatementsiscorrect.Namely,givenaconcreteCILstate

σ

CIL and

applyingtheoperationalsemanticsforastatement

st

,oneobtainsastatethat,whentranslatedintoJB,isexactlythestate resultingfromthetranslationof

σ

CIL intoJBandtheapplicationoftheJBsemanticstoit:

∀st ∈

StCIL

,

σ

CIL

∈ 

CIL

: st,

σ

CIL

CIL

σ

CIL and

TJst, KK, T

σ

J

σ

CIL

K →

JB

σ

JB

T

σ

J

σ

CIL

K =

σ

JB

where

σ

1

=

σ

2 meansthatthetwostatesareequaluptoinstrumentationvariablesintroducedbythetranslationprocess.

Formally,let

σ

1

= (s

1

, l

11

:: . . . :: l

n

1

, h

1

)

and

σ

2

= (s

2

, l

12

:: . . . :: l

n

2

,

:: l

n+12

:: . . .:: l

n+k2

, h

2

)

,then

σ

1

=

σ

2 iff

s

1

= s

2,and

∀i ≤ n : l

i

1

= l

i

2, and

h

1

= h

2. Note that instrumentation variables are present only in the JB state, hence in the right

hand-sideoftheequality. 4.3.1. Formalproofofsoundness

Lemma2.Thetranslationof

ldloc

iscorrect.

Proof. Let usconsider the caseof integer arguments(the other casecan be treated analogously).Let usprove that the translationof

ldloc

forintegervaluesiscorrect,i.e.,that

σ

CIL

∈ 

CIL,if

ldloc i,

σ

CIL

CIL

σ

CIL and

TJldloc i,

(

s

,

l

,

a

,

w

)

K

,

T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB .Let

σ

CIL

= (

s

,

l

,

a

,

h

)

bearbitrary.Bythe

ldloc

CILrulewehave

σ

CIL

=

(s

:: l[i], l, a, h)

.Byrule (3)ofFig.7,

T

Jldloc i,

s

,

l

,

a

,

w

K

= iload

j where j

= |

a

|

+

64a|a|

+

i

+

64i

l.By definitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s, l, a, h)K = (s



,cnvrtLoc(l, a), h



).

Bythe

iload

JBrule:

iload j, (s



,cnvrtLoc(l, a), h



)

JB

(s



::

cnvrtLoc(l, a)

[j],

cnvrtLoc(l, a), h

)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s :: l[i], l, a, h)K = (s :: l[i],

cnvrtLoc(l, a), h).

(12)

Lemma3.Thetranslationof

ldarg

iscorrect.

Proof. Let us consider thecase ofinteger arguments (the other case can be treatedanalogously). Letus provethat the translationof

ldarg

forintegervaluesiscorrect,i.e.,that

σ

CIL

∈ 

CIL,if

ldarg i,

σ

CIL

CIL

σ

CIL and

TJldarg i,

(

s

,

l

,

a

,

w

)

K, T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB .Let

σ

CIL

= (

s

,

l

,

a

,

h

)

bearbitrary.Bythe

ldarg

CILrulewehave

σ

CIL

=

(s

:: a[i], l, a, h)

.By rule (5)of Fig. 7,

T

Jldarg i,

s

,

l

,

a

,

w

K

= iload

j where j

=

i

+

64a|a|

+

i

+

64i

l. Bydefinition of

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s, l, a, h)K = (s



,

cnvrtLoc(l, a), h

).

Bythe

iload

JBrule:

iload j, (s



,

cnvrtLoc(l, a), h

)

JB

(s



::

cnvrtLoc(l, a)

[j],

cnvrtLoc(l, a), h

)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s :: l[i], l, a, h)K = (s :: l[i],

cnvrtLoc(l, a), h).

BydefinitionofcnvrtLoc wehavecnvrtLoc

(l,

a

)

[j] = l[i]

,whichimplies

T

σ

J

σ

CIL

K

=

σ

JB ,andthus

T

σ

J

σ

CIL

K

=

σ

JB .

2

Lemma4.Thetranslationof

stloc

iscorrect.

Proof. Let us consider thecase ofinteger arguments (the other case can be treatedanalogously). Letus provethat the translationof

stloc

forintegervaluesiscorrect,i.e.,that

σ

CIL

∈ 

CIL,if

stloc i,

σ

CIL

CIL

σ

CIL and

TJstloc i,

(

s

,

l

,

a

,

w

)

K

,

T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB . Let

σ

CIL

= (

s

::

v

,

l

,

a

,

h

)

be an arbitrary state where the stack has

an integer value v at the top. By the

stloc

CIL rule we have

σ

CIL

= (s, l[i → v], a, h)

. By rule (4) of Fig. 7,

T

Jstloc i,

s

:: t,

l

,

a

,

w

K

= istore

j where j

= |

a

|

+

64|aa|

+

i

+

64i

l.Bydefinitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s :: v, l, a, h)K = (s



::

v,cnvrtLoc(l, a), h

)

Bythe

istore

JBrule:

istore j, (s



::

v,cnvrtLoc(l, a), h

)

JB

(s



,

cnvrtLoc(l, a)

[j → v], h



)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s, l[i → v], a, h)K = (s,

cnvrtLoc(l

[i → v], a), h



)

By definition ofcnvrtLoc we have that cnvrtLoc

(l

[i → v],

a

)

=

cnvrtLoc

(l,

a

)

[j → v]

which implies

T

σ

J

σ

CIL

K

=

σ

JB ,and

thus

T

σ

J

σ

CIL

K

=

σ

JB .

2

Lemma5.Thetranslationof

add

iscorrect.

Proof. Let us consider thecase ofinteger arguments (the other case can be treatedanalogously). Letus provethat the translation of

add

for integer values is correct, i.e., that

σ

CIL

∈ 

CIL, if

add,

σ

CIL

CIL

σ

CIL and

TJadd, (

s

,

l

,

a

,

w

)

K,

T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB . Let

σ

CIL

= (

s

::

v1

::

v2

,

l

,

a

,

h

)

be an arbitrary state where the stack hastwo

integer values v1 and v2 atthe top. By the

add

CILrule we have

σ

CIL

= (s :: (v

1

+ v

2

), l, a, h)

. By rule(2) ofFig. 7,

T

Jstloc i,

s

:: t

1

:: t

2

,

l

,

a

,

w

K

= iadd

j.Bydefinitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s :: v

1

:: v

2

, l, a, h)

K = (s



::

v1

::

v2

,cnvrtLoc(l, a), h



)

Bythe

iadd

JBrule:

iadd j, (s



::

v1

::

v2

,

cnvrtLoc(l, a), h

)

JB

(s



:: (v

1

+ v

2

),

cnvrtLoc(l, a), h

)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s :: (v

1

+ v

2

), l, a, h)

K = (s



:: (v

1

+ v

2

),

cnvrtLoc(l, a), h

)

whichimplies

T

σ

J

σ

CIL

K

=

σ

JB ,andthus

T

σ

J

σ

CIL

K

=

σ

JB .

2

(13)

Proof. Let usconsider the caseof integer arguments(the other casecan be treated analogously).Let usprove that the translation of

dup

for integer values is correct, i.e., that

σ

CIL

∈ 

CIL, if

dup,

σ

CIL

CIL

σ

CIL and

TJadd, (

s

,

l

,

a

,

w

)

K,

T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB .Let

σ

CIL

= (

s

::

v

,

l

,

a

,

h

)

beanarbitrarystatewherethestackhasanintegervalue

v atthetop.Bythe

dup

CILrulewehave

σ

CIL

= (s :: v :: v, l, a, h)

.Byrule(1)ofFig.7,

T

Jdup,

s

:: t,

l

,

a

,

w

K

= dup

j. Bydefinitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s :: v, l, a, h)K = (s



::

v,cnvrtLoc(l, a), h

).

Bythe

dup

JBrule:

dup, (s



::

v,cnvrtLoc(l, a), h

)

JB

(s



:: v :: v,

cnvrtLoc(l, a), h

)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s :: v, l, a, h)K = (s



:: v,

cnvrtLoc(l, a), h

)

whichimplies

T

σ

J

σ

CIL

K

=

σ

JB ,andthus

T

σ

J

σ

CIL

K

=

σ

JB .

2

Lemma7.Thetranslationof

ldfld

iscorrect.

Proof. Letusconsiderthecaseofanumericalfield(thecaseofa referencefieldcanbetreatedanalogously).Wewantto provethat

σ

CIL

∈ 

CIL,if

ldfld f,

σ

CIL

CIL

σ

CIL and

TJldfld f, (

s

,

l

,

a

,

w

)

K, T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB .Let

σ

CIL

= (

s

::

o

,

l

,

a

,

h

)

beanarbitrarywherethestackcontainsareferenceo at thetop.Bythe

ldfld

CILrulewe

have

σ

CIL

= (s :: h(o)(f), l, a, h)

.Byrule(8)ofFig.7,

T

Jldfld f,

s

:: t

o

,

l

,

a

,

w

K

= getfield f

.Bydefinitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s :: o, l, a, h)K = (s



:: r

o

,cnvrtLoc(l, a), h



)

wherero isdefinedasareferencetoawrapperJBobjectrepresentingtheCILobject.Bythe

getfield

JBrule:

getfield f, (s



:: r

o

,

cnvrtLoc(l, a), h

)

JB

(s



:: h



(r

o

)(f),

cnvrtLoc(l, a), h

)

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s :: h(o)(f), l, a, h)K = (s



:: h(r

o

)(f),

cnvrtLoc(l, a), h

)

whichimplies

T

σ

J

σ

CIL

K

=

σ

JB ,andthus

T

σ

J

σ

CIL

K

=

σ

JB .

2

Lemma8.Thetranslationof

stfld

iscorrect.

Proof. Letusconsiderthecaseofanumericalfield(thecaseofa referencefieldcanbetreatedanalogously).Wewantto provethat

σ

CIL

∈ 

CIL,if

stfld f,

σ

CIL

CIL

σ

CIL and

TJstfld f, (

s

,

l

,

a

,

w

)

K, T

σ

J

σ

CIL

K

JB

σ

JB then

T

σ

J

σ

CIL

K

=

σ

JB . Let

σ

CIL

= (

s

::

o

::

v

,

l

,

a

,

h

)

be an arbitrarywhere thestack contains areferenceo and avalue v at the top.Bythe

stfld

CIL rule we have

σ

CIL

= (s, l, a, h[o → h(o)[f → v]])

. By rule (9) of Fig. 7,

T

Jstfld f,

s

:: t

o

:: t

v

,

l

,

a

,

w

K

=

putfield f

.Bydefinitionof

T

σ

J

K

:

T

σ

J

σ

CIL

K = T

σ

J(s :: o :: v, l, a, h)K = (s



:: r

o

:: v,

cnvrtLoc(l, a), h

)

wherero isdefinedasareferencetoawrapperJBobjectrepresentingtheCILobject.Bythe

putfield

JBrule:

putfield f, (s



:: r

o

:: v,

cnvrtLoc(l, a), h

)

JB

(s



:: h



(r

o

)(f),

cnvrtLoc(l, a), h

[r

o

→ h



(r

o

)

[f → v]])

=

σ

JB

.

Bydefinitionof

T

σ

J

K

wehave

T

σ

J

σ

CIL

K = T

σ

J(s, l, a, h[o → h(o)[f → v]])K

= (s



,cnvrtLoc(l, a), h



[r

o

→ h(o)[f → v]])

= (s



,cnvrtLoc(l, a), h



[r

o

→ h



(r

o

)

[f → v]])

whichimplies

T

σ

J

σ

CIL

K

=

σ

JB ,andthus

T

σ

J

σ

CIL

K

=

σ

JB .

2

Riferimenti

Documenti correlati

La rivista Fortune ogni anno intervista 10mila dirigenti chiedendo loro di classificare le imprese rispetto a nove attributi: qualità del management, qualità dei prodotti

In ogni caso, al di là della eventuale consistenza di una frequentazione fra VII e X secolo, la nuova ed intensa stagione di rioccupazione della pianura del Tavoliere, a partire

plant communities in the area (Phragmites australis and Myriophyllum aquaticum community)

Figura 3.36: Grafico riassuntivo dei risultati delle prove di rigidezza per provini maturati 7 gg a 40°C in forno autoventilato+ 4 ore a 20°C all’aria (28 gg) e con parametri di

White matter and grey matter lesion probability maps in African American and Caucasian MS patients, overlaid on the MNI standard brain template, showing the probability of each

Singularity near the tip of the interfacial crack dynamically growing along the stiff interface was investigated by Mishuris, Movchan, and Movchan 2006b and Mishuris, Movchan,

La sostenibilità ambientale attiene alla necessità di una migliore utilizzazione delle risorse ambientali attualmente disponibili, affinché anche le generazioni future