• Non ci sono risultati.

Search-Based Software

N/A
N/A
Protected

Academic year: 2021

Condividi "Search-Based Software"

Copied!
90
0
0

Testo completo

(1)

Annibale Panichella

Advisor:

Prof. Andrea De Lucia Dott. Rocco Oliveto

Search-Based Software

Maintenance and Evolution

(2)

Search-Based Software Engineering

1. Problem Reformulation: reformulating typical SE problems as optimization problems

2. Fitness Function: definition of functions to optimize

3. Optimization Algorithms: applying search algorithm to solve such functions

- Genetic Algorithms - Hill climbing

- Simulated Annealing - Random Search

- Tabu Search

- Particle Swarm Optimization - …

«The application of meta-heuristic search-based optimization

techniques to find near-optimal solutions in software engineering

problems.»

(3)

Why SBSE?

Large Search Space Presence of conflicting goals

(4)

Optimization Problem

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Local Optimum

Global Optimum

Minimize

(5)

Genetic Algorithms (GAs)

Initial PopulationInitial Population min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

(6)

Genetic Algorithms (GAs)

Initial PopulationInitial Population min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

𝐴 + 𝐵 2

Crossover Crossover

(7)

Genetic Algorithms (GAs)

Initial PopulationInitial Population min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Crossover Crossover Mutation Mutation

(8)

Genetic Algorithms (GAs)

Initial PopulationInitial Population min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Crossover Crossover Mutation Mutation Selection Selection

(9)

Genetic Algorithms (GAs)

Initial PopulationInitial Population min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(10)

Genetic Algorithms (GAs)

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Initial PopulationInitial Population

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(11)

Genetic Algorithms (GAs)

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Initial PopulationInitial Population

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(12)

Genetic Algorithms (GAs)

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Initial PopulationInitial Population

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(13)

Genetic Algorithms (GAs)

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Initial PopulationInitial Population

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(14)

Genetic Algorithms (GAs)

min 𝑓(𝑥) = sin 𝑥 − 1 8 + 1

Initial PopulationInitial Population

Crossover Crossover Mutation Mutation Selection Selection No

No

Yes Yes End?

End?

(15)

Main Contributions

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(16)

Main Contributions

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(17)

Program Comprehension

public class LoadConfiguration extends AbstractHandler { IWorkbench wb = PlatformUI.getWorkbench();

IWorkbenchWindow window = wb.getActiveWorkbenchWindow();

public LoadConfiguration() { }

@Override

public Object execute(ExecutionEvent event) throws ExecutionException { IWorkbench wb = PlatformUI.getWorkbench();

IWorkbenchWindow window = wb.getActiveWorkbenchWindow();

IWorkbenchPage page = window.getActivePage();

IEditorPart editor = page.getActiveEditor();

//reading the instantiation variable in SCM ResourceSet resourceSet = new ResourceSetImpl();

IFile file1;

try{

file1 = (IFile) editor.getEditorInput().getAdapter(IFile.class);

}catch (Exception exc){

printError("Please select a State Chart Model", window);

return null;

}

SCMDiagram scd = null;

Resource scdResource = resourceSet.createResource();

try {

scdResource.load(null);

scd = (SCMDiagram) scdResource.getContents().get(0);

} catch (IOException e) {

printError("Corrupted State Chart Modell file", window);

scdResource = null;

public class LoadConfiguration extends AbstractHandler { IWorkbench wb = PlatformUI.getWorkbench();

IWorkbenchWindow window = wb.getActiveWorkbenchWindow();

public LoadConfiguration() { }

@Override

public Object execute(ExecutionEvent event) throws ExecutionException { IWorkbench wb = PlatformUI.getWorkbench();

IWorkbenchWindow window = wb.getActiveWorkbenchWindow();

IWorkbenchPage page = window.getActivePage();

IEditorPart editor = page.getActiveEditor();

//reading the instantiation variable in SCM ResourceSet resourceSet = new ResourceSetImpl();

IFile file1;

try{

file1 = (IFile) editor.getEditorInput().getAdapter(IFile.class);

}catch (Exception exc){

printError("Please select a State Chart Model", window);

return null;

}

SCMDiagram scd = null;

Resource scdResource = resourceSet.createResource();

try {

scdResource.load(null);

scd = (SCMDiagram) scdResource.getContents().get(0);

} catch (IOException e) {

printError("Corrupted State Chart Modell file", window);

scdResource = null;

“Software that is not comprehended cannot be changed”- Rajlich and Wilde - ICPC 2002

“Software that is not comprehended cannot be changed”- Rajlich and Wilde - ICPC 2002

40% to 60% of the maintenance effort is devoted to understanding the software to be modified - Dorfman and Thayer – IEEE Software Engineering 1996

40% to 60% of the maintenance effort is devoted to understanding the software to be modified - Dorfman and Thayer – IEEE Software Engineering 1996

(18)

Information Retrieval

Software Artefacts

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Artefacts Indexing

Similarity Computation

Distance Function Distance

Function IR ModelIR Model Software Maintenance task:

- Traceability Recovery - Source code labeling - Bug duplication - Feature Location

-

Software Maintenance task:

- Traceability Recovery - Source code labeling - Bug duplication - Feature Location

-

(19)

Information Retrieval

Special Chars.

Digits

White spaces

Stop-word function Java stop-word list English stop-word list Italian stop-word list

No Stemmer Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer

Software Artefacts

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Boolean tf tf-idf Log(tf+1) Entropy

Distance Function Distance

Function IR ModelIR Model

Vector Space Model Latent Semantic Indexing Latent Dirichlet Allocation Jensen-Shannon Divergence

Cosine Similarity

Jaccard Distance Euclidean Distance

Jensen-Shannon Divergence

Software Maintenance task:

- Traceability Recovery - Source code labeling - Bug duplication - Feature Location

-

Software Maintenance task:

- Traceability Recovery - Source code labeling - Bug duplication - Feature Location

-

(20)

What is the right IR process?

It is not possible to build a set of guidelines for assembling IR-based solutions for a

given data set

It is not possible to build a set of guidelines for assembling IR-based solutions for a

given data set

Different dataset require different IR parameters

Different dataset require different IR parameters

If not well calibrated, IR techniques perform worst than simple heuristics. A. De Lucia, M. Di Penta, R.

Oliveto, A. Panichella, and S. Panichella –Empirical Software Engineering

If not well calibrated, IR techniques perform worst than simple heuristics. A. De Lucia, M. Di Penta, R.

Oliveto, A. Panichella, and S. Panichella –Empirical Software Engineering

(21)

Predicting the performances?

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Special Chars.

Digits White space

Stop-word function Java stop-word list English stop-word list Italian stop-word list No Stemmer

Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer Boolean

tf tf-idf Log(tf+1) Entropy

Distance Function Distance Function IR Model IR Model

Cosine Similarity Hellinger Distance

LSI (k)

LDA (alpha, beta, n, k)

(22)

Predicting the performances?

Term 1 Term 2

Doc 1

Doc 2

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Special Chars.

Digits White space

Stop-word function Java stop-word list English stop-word list Italian stop-word list No Stemmer

Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer Boolean

tf tf-idf Log(tf+1) Entropy

Distance Function Distance Function IR Model IR Model

Cosine Similarity Hellinger Distance

LSI (k=3)

LDA (alpha, beta, n, k)

(23)

Predicting the performances?

Term 1 Term 2

Doc 1

Doc 2

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Special Chars.

Digits White space

Stop-word function Java stop-word list English stop-word list Italian stop-word list No Stemmer

Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer Boolean

tf tf-idf Log(tf+1) Entropy

Distance Function Distance Function IR Model IR Model

Cosine Similarity Hellinger Distance

LSI (k=4)

LDA (alpha, beta, n, k)

(24)

Predicting the performances?

Term 1 Term 2

Doc 1

Doc 2

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Special Chars.

Digits White space

Stop-word function Java stop-word list English stop-word list Italian stop-word list No Stemmer

Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer Boolean

tf tf-idf Log(tf+1) Entropy

Distance Function Distance Function IR Model IR Model

Cosine Similarity Hellinger Distance

LSI (k=4)

LDA (alpha, beta, n, k)

(25)

Predicting the performances?

Term 1 Term 2

Doc 1

Doc 2

Term Extraction

Term Extraction

Stop Word Removal Stop Word

Removal

Morphological Analysis Morphological

Analysis

Term Weighting

Term Weighting

Special Chars.

Digits White space

Stop-word function Java stop-word list English stop-word list Italian stop-word list No Stemmer

Porter Stemmer

English Snowball Stemmer Italian Snowball Stemmer Boolean

tf tf-idf Log(tf+1) Entropy

Distance Function Distance Function IR Model IR Model

Cosine Similarity Hellinger Distance

LSI (k=4)

LDA (alpha, beta, n, k)

(26)

Search-Based Solution (LSI-GA)

1) Problem Reformulation: Finding the IR process which maximize the quality of clusters

2) Solution Encoding

3) Fitness Function: Silhouette Coefficient

4) Solver: Genetic Algorithms X =

F(X)=Silhouette Coefficient (X) =1

𝑛 𝑖=1𝑛 𝑠𝑒𝑝𝑎𝑟𝑎𝑡𝑖𝑜𝑛 𝑑𝑖 −𝑐𝑜ℎ𝑒𝑠𝑖𝑜𝑛 𝑑𝑖 max 𝑠𝑒𝑝𝑎𝑟𝑎𝑡𝑖𝑜𝑛 𝑑𝑖 , 𝑐𝑜ℎ𝑒𝑠𝑖𝑜𝑛 𝑑𝑖

(27)

Empirical Evaluation

System KLOC Files Methods Features

jEdit 104 503 6,413 159

JabReg 74 579 4,607 39

2) Feature Location

3) Bug Report Duplication 1) Traceability Recovery

System N. Bug Rep. N. Duplication

Eclipse 225 44

(28)

Empirical Evaluation

1) Traceability Recovery 1) Traceability Recovery

(29)

Empirical Evaluation

1) Traceability Recovery 1) Traceability Recovery

Experimented techniques:

1. LSI-GA

2. Previously published IR process 3. Ideal IR process

Performance metrics:

Average precision

(30)

Results

0 10 20 30 40 50 60 70 80 90

EASYCLINIC ETOUR ITRUST

Average Precision

Systems

Comparison

Reference LSI-GA Ideal

LSI-GA outperforms baseline (p-value < 0.05)

(31)

Configuring LDA using GAs

A. Panichella, B. Dit, R. Oliveto, M. Di Penta, D.

Poshyvanyk, A. De Lucia

How to Effectively Use Topic Models for Software Engineering Tasks? An Approach based on Genetic Algorithms. ICSE 2013

A. Panichella, B. Dit, R. Oliveto, M. Di Penta, D.

Poshyvanyk, A. De Lucia

How to Effectively Use Topic Models for Software Engineering Tasks? An Approach based on Genetic Algorithms. ICSE 2013

(32)

Other works on P.C.

A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella Labeling Source Code with Information Retrieval Methods: An Empirical Study. Journal EMSE 2013

A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella Applying a Smoothing Filter to Improve IR- based Traceability Recovery Processes: An Empirical Investigation.Information and Software Technology (2012).

G. Capobianco, A. De Lucia, R. Oliveto, A. Panichella, S. Panichella Improving IR-based Traceability Recovery via Noun-based Indexing of Software Artifacts.Journal of Software: Evolution and Process(2012).

B. Dit, A. Panichella, E. Moritz, R. Oliveto, M. Di Penta, D. Poshyvanyk, A. De Lucia Configuring Topic Models for Software Engineering Tasks in TraceLab. TEFSE 2013

G. Bavota, A. De Lucia, R. Oliveto, A. Panichella, F. Ricci, G. Tortora The Role of Artefact Corpus in LSI-based Traceability Recovery. TEFSE 2013

A. Panichella, C. McMillan, E. Moritz, D. Palmieri, R. Oliveto, D. Poshyvanyk, A. De Lucia When and How Using Structural Information to Improve IR-Based Traceability Recovery. CSMR 2013

G. Bavota, L. Colangelo, A. De Lucia, S. Fusco, R. Oliveto and A. Panichella. TraceME: Traceability Management in Eclipse. ICSM 2013

(33)

Other works on P.C.

A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella Using IR Methods for Labeling Source Code Artifacts: Is It Worthwhile?ICPC 2012

A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella Improving IR-based Traceability Recovery Using Smoothing Filters.ICPC 2011. Best Paper Award

G. Capobianco, A. De Lucia, R. Oliveto, A. Panichella, S. Panichella Traceability recovery using numerical analysis.WCRE 2009.

G. Capobianco, A. De Lucia, R. Oliveto, A. Panichella, S. Panichella On the Role of the Nouns in IR-based Traceability Link Recovery.ICPC 2009.

G. Bavota, L. Colangelo, A. De Lucia, S. Fusco, R. Oliveto and A. Panichella. Enhancing Traceability Management in Eclipse via Information Retrieval and User Feedback Analysis. ECLIPSE

(34)

Main Contributions

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(35)

Bugs are everywhere…

(36)

Practical Constraints

Sofwtare

Quality Money

Time

(37)

Defect Prediction

Spent more resources on components most likely to fail

Spent more resources on

components most likely to fail

(38)

Defect Prediction Methodology

Predicting Model Predicting

Model

Predictors Predictors

Defect Prone Class1 YES Class2 YES Class3 NO

YES

ClassN

Past Projects Past Projects

New Project New Project

Past Defects

(39)

Defect Prediction Methodology

Predicting Model Predicting

Model

Predictors Predictors

Defect Prone Class1 YES Class2 YES Class3 NO

YES

ClassN

Past Projects Past Projects

New Project New Project

Past Defects

(40)

Multi-objective

Defect Prediction

(41)

Multi-objective Reformulation

1) Problem Reformulation: Finding the logistic regression coefficients (a,b,c,…) that optimize cost and effectiveness

2) Objectives Function:

3) Solver: Multi-objective Genetic Algorithms (NSGA-II)



 

i

i i

i i

i

Bug Pred

call

LOC Pred

Cost Re

min max

.

1

e

Logit e

 

a + b mi1+c mi2+ … a + b mi1+c mi2+ …

(42)

Multi-objective Genetic Algorithm

Pareto Optimality: all solutions that are not dominated by any other solutions form the Pareto optimal set Pareto Optimality: all solutions that are not dominated by any other solutions form the Pareto optimal set Multiple otpimal solutions (models)

can be found

Multiple otpimal solutions (models) can be found

Cost

Recall/Effectiveness

Multiple objectives are optimized using Pareto efficient approaches

Multiple objectives are optimized using Pareto efficient approaches

(43)

Empirical Evaluation

Context: Experimented Algorithms:

1. Multi-objective cross-project Logistic Regression

2. Traditional cross-project Logistic Regression

3. Traditional within-project Logistic Regression

4. Clustering (local) cross-project defect prediction

Performance metrics:

Cost = # LOC to analyze

Effectiveness/Recall = % defect-prone classes identified

(44)

Results

Log4j

jEdit

(45)

Multi-Objective Defect Prediction

G. Canfora, A. De Lucia, M. Di Penta, R.

Oliveto, A. Panichella, S. Panichella

Multi-Objective Cross-Project Defect Prediction. ICST 21013

G. Canfora, A. De Lucia, M. Di Penta, R.

Oliveto, A. Panichella, S. Panichella

Defect Prediction as Multi-Objective Optimization Problem.

Submitted as Special Issue on Journal STVR

(46)

Main Contributions

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(47)

Diversity is essential to the genetic algorithm because it enables the algorithm to search a larger region of the space.

GAs in Software Testing

(48)

Diversity is essential to the genetic algorithm because it enables the algorithm to search a larger region of the space.

GAs in Software Testing

Low Diversity High Diversity

(49)

Diversity is essential to the genetic algorithm because it enables the algorithm to search a larger region of the space.

Population drift

GAs in Software Testing

Low Diversity High Diversity

(50)

public class Triangle {

public String check (double a, double b, double c){

if(a == b) {

if(a == c)

return ‘equilater’;

else

return ‘isoscele’;

} else

{

if(a == c || b == c)

return ‘isoscele’;

else

return ‘scalene’;

} }

} 1.

2.

3.

4.

5.

6.

7.

Triangle Program

(51)

Search-based approach

public class Triangle {

public String check (double a, double b, double c){

if(a == b) {

if(a == c)

return ‘equilater’;

else

return ‘isoscele’;

} else

{

if(a == b || a == c || b == c) return ‘isoscele’;

else

return ‘scalene’;

} }

} 1.

2.

3.

4.

5.

6.

7.

(52)

Branch distance

(a == b) -> abs(a - b) (a == c) -> abs(a - c)

min f(a,b,c) = 2 * abs(a - b) + + abs(a - c)

public class Triangle {

public String check (double a, double b, double c){

if(a == b) {

if(a == c)

return ‘equilater’;

else

return ‘isoscele’;

} else

{

if(a == b || a == c || b == c) return ‘isoscele’;

else

return ‘scalene’;

} }

} 1.

2.

3.

4.

5.

6.

7.

Search-based approach

(53)

Branch distance

(a == b) -> abs(a - b) (a == c) -> abs(a - c)

min f(a,b,c) = 2 * abs(a - b) + + abs(a - c)

public class Triangle {

public String check (double a, double b, double c){

if(a == b) {

if(a == c)

return ‘equilater’;

else

return ‘isoscele’;

} else

{

if(a == b || a == c || b == c) return ‘isoscele’;

else

return ‘scalene’;

} }

} 1.

2.

3.

4.

5.

6.

7.

Search-based approach

Test Case 4

Triangle t= new Triangle();

String s=t.check(2,2,2)

(54)

c=2 a, b ϵ [-1;4]

a

b

Branch Distance

1) Flat seach space

2) Several Local optimal 3) Only one global

optimum

Triangle Program

(55)

a, b ϵ [-30;30]

c=2

a b

Branch Distance

GAs Simulation

Mutation Rate = 0.10 Population = 50

Crossover = single-point Premature convergence

(genetic drift)

Premature convergence (genetic drift)

(56)

Injecting Diversity

during the Evolution

(57)

What is the evolution direction?

P(t) = Population at generation t

(58)

What is the evolution direction?

P(t) = Population at generation t

P(t+k) = Population after k generations

(59)

What is the evolution direction?

Evolution Directions P(t) = Population at generation t

P(t+k) = Population after k generations

(60)

Why?

P(t) = Population at generation t

P(t+k) = Population after k generations Evolution Directions Orthogonal Individuals

(61)

How? Singual Value Decomposition

Population at generation t

t t

t

t

U V

P    

Population at generation t + k

k t k

t k

t k

t

U V

P

 

v2

v1

v1

v2

The currect evolution direction is proportional to

(62)

Using SVD for Evolution Direction

𝑈𝑡+𝑘 ∙ Σ𝑡+𝑘 + Σ ∙ 𝑉𝑡+𝑘 + 𝑉 𝑇

𝑃𝑡+𝑘

𝑃𝑡

(63)

Using SVD for Evolution Direction

𝑈𝑡+𝑘 ∙ Σ𝑡+𝑘 + Σ ∙ 𝑉𝑡+𝑘 + 𝑉 𝑇

𝑃𝑡+𝑘

𝑃𝑡

(64)

Then, we construct a new orthogonal population as follows

Using SVD for Evolution Direction

𝑈𝑡+𝑘 ∙ Σ𝑡+𝑘 + Σ ∙ 𝑉𝑡+𝑘 + 𝑉 𝑇

𝑃𝑡+𝑘

𝑃𝑡

𝑈𝑡+𝑘 ∙ Σ𝑡+𝑘 + Σ ∙ 𝑉𝑡+𝑘 + 𝑉¬ 𝑇 Orthogonal

Direction

(65)

Integration SVD with Standard GA

Selection Selection Initialize populationInitialize population

Terminate?

Terminate?

Crossover Crossover

Mutation Mutation No

• Rank Scaling Selection

• Single-point crossover

• Uniform mutation

(66)

Select best 50%

of individuals Select best 50%

of individuals

Generate an orthogonal sub-population Generate an orthogonal

sub-population

Replace the worst 50%

of individuals with new sub-populations Replace the worst 50%

of individuals with new sub-populations

SVD + Standard GAs

Selection Selection Initialize populationInitialize population

Terminate?

Terminate?

Crossover Crossover

Mutation Mutation No

(67)

Simulation on Triangle Program

SVD-GA Standard GA

Branch Distance

Branch Distance

(68)

Empirical study

No. Name Coverage Goals

P1 ArithmeticUtils 99

P2 Arrays 75

P3 Beta 90

P4 CreditCardValidator 32

P5 Complex 126

P6 FastMath 60

P7 Fraction 108

P8 IPAddressValidator 243

P9 LUDecomposition 76

P10 KolmogorovDistribution 50 P11 QRDecomposition 72

P12 Quadratic 7

P13 RootsOfUnity 27

P14 SaddlePointExansion 16

P15 Sort 70

P16 Tomorrow 107

P17 TriangularDistribution 50

Experimented Algorithms:

1. SVD-GA 2. R-GA

3. R-SVD-GA 4. Standard GA

Performance metrics:

Effectiveness = % covered braches

Efficiency/cost = # executed statements

(69)

RQ1:

Does orthogonal exploration improve the effectiveness of evolutionary test case generation?

40 50 60 70 80 90 100

P1 P2 P4 P5 P6 P8 P10 P11

% Branch Coverage

GA R-GA SVD-GA2 R-SVD-GA

(70)

0 10 20 30 40 50 60 70 80

P10 P12 P13 P14

Cost (#Exec. Statements)

GA R-GA SVD-GA R-SVD-GA

RQ2: Does orthogonal exploration improve the efficiency of evolutionary test case generation?

0 100 200 300 400 500 600

P7 P8 P15 P16

Cost (#Exec. Statements)

GA R-GA SVD-GA R-SVD-GA

(71)

Estimating the Evolution Direction of Populations to Improve Genetic Algorithm. A. De Lucia , M.

Di Penta, R. Oliveto, A. Panichella GECCO 2012

Orthogonal exploration

Orthogonal Exploration of the Search Space in Evolutionary Test Case Generation F. M. Kifetew, A. Panichella , A. De Lucia , R. Oliveto, P. Tonella ISSTA 2013

(72)

Main Contributions

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(73)

Software Evolution

Time Software continuously changes (evolves):

• Add new functionalities

• Removing old functionalities

• Bug fixing activities

• …

(74)

Regression Testing

Software before changes Software after changes

Test Case 1 Test Case 2 Test Case 3

Test Case n

Test Case 1 Test Case 2 Test Case 3 Test Case n

… …

(75)

Regression Testing is time consuming

1000 machine-hours to execute 30,000 functional test cases

for a software product…

1000 machine-hours to execute 30,000 functional test cases

for a software product…

Mirarab, et al. The effects of time constraints on test case prioritization:

A series of controlled experiments. TSE 2010

Mirarab, et al. The effects of time constraints on test case prioritization:

A series of controlled experiments. TSE 2010

(76)

Test Suite Optimization

Code Coverage

Execution Cost minimize

maximize

(77)

Multi-Criteria Regression Testing

Multi-Objective Paradigm Multi-Objective Paradigm

Multiple otpimal solutions can be found

Multiple otpimal solutions can be found

(78)

Multi-Criteria Regression Testing

There is no clear winner

(79)

No Solutions

Multi-Criteria Regression Testing

Population Drift There is no clear winner

(80)

There is no clear winner Population Drift

No Solutions

Multi-Criteria Regression Testing

(81)

Diversity Injection in NSGA-II

• Non Dominated Sorting Algorithm

• Crowding Distance

• Tournament Selection

• Multi-points crossover

• Bit-flip mutation

Selection Selection Initialize populationInitialize population

Terminate?

Terminate?

Crossover Crossover

Mutation Mutation No

(82)

Diversity Injection in NSGA-II

Selection Selection

Terminate?

Terminate?

Crossover Crossover

Mutation Mutation No

Generate orthogonal initial population Generate orthogonal

initial population • Use orthogonal design methodology to generate well diversified initial population

(83)

SVD + NSGA-II

Select best 50%

of individuals Select best 50%

of individuals

Generate an orthogonal sub-population Generate an orthogonal

sub-population

Replace the worst 50%

of individuals with new sub-populations Replace the worst 50%

of individuals with new sub-populations Selection

Selection

Terminate?

Terminate?

Crossover Crossover

Mutation Mutation No

Generate orthogonal initial population Generate orthogonal

initial population • Use orthogonal design methodology to generate well diversified initial population

(84)

Empirical Evaluation

No. Name LOC Test Suite Size

1 bash 59,846 1,200

2 flex 10,459 567

3 grep 10,068 808

4 gzip 5,680 215

5 printtokens 726 4,130 6 printtokens2 520 4,115

7 schedule 412 2,650

8 sechedule2 374 2,710

9 sed 14,427 360

10 space 6,199 13,583

11 vim 122,169 975

Experimented Algorithms:

1. SVD-NSGA-II + Init. Pop 2. NSGA-II

3. Additional Greedy Algorithm Problems:

1. 2-objectives

Execution Cost

Code Coverage

2. 3-objectives

2-objectives + Past Faults Coverage

Software systems:

Performance metrics:

# Pareto optimal solutions

% hypervolume = % detected faults per unit time

(85)

Results

RQ1: To what extent does SVD-NSGA-II produce near optimal solutions, compared to alternative techniques?

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

0 3000000 6000000 9000000

Coverage %

flex

NSGA-II

Additional Greedy DIV-GASVD-NSGA-II

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

0 5000 10000 15000 20000

Coverage %

printtokens

Additional Greedy vNSGA2-II

DIV-GASVD-NSGA-II

(86)

0 50 100 150 200 250 300 350 400

N. of optimal solutions

Add. Greedy NSGA-II SVD-NSGA-II

Results

RQ1: To what extent does SVD-NSGA-II produce near optimal solutions, compared to alternative techniques?

(87)

RQ2: What is the cost-effectiveness of SVD-NSGA-II compared to the alternative techniques?

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

% Detected faults per unit cost

Add. Greedy NSGA-II SVD-NSGA-II

Results

(88)

Diversity in T.S. Optimization

On the role of diversity measures for multi- objective test case selection. A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella. International Workshop on Automation of Software Test (AST) 2012

On the role of diversity measures for multi- objective test case selection. A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella. International Workshop on Automation of Software Test (AST) 2012

Improving Multi-Objective Search Based Test Suite Optimization through Diversity Injection. A.

Panichella, R. Oliveto, M. Di Penta, A. De Lucia. In major revision at IEEE Transactions on Software Engineering (TSE).

Improving Multi-Objective Search Based Test Suite Optimization through Diversity Injection. A.

Panichella, R. Oliveto, M. Di Penta, A. De Lucia. In major revision at IEEE Transactions on Software Engineering (TSE).

(89)

Summary

Search-Based Program

Comprehension

Multi-Objectives Defect Prediction

Search-Based Test Data Generation

Multi-Objective Test Suite

Optimization

(90)

Thanks!

Question?

Riferimenti

Documenti correlati

ore 20.00 - 06.00 OGNI SABATO nelle vie e piazze. DIVIETO

“Biblioteche e Musei rivestоo un ruolo sociale e di aggregaziоe in quanto ambienti ideali per la crescita formativa e culturale della persоa , la scelta di allestire un Baby

La campagna è stata inoltre presentata in occasione della presentazione del dossier sui voucher promossa da Inps e Veneto Lavoro il 16 maggio scorso alla presenza del presidente

PREVEDERE H &gt;= 0,60 m CONDOTTA IDRICA!. FOGNA H&gt;=0,60m L

&#34;Nuovo Codice Della Strada&#34;; essi dovranno essere realizzati con lamiere di alluminio e pellicola retroriflettente di classe 2, pali di sostegno

[r]

PIANO URBANISTICO ATTUATIVO IN VARIANTE DI POC

Il 22 gennaio 2019, in occasione del World Economic Forum (WEF) di Davos, organizzazioni della società civile, sindacati e movimenti, coordinati in Italia dalla Campagna Stop