Modelling and Analysing Software Repository Logs by Business Intelligence Techniques


Chapter 1 Introduction

Organizational habits and prior experience are the techniques most commonly used to support decision making in modern software development companies.

Practitioners often rely on their experience, intuition and gut feeling when making important decisions. Managers allocate development and testing resources based on their experience with previous projects and/or their intuition about the complexity of the new project relative to prior ones.

Managers rarely use software repositories to study project progress.

Nevertheless, software repositories contain a wealth of valuable information about projects that could help managers understand the strong and weak points of the software development process. Software repositories could support the decision-making process rather than serve only for record keeping, so that practitioners depend less on intuition and experience and rely more on historical and field data.

Mining Software Repositories (MSR) is the field of software engineering that analyzes the data available in software repositories with the goal of extracting information.

In the literature, the main focus is on automating the software development process, helping many actors to understand and improve their work practices. Only a few works address MSR from a manager's point of view, either offering data analysis advice, such as Masticola [23], or proposing a data organization structure, such as Sillitti et al. [24].

In this study I present the results of mining software repositories in order to assess the work practices developers adopt to fix problems that occur during the project development life cycle. These results could be useful in a decision-making context, as they reveal how the software project team is organized and how it works, exposing possible weaknesses and margins for improvement in effectiveness.

This thesis approaches MSR by focusing on data warehouse techniques for gathering and organizing data. Moreover, it applies analysis techniques such as OLAP to explore the data, and data mining to uncover interesting information.

The thesis is organized as follows:

The first part discusses the state of the art, examining the background of the field and the applied theory in depth.

The second part presents the methodology adopted; in particular, it covers how the data warehouse is set up, which structures are used to query it, and how the dataset was built and structured.

The third part presents the results obtained with the described techniques; charts, tables and data models are shown and discussed.

The last part presents the conclusions and potential future work. This work was carried out at the Department of Computer Engineering of the University of Coimbra with the CISUC research group.