Computational models for liposome generation and extrusion


Università degli Studi di Pisa

Faculty of Mathematical, Physical and Natural Sciences

Master's degree programme in Molecular and Cellular Biology

Academic year 2012-2013

Computational models for liposome generation and extrusion

Supervisor

Candidate


Introduction

Liposomes

Liposomes were first observed in 1961 by the British hematologist Dr Alec D. Bangham FRS at the Babraham Institute in Cambridge [1]. While Bangham and his colleague were testing a new electron microscope by adding negative stain to dry phospholipids, they clearly observed the resemblance to the plasmalemma and obtained the first real evidence that the cell membrane is a lipid bilayer structure. The name liposome derives from two Greek words, lipo (fat) and soma (body), and liposomes can be defined as “colloidal, vesicular structures composed of one or more lipid bilayers surrounding an equal number of aqueous compartments” [2].

The liposome membrane is usually made of phospholipids, amphipathic molecules composed of a hydrophilic head and a long hydrophobic tail. Stable liposome membranes are formed by two layers of phospholipids (a bilayer), in which the hydrophilic heads are in contact with the aqueous environment and the hydrophobic tails face each other. The lipids in the plasma membrane are chiefly phospholipids such as phosphatidylethanolamine and phosphatidylcholine. Liposomes usually contain a core of aqueous solution; lipid spheres with a monolayer structure that contain no aqueous material are called micelles, although reverse micelles can be made to encompass an aqueous environment [3]. Liposomes are a powerful tool for various applications, such as drug and gene delivery [4], and are also a good laboratory model for studying the structure and function of biological membranes [5]. In addition, liposomes are of particular interest as microreactors for studying enzymatic reactions inside vesicles [6][7].

Liposomes are classified on the basis of structural parameters, method of preparation, or composition and application [8]. The classification based on structural parameters is the one that best expresses their functional characteristics.

Classification based on structural parameters

• Small unilamellar vesicles (SUV): formed by a single lipid bilayer, with size ranging from 20 to 40 nm

• Medium unilamellar vesicles (MUV): formed by a single lipid bilayer, with size ranging from 80 to 100 nm

• Large unilamellar vesicles (LUV): formed by a single lipid bilayer, with size ranging from 100 to 1,000 nm

• Oligolamellar vesicles: made up of several bilayers (typically 2-10), which surround a large internal volume

• Multilamellar vesicles (MLV): they have several bilayers that can compartmentalize the internal volume in a large number of different ways. The arrangement of the bilayers can even be nested, with concentric spherical bilayers of LUV or MLV containing several SUV

Figure 1: Different types of vesicles according to lamellarity and size

Classification based on method of liposome preparation

• REV: single or oligolamellar vesicles made by the reverse-phase evaporation method

• MLV-REV: multilamellar vesicles made by the reverse-phase evaporation method

• SPLV: stable plurilamellar vesicles

• FATMLV: frozen and thawed MLV

• VET: vesicles prepared by the extrusion technique

• DDRV: vesicles prepared by the dehydration-rehydration method

Classification based on composition and application

• Conventional Liposomes (CL): liposomes composed of neutral or negatively charged phospholipids and cholesterol

• Fusogenic Liposomes (RSVE): liposomes made from the reconstituted Sendai virus envelope

• Cationic Liposomes: liposomes made of positively charged lipids

• Long Circulatory Liposomes (LCL): liposomes that carry PEG (polyethylene glycol) derivatives on their surface, which reduce their detection by the phagocyte system and thereby extend their circulation time in the body

• Immuno-Liposomes: CL or LCL with an attached monoclonal antibody or other recognition element

The great variety of liposomes is the main reason for their vast number of uses.

Liposomes in biotechnologies

The capability of liposomes to entrap soluble molecules justifies their extensive use in biotechnology and pharmaceutics. Liposomes are useful carriers for a wide range of molecules, from small drugs to DNA. They are used to carry the entrapped molecules to a specific site, protecting the internal environment from the external one. This can also reduce drug toxicity, by decreasing the deleterious effects observed at similar concentrations or by decreasing the concentration required for maximum therapeutic activity. Liposomes can target drug delivery to a single cell, as they are able to recognize a specific cell and fuse with its membrane, releasing their content into the cytoplasm.

Generally, owing to their low cost and high productivity, in vivo gene expression systems are used to prepare proteins. However, the problems associated with using living cells for recombinant protein expression include protein degradation and aggregation, or loss of the template DNA. Furthermore, this approach requires several laborious experimental steps, including cloning the DNA into a vector, transforming cells with the DNA, and overexpressing the desired protein in the cells. Thus, there are limitations associated with using in vivo technology for protein production. Cell-free translation represents an alternative to in vivo expression, and rapid progress is being made in this field, which is gaining attention for its simplicity and high degree of controllability [9].

Liposomes and the “origin-of-life” issue

The “origin-of-life” problem is one of the most interesting problems that modern biology is called upon to face. During the last century two main theories were proposed to explain the emergence of living organisms. The Abiogenesis hypothesis describes the formation of living matter through the molecular evolution of inorganic compounds (“α – βίος” = “non-life”), whereas the Panspermia hypothesis states that life itself, or its primary precursors, is present throughout the Universe (“πᾶν” = everything, “σπέρμα” = seed) and is carried to different planets by space vectors such as meteorites, asteroids and similar objects. However, regardless of where the first organic compounds were formed, whether on Earth or in outer space, major attention must be paid to liposome organization and to the features gained through this organization. A cell is an autopoietic (i.e. self-producing) closed system: it is capable of generating its own components via a network of processes internal to its boundary [10]. Beyond autopoiesis, the other two main features that a living organism must show to comply with its definition are self-maintenance and evolvability, both of which can exist only in closed systems capable of separating volumes and reactions. There is, then, a dichotomy between membrane formation and the assembly of cellular components, which is crucial for understanding the origins of life. Living cell membranes are composed of phospholipids arranged in a bilayer, similarly to a liposome. Moreover, phospholipids can aggregate into a closed system, and this capacity for self-organization is a fundamental property for any tool used to investigate the earliest biotic structures. During aggregation and closure, liposomes entrap molecules present in the solution and, thanks to the semi-permeability of the lipid bilayer, only water or very small molecules can pass through the membrane. A detailed study of entrapment phenomena and mechanisms can lead to a better understanding of how complexity arose in living organisms. In fact, most of the processes that take place even in extremely simple forms of life involve metabolic networks. These networks use many molecules, covering a large size range, that must be present within the same boundary and be separated from other networks. Liposomes are excellent tools for investigating all these issues.

Methods of liposomes preparation

Liposome formation is generally a spontaneous process. Lipid molecules organize themselves so as to minimize the interaction between their hydrophobic region (i.e. the aliphatic tails of the fatty acids) and the aqueous environment. This is a thermodynamically favorable event, and leads to the spontaneous formation of stable closed aggregates from non-aggregated lipid molecules in an aqueous solution. The shape and structure of such aggregates depend mainly on the relative concentration of lipid molecules. By changing the environmental parameters (concentration of species, temperature, pressure, pH) it is possible to influence the outcome of the aggregation event and thus obtain liposomes of a certain size or shape. The preparation method itself determines which kind of liposome can be obtained. A generic method for producing liposomes is to dissolve the lipid molecules in an organic solvent. After the solvent has evaporated, a thin film of lipid remains on the surface of the container. Adding an aqueous solution (which may contain the molecules to be entrapped inside the liposomes) and shaking the resulting mixture leads to the formation of liposomes.

Liposome radius distribution

The Helfrich distribution

A first approach to determining the size distribution of a liposome population starts from thermodynamic equilibrium. It is well known that lipid molecules have to be in thermodynamic equilibrium between the aqueous and the lipid phase [8]. If $\mu_0$ is the chemical potential of a lipid molecule in the aqueous phase and $\mu_1$ is its chemical potential in the lipid phase, the probability of finding an aggregate formed by N molecules is

$$P(N) = \frac{1}{\langle N_0 \rangle} \exp\left(-\frac{N}{\langle N_0 \rangle}\right) \tag{1}$$

This is an exponential distribution, and $\langle N_0 \rangle$ is the mean number of lipid molecules in the aggregate. The corresponding distribution for the radius R is obtained by considering

$$N = \frac{4 \pi R^2}{a_0} \tag{2}$$

where $a_0$ is the area occupied by each lipid molecule. Eq 1 then becomes

$$P(R) = \frac{R}{R_m^2} \exp\left(-\frac{R^2}{2 R_m^2}\right) \tag{3}$$

Here $R_m$ is the radius at which the theoretical distribution has its maximum. This distribution reflects very well the balance between the energy of condensation and the gain in entropy, and treats a decrease in condensation energy as an increase in aggregate size [12]. The model does not include any parameter that depends on the thermal fluctuations of the interface between the two phases. Thermal fluctuations decrease the total elastic energy of the membrane, and this is reflected in the specific macroscopic size. In fact, larger vesicles are more flexible than smaller ones, so when the thermal component is taken into account eq 3 is transformed into the Helfrich distribution [13]:

$$P(R) = \frac{9 R^3}{2 R_m^4} \exp\left(-\frac{3 R^2}{2 R_m^2}\right) \tag{4}$$

When the thermal contribution to the aggregation event is taken into account, the distribution narrows appreciably.
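The narrowing can be checked numerically. The sketch below is only illustrative: it assumes that the distribution without fluctuations has the form (R/Rm²)·exp(−R²/(2Rm²)), which is the normalized form with its maximum at Rm, and uses the same Helfrich expression as the helfrich function of this work. The function names and the integration settings are hypothetical choices, not part of lbstatistica.

```python
import math

def pdf_no_fluctuations(r, rm):
    """Radius distribution without thermal fluctuations (assumed form, maximum at rm)."""
    return (r / rm**2) * math.exp(-r**2 / (2.0 * rm**2))

def pdf_helfrich(r, rm):
    """Helfrich distribution, maximum at rm."""
    return (9.0 * r**3) / (2.0 * rm**4) * math.exp(-3.0 * r**2 / (2.0 * rm**2))

def summarize(pdf, rm, upper_factor=12.0, steps=24000):
    """Midpoint-rule estimate of (area, mean, standard deviation) on [0, upper_factor*rm]."""
    dr = upper_factor * rm / steps
    area = first = second = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        w = pdf(r, rm) * dr
        area += w
        first += r * w
        second += r * r * w
    mean = first / area
    std = math.sqrt(second / area - mean * mean)
    return area, mean, std

RM = 50.0  # nm, as in Figure 2
area_plain, mean_plain, std_plain = summarize(pdf_no_fluctuations, RM)
area_helf, mean_helf, std_helf = summarize(pdf_helfrich, RM)
print("relative width without fluctuations:", round(std_plain / mean_plain, 3))
print("relative width with fluctuations:   ", round(std_helf / mean_helf, 3))
```

The relative width (standard deviation divided by the mean) is used here as the measure of narrowing because both curves peak at the same radius.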

Figure 2: Helfrich distribution centred on a mean radius of 50 nm. 2a: distribution using eq 3. 2b: distribution using eq 4

Scattering in polydisperse samples

In recent years Dynamic Light Scattering (DLS) has become a standard tool for following polymerization events, and the formation of lipid membranes and the measurement of their size are no exception. DLS measurements are indeed very useful for determining the size distribution of a vesicle population. Scattering is a general physical process in which some form of radiation, such as light or sound (or even moving particles), is forced to deviate from a straight trajectory by one or more localized non-uniformities in the medium through which it passes. In this case a laser is the source, and the liposomes are the non-uniformities that deviate the laser beam. The spectrum and the autocorrelation function of the scattered light contain the information about the sample size distribution. The measured quantity in a DLS experiment is the correlation function of the scattered electric field [14]. For N molecules (i.e. vesicles) of polarizability $\alpha$, the intensity of the scattered light can be calculated as

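$$I = A \, I_0 \, N \, \alpha^2 \, P(qr)$$

where A is a geometric constant, $I_0$ is the incident power and $P(qr)$ is the form factor. The function P depends on the product of the momentum transfer q

$$q = \left(\frac{4 \pi n}{\lambda}\right) \sin\left(\frac{\theta}{2}\right)$$

(with n the refractive index of the medium, $\lambda$ the laser wavelength and $\theta$ the scattering angle) and the characteristic size r. In a DLS experiment the correlation function for the population can be written as

$$g(t) = \int G(D) \, \exp\left(-D q^2 t\right) \, dD$$

where $G(D)$ is the distribution of diffusion coefficients in the population, D is the translational diffusion coefficient and t is the time. If the distribution is quite narrow, the correlation function is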

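$$g(t) = \exp\left(-\Gamma t\right) \left(1 + \frac{\sigma^2 \Gamma^2 t^2}{2}\right)$$

$\Gamma = D q^2$ is a parameter linked to the mean radius through the diffusion coefficient

$$D = \frac{k T}{6 \pi \eta R}$$

where k is the Boltzmann constant, T the temperature, $\eta$ the viscosity and R the radius of the equivalent sphere. The parameter $\sigma$ is linked to the dispersion, i.e. the standard deviation of the radii normalized by their average:

$$\sigma^2 = \frac{\langle (R - \langle R \rangle)^2 \rangle}{\langle R \rangle^2}$$

The logarithm of the correlation function is

$$\ln g(t) = -\Gamma t + \ln\left(1 + \frac{\sigma^2 \Gamma^2 t^2}{2}\right)$$

Using the approximation $\ln(1 + a t^2) \approx a t^2$, valid when $a t^2$ is small, this becomes

$$\ln g(t) \approx -\Gamma t + \frac{\sigma^2 \Gamma^2 t^2}{2}$$

From this equation both parameters, $\Gamma$ and $\sigma$, can easily be obtained.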
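As an illustration of how the two parameters could be extracted in practice, the sketch below fits a quadratic without constant term to the logarithm of a correlation function and converts the decay rate into a radius through the Stokes-Einstein relation. It assumes the approximate form ln g(t) = −Γt + σ²Γ²t²/2 and the standard expression q = (4πn/λ)·sin(θ/2); the experimental conditions (water at 298 K, 633 nm laser, 90° detection) and all function names are hypothetical, and the data are synthetic.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def scattering_vector(refractive_index, wavelength_m, angle_deg):
    """q = (4*pi*n/lambda) * sin(theta/2), in 1/m."""
    return (4.0 * math.pi * refractive_index / wavelength_m
            * math.sin(math.radians(angle_deg) / 2.0))

def fit_cumulants(times, log_g):
    """Least-squares fit of ln g(t) = -gamma*t + (sigma*gamma*t)**2 / 2.

    Returns (gamma, sigma). Times are rescaled internally so that the
    2x2 normal equations stay well conditioned.
    """
    scale = max(times)
    u = [t / scale for t in times]
    s11 = sum(x**2 for x in u)
    s12 = sum(x**3 for x in u)
    s22 = sum(x**4 for x in u)
    b1 = sum(x * y for x, y in zip(u, log_g))
    b2 = sum(x**2 * y for x, y in zip(u, log_g))
    det = s11 * s22 - s12 * s12
    lin = (b1 * s22 - b2 * s12) / det / scale        # coefficient of t
    quad = (s11 * b2 - s12 * b1) / det / scale**2    # coefficient of t**2
    gamma = -lin
    sigma = math.sqrt(max(2.0 * quad, 0.0)) / gamma
    return gamma, sigma

def stokes_einstein_radius(gamma, q, temperature_k, viscosity_pa_s):
    """Invert gamma = D*q**2 with D = k*T / (6*pi*eta*R)."""
    diffusion = gamma / q**2
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion)

# Synthetic data generated from known parameters (assumed conditions).
q = scattering_vector(1.33, 633e-9, 90.0)
true_radius, true_sigma = 50e-9, 0.10
true_gamma = BOLTZMANN * 298.0 / (6.0 * math.pi * 0.89e-3 * true_radius) * q**2
times = [i * 1e-5 for i in range(1, 41)]
log_g = [-true_gamma * t + 0.5 * (true_sigma * true_gamma * t)**2 for t in times]

gamma_fit, sigma_fit = fit_cumulants(times, log_g)
radius_fit = stokes_einstein_radius(gamma_fit, q, 298.0, 0.89e-3)
print(f"radius = {radius_fit * 1e9:.1f} nm, sigma = {sigma_fit:.2f}")
```

With real measurements the fit should be restricted to short delay times, where the small-argument approximation of the logarithm holds.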

The Schulz distribution

For extruded liposomes, the size distribution is well described by the Schulz distribution [14]. The Schulz function is a well-known function used in organic chemistry to describe polymerization processes on the basis of their relative probabilities of occurrence. For the two-parameter unimodal Schulz distribution, the fraction of liposomes having size r after extrusion is given by

$$f(r) = \frac{1}{z!} \left(\frac{z+1}{\langle r \rangle}\right)^{z+1} r^{z} \exp\left(-\frac{(z+1) \, r}{\langle r \rangle}\right) \tag{1}$$

The z parameter is an index of the distribution width, related to the nth-order average radius. For narrow distributions z lies between 0.05 and 0.1, and between 0.3 and 0.5 for wider ones. Because z is in general not an integer, eq 1 must be modified to allow the factorial of a non-integer number to be calculated. For this purpose the gamma function is used:

$$f(r) = \frac{1}{\Gamma(z+1)} \left(\frac{z+1}{\langle r \rangle}\right)^{z+1} r^{z} \exp\left(-\frac{(z+1) \, r}{\langle r \rangle}\right) \tag{2}$$

$$\frac{\overline{r^2} - \bar{r}^2}{\bar{r}^2} = \frac{1}{z+1}$$

This value is the most useful one for describing a population that follows the Schulz distribution. One of the most interesting features of this distribution is that it asymptotically approaches a Gaussian distribution, which is very useful for biological systems, which tend to form aggregates with a wide range of sizes.
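A short numerical check can make the role of z more concrete. The sketch below assumes the standard two-parameter Schulz density, with the gamma function in place of the factorial, and verifies by direct integration that it is normalized, that its mean equals the mean radius, and that its relative variance equals 1/(z+1). The function names and the two values of z are illustrative choices only.

```python
import math

def schulz_pdf(r, z, r_mean):
    """Schulz density evaluated in log space (valid for non-integer z)."""
    if r <= 0.0:
        return 0.0
    log_p = ((z + 1.0) * math.log((z + 1.0) / r_mean)
             + z * math.log(r)
             - (z + 1.0) * r / r_mean
             - math.lgamma(z + 1.0))
    return math.exp(log_p)

def schulz_moments(z, r_mean, upper_factor=40.0, steps=100000):
    """Midpoint-rule estimate of (area, mean, relative variance)."""
    dr = upper_factor * r_mean / steps
    area = first = second = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        w = schulz_pdf(r, z, r_mean) * dr
        area += w
        first += r * w
        second += r * r * w
    mean = first / area
    rel_var = (second / area - mean * mean) / (mean * mean)
    return area, mean, rel_var

results = {}
for z in (0.5, 10.0):
    results[z] = schulz_moments(z, 50.0)
    area, mean, rel_var = results[z]
    print(f"z = {z}: area = {area:.4f}, mean = {mean:.2f}, "
          f"relative variance = {rel_var:.4f}, 1/(z+1) = {1.0 / (z + 1.0):.4f}")
```

Working with the logarithm of the density avoids overflow in the power and gamma terms when z is large.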

Figure 3: Schulz distribution at various values of z

Solute distribution

Having analyzed the liposome size distributions obtainable with the methods illustrated above, we now focus on the soluble molecules that can be trapped inside the liposomes, and on the theoretical and observed distributions that they follow.

Entrapment following a Poisson Distribution

The conventional idea is that the entrapment of water-soluble molecules inside a lipid vesicle, in an aqueous environment, is a random event that follows a Poisson distribution [16] [17]. The mean number $\lambda_a$ of internalized molecules is calculated from an a priori probability that depends on the concentration of the molecules in the medium and on the internal volume of the liposome (which is assumed to be spherical). These elements are combined in eq 1

$$\lambda_a = C_a \, N_A \, \frac{\pi}{6} \, (d - 2\rho)^3 \tag{1}$$

where $C_a$ is the concentration of the species in solution, $N_A$ is Avogadro's number (which converts a molar concentration into a number of molecules), d is the vesicle diameter and $\rho$ is the bilayer thickness. The Poisson distribution states that the probability of finding a liposome with n molecules of a given species inside is given by eq 2

$$P(n) = \frac{\lambda_a^{\,n}}{n!} \exp(-\lambda_a) \tag{2}$$

From this equation we can easily calculate the probability of obtaining a liposome with at least one molecule inside, as the difference between the total probability (which is equal to 1) and the probability of finding an empty liposome:

$$P(n \geq 1) = 1 - P(0) = 1 - \exp(-\lambda_a) \tag{3}$$

If more than one species is present in the medium, we can use eq 3 to calculate the probability of finding a liposome that contains all k species. The entrapment events are assumed to be independent, so the entrapment of one species does not modify the probability of another entrapment event. The equation that gives this probability takes the form of a product:

$$P(n_1 \geq 1, \dots, n_k \geq 1) = \prod_{a=1}^{k} \left(1 - \exp(-\lambda_a)\right) \tag{4}$$
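The Poisson entrapment model can be sketched in a few lines. The example below assumes that the mean number of entrapped molecules is the molar concentration times Avogadro's number times the inner volume π(d − 2ρ)³/6, and that different species are entrapped independently. The function names, the 10 μM concentration and the 100 nm inner diameter are illustrative assumptions.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol

def mean_entrapped(conc_molar, diameter_nm, bilayer_nm=0.0):
    """Expected number of molecules inside a spherical vesicle.

    diameter_nm is the outer diameter d and bilayer_nm the bilayer
    thickness rho, so the inner diameter is d - 2*rho.
    """
    inner_d_m = (diameter_nm - 2.0 * bilayer_nm) * 1e-9
    volume_l = math.pi / 6.0 * inner_d_m**3 * 1e3   # m^3 -> litres
    return conc_molar * AVOGADRO * volume_l

def poisson_pmf(n, lam):
    """Probability of finding exactly n molecules."""
    return lam**n * math.exp(-lam) / math.factorial(n)

def p_at_least_one(lam):
    """Probability of finding at least one molecule."""
    return 1.0 - math.exp(-lam)

def p_all_species(lams):
    """Probability that every species is present, assuming independence."""
    return math.prod(p_at_least_one(lam) for lam in lams)

lam = mean_entrapped(10e-6, 100.0)          # 10 uM, 100 nm inner diameter
print("mean number of molecules:", round(lam, 2))
print("P(empty):", round(poisson_pmf(0, lam), 4))
print("P(all of 3 such species):", round(p_all_species([lam] * 3), 4))
```

Because the co-entrapment probability is a product of terms smaller than one, it drops quickly as the number of required species grows or as any single concentration decreases.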


Figure 4: Poisson distribution at various values of λ

The anomalous entrapment phenomena

As shown above, the probability of obtaining a liposome that encloses a large number of different species (and, of course, a large number of molecules) depends heavily on the vesicle diameter d and on the concentrations of the different species ($C_k$). To analyze the inner content of such small liposomes it is necessary to use electron microscopy, because the resolution limit of an optical microscope is about 300 nm, which is insufficient even to distinguish two different 100 nm-radius liposomes in the medium. Electron microscopy greatly extends the limit of detection: electrons have wavelengths about 100,000 times shorter than photons (visible light has a wavelength from 740 to 380 nm), which makes it possible to achieve a resolution even below the nanometer range.

Figure 5: Solute distribution of ferritin in liposomes with 100 nm radius: expected Poisson distribution (open symbols) vs experimentally observed distribution (closed symbols)

Two studies, which used ferritin [18] or ribosomes [19] as the entrapped species, show a large deviation of the solute distribution inside nano-scale liposomes from the Poisson distribution, independently of the initial solute distribution or the liposome preparation method. In all cases the majority of liposomes are practically empty, but there is a small (yet still important) number of nano-scale liposomes filled with a large number of molecules and therefore capable of protein translation. This phenomenon, which we can call “super-crowding”, implies the presence of a still unknown biophysical mechanism that influences the entrapment event and makes it depart from the assumption of full randomness. Super-crowding is strictly related to the size of the liposomes and affects only nano-scale liposomes, while liposomes with a volume of about one femtoliter (10^-15 L, with a radius of 1 μm) show an inner solute distribution in agreement with the Poisson distribution. In 2001 Ueda and collaborators [20] developed a cell-free system for protein synthesis called PURESYSTEM™. The “Protein Synthesis Using Recombinant Elements” (PURE) system includes 83 elements, among which are enzymes, amino acids, elongation factors, ribosomes and other components. These 83 elements are the minimal set of molecules able to carry out protein production from a DNA sequence. Fluorescence measurements provide data from which one can easily estimate the amount of protein produced, and hence the amount of each PURESYSTEM™ element encapsulated inside the liposome. Considering that all the PURESYSTEM macromolecular species are present in solution at a concentration <10 μM, the probability of finding a 100 nm (inner diameter) liposome capable of protein production is absolutely negligible, even allowing for a possible aggregation of the PURESYSTEM species into macromolecular clusters. However, a recent work [21] shows that liposomes with an inner radius of 100 nm (the smallest radius considered in the literature for protein expression) are still capable of protein expression, and that the measured fluorescence, attributable to GFP molecules resulting from a complete protein translation process, is 6.1 times higher than in bulk water. This means that all 83 species of PURESYSTEM are simultaneously present in a single liposome, and that their local concentration inside the liposome is higher by a factor of twenty. In this case we are facing a super-crowding effect that involves all the species, independently of their chemical nature.


Entrapment following a power-law distribution

The power-law distribution (also called heavy-tailed distribution, Pareto-like law or Zipf-like law) is a mathematical tool that has been widely used in modeling many different real phenomena [22]. The results collected in the papers cited above show that the distribution that best fits the experimental data on solute distribution in nano-scale liposomes is precisely the power-law distribution. Proving that a natural system follows power-law behavior is a difficult task, because the matter is still controversial: many authors claim that power laws are simply a statistical phenomenon that emerges when handling large amounts of data, or that power-law models still lack a solid theoretical basis. Nevertheless, power laws are used more and more frequently, with good results, to investigate a large number of natural and human-related phenomena. A power law states that the frequency of an event decreases as a power of its size. If x, the number of molecules enclosed inside a liposome, is a non-negative, discrete random variable following the power-law distribution, then its complementary cumulative distribution function has the form of eq 1

$$P(X \geq x) = C \, x^{-\alpha} \tag{1}$$

where $\alpha$, the coefficient of the power-law distribution, and C, a normalization constant, are both greater than 0. This equation states that it is very likely to find an empty liposome, while the probability of finding a liposome with a large number x of entrapped molecules, even if very low, is still meaningful when a large number of vesicles is considered. The solute concentration measured by fluorescence signals shows a strong super-concentration effect in the few filled liposomes.
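The practical difference between the two models lies in the tail. The sketch below compares the probability of finding at least x molecules under a Poisson model and under a power law of the form C·x^(−α). The parameter values (λ = 3, α = 2, C = 1) are hypothetical and chosen only to show the orders of magnitude involved; the function names are not part of lbstatistica.

```python
import math

def poisson_tail(x, lam, terms=400):
    """P(X >= x) for a Poisson variable, summed directly over the tail."""
    return sum(math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))
               for n in range(x, x + terms))

def power_law_tail(x, alpha, c=1.0):
    """P(X >= x) = c * x**(-alpha) for x >= 1, capped at 1."""
    return min(1.0, c * x ** (-alpha))

for x in (1, 5, 10, 20, 50):
    print(f"x >= {x:2d}: Poisson {poisson_tail(x, 3.0):.3e}   "
          f"power law {power_law_tail(x, 2.0):.3e}")
```

Summing the tail directly, rather than subtracting the lower part from 1, keeps the very small Poisson probabilities from being lost to rounding.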


Figure 6: Power-law distribution with various k

Solute distribution in extruded liposomes

The processes that drive the anomalous entrapment in nano-scale liposomes are still unknown. One of the most interesting questions is whether this phenomenon takes place only when liposomes form directly, or whether the same behavior can be observed when smaller (nano-scale) liposomes are forced to form from a pre-existing larger one. Answering this question will help us to better understand the role of the lipid membrane in the whole process. Although the Poisson distribution represents the solute distribution in large liposomes well, it no longer describes the new system: subdividing a whole that contains only a few elements undermines the conditions under which the distribution can properly be applied. A fully random model for the solute distribution after liposome extrusion is the multivariate hypergeometric distribution.

In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of k successes in n draws without replacement from a finite population of size N containing a maximum of K successes. This is in contrast to the binomial distribution, which describes the probability of k successes in n draws with replacement. A random variable X follows the hypergeometric distribution if its probability mass function is given by eq 1

$$P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \tag{1}$$

When we considered the solute distribution in directly formed liposomes, the entrapment of a single molecule did not modify the probability of a second entrapment event for the same species, because the concentration in the bulk solution does not change appreciably with each single event. In the extrusion process the starting concentration inside the liposome is too low to justify this assumption. In the hypergeometric distribution each random selection made from the population decreases the population itself and changes the probability of success at each draw. The hypergeometric distribution can be applied only to populations whose elements can be classified into two mutually exclusive categories. This is not the case for PURESYSTEM. In a population of N elements divided into several species, the i-th species being present $K_i$ times, the probability of having $k_i$ elements of each species i in a random sample of size n is given by eq 2:

$$P(k_1, k_2, \dots) = \frac{\prod_{i} \binom{K_i}{k_i}}{\binom{N}{n}}, \qquad N = \sum_i K_i, \quad n = \sum_i k_i \tag{2}$$

This kind of distribution, the multivariate hypergeometric distribution, has the same relationship to the multinomial distribution that the hypergeometric distribution has to the binomial distribution. A solute distribution that follows it does not imply any process guiding the allocation of molecules to the liposomes resulting from the extrusion: it is a fully random process.
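A minimal sketch of this fully random allocation is given below. It evaluates the multivariate hypergeometric probability for a small, hypothetical parent liposome containing three species, and draws one random daughter content without replacement. The species counts, the sample size and the function names are assumptions made for illustration only.

```python
import math
import random
from itertools import product

def multivariate_hypergeom_pmf(counts, draws):
    """Probability of drawing draws[i] molecules of species i, without
    replacement, from a parent that holds counts[i] molecules of species i."""
    total = sum(counts)
    sample_size = sum(draws)
    numerator = 1
    for parent_count, drawn in zip(counts, draws):
        numerator *= math.comb(parent_count, drawn)   # 0 if drawn > parent_count
    return numerator / math.comb(total, sample_size)

def split_without_replacement(counts, sample_size, rng):
    """Randomly pick sample_size molecules from the parent content."""
    pool = [species for species, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(pool, sample_size)
    return [picked.count(species) for species in range(len(counts))]

parent = [3, 2, 4]      # hypothetical copies of three species in the parent
n_drawn = 4             # molecules that end up in one daughter liposome

# The probabilities of all possible daughter contents add up to one.
total_probability = sum(
    multivariate_hypergeom_pmf(parent, combo)
    for combo in product(*(range(c + 1) for c in parent))
    if sum(combo) == n_drawn
)
daughter = split_without_replacement(parent, n_drawn, random.Random(0))
print("total probability:", round(total_probability, 12))
print("one random daughter content:", daughter)
```

With only two species the function reduces to the ordinary hypergeometric distribution.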


Aim of the thesis

It has been known for several years that anomalous solute entrapment takes place when vesicles, droplets or liposomes form in the nano-scale range. These phenomena have been well documented for vesicles forming from molecules dispersed in an aqueous environment. Previous studies by our group developed stochastic models suited to detecting anomalies as the volume of the vesicle tends to zero, and the simulations fit the experimental data well. This confirmed that the solute distribution follows a power-law distribution rather than a Poisson distribution, with the appearance of a relevant number of vesicles filled with a large number of different chemical species (in these studies PURESYSTEM™, a commercial cell-free protein synthesis system composed of 80 different chemical species, was used to study the entrapment phenomenon). This work addresses the question of whether the anomalous entrapment occurs only when nano-scale liposomes are formed directly, or whether it also occurs when they are generated by extrusion from larger liposomes. The question is interesting because several lines of experimental evidence suggest that in large-volume vesicles the entrapped solutes do not show any anomaly in their distribution. Answering it would also help us to better understand the role played by the lipid membrane, or by the size of the entrapped solutes, in this anomalous entrapment phenomenon. To do this a new stochastic simulation environment is necessary, one able to simulate various kinds of distribution (Poisson, power-law, Schulz, etc.) in order to simulate and quantitatively describe the whole process, from the formation of large liposomes to the formation of liposomes after extrusion.


Materials and Methods

Computational tools

The Python language

Because scientists have long relied on the open availability of each other's research results, it was only natural that they would turn to Open Source software when it came time to apply computer processes to the study of biological processes. In comparison to Perl, Python is a relative newcomer to bioinformatics, but it is steadily gaining in popularity. A few of the reasons for this popularity are the:

• Readability of Python code

• Ability to develop applications quickly

• Powerful standard library of functionality

• Scalability from very small to very large programs

The Python language was designed to be as simple and accessible as possible, without giving up any of the power needed to develop sophisticated applications. Moreover, Python has many add-on modules for statistical and biological work. In addition, Python integrates well with systems written in other languages, such as C, C++, Java and Fortran. One of the main benefits of C is speed: when a programmer needs an algorithm to run as fast as possible, they can code it in C or C++ and make it available to Python as an extension module. This is the case for the LAPACK library used by NumPy, a Python module. The Python library created for this work, named lbstatistica, contains the functions used to create the models for the simulation and is based on three well-known Python modules:

• NumPy

• SciPy

• Matplotlib

These three modules provide useful high-level statistical and mathematical tools. They rely on compiled libraries such as LAPACK, so they combine the performance of compiled code with Python's flexibility, ease of use and strong support for lists and arrays. They were imported in the library as

from math import exp
import math
import scipy.special as scp
import scipy.misc as scm
import numpy as np
import scipy.integrate as sct

These modules were used to write the functions that build the mathematical model for the stochastic simulations.

The lbstatistica library

To build a computational environment that links all the models described in this work, the first step is to create a series of functions capable of calculating each of the proposed distributions for a list of given values. In lbstatistica there are two classes of functions, named function and distribution. These classes contain, respectively, functions that calculate the value of a given distribution at a single point, and functions that calculate the whole distribution. As for the computation and generation of random numbers following a Poisson distribution, NumPy provides all the functions needed, so we use these built-in functions in our work.

Belonging to the function class are:

• schulz function

• helfrich function

• power_law function

• hypergeom_mul function

• volume

• exponential

• sphere_surface

• normal_value

The schulz function calculates the value of the Schulz function from three given data: the radius at which the function is to be evaluated (r), the mean radius (rm) and the z parameter of the distribution (z).

def schulz(r, z, rm):
    # Value of the Schulz function at radius r, for mean radius rm and width parameter z
    z1 = scp.gamma(z + 1)            # Gamma(z + 1) generalizes z! to non-integer z
    p1 = (r ** z) / z1               # r**z term of the Schulz function
    p2 = ((z + 1) / rm) ** (z + 1)
    p3 = exp(-(z + 1) * (r / rm))
    return p1 * p2 * p3

This function calculates the value of the Schulz function for a passed value. To calculate the distribution of directly formed liposomes, the helfrich function is implemented. This function calculates the value of the Helfrich distribution from a number representing the liposome radius (r) and the radius at the maximum of the theoretical distribution (rm).

def helfrich(r, rm):
    # Helfrich distribution: (9 r^3 / (2 rm^4)) * exp(-3 r^2 / (2 rm^2))
    a = (9 * (r**3)) / (2 * (rm**4))
    b = exp((-1) * ((3 * (r*r)) / (2 * rm * rm)))
    return a * b

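The hypergeom_mul function calculates the multivariate hypergeometric probability of a given draw without replacement. Here (n) is the list with the number of copies of each species in the population, (k) is the list with the number of copies of each species drawn, (N) is the population size and (K) is the total number of draws.

def hypergeom_mul(n, k, N, K):
    # scipy.special.comb replaces scipy.misc.comb, which recent SciPy releases no longer provide
    num = 1
    for i in range(len(n)):
        num *= scp.comb(n[i], k[i])   # ways of drawing k[i] copies out of n[i]
    den = scp.comb(N, K)              # ways of drawing K elements out of N
    return num / den

The power_law function calculates the value of the power law for a passed value (x), given two parameters: the coefficient of the power-law distribution (a) and the normalization constant (c).

def power_law(c, a, x):
    # c: normalization constant, a: coefficient of the power law, x: point of evaluation
    pt1 = c
    pt2 = x ** (a - 1)
    return pt1 * pt2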

The volume function calculates the volume of a sphere of radius r

def volume(r):
    # (4/3) * pi * r^3, multiplied by the factor 1e3 used throughout the library
    ris = ((np.pi * 4) / 3) * (r**3) * 1e3
    return ris

The sphere_surface function allows to calculate the surface of a sphere of radius r

@staticmethod
def sphere_surface(r):
    return 4 * np.pi * r * r

The exponential function calculates the value of the exponential function for a passed value x, given the intercept value y, the normalizing constant a and the exponent parameter b.

@staticmethod
def exponential(x, y, a, b):
    return y + a * exp(-b * x)

Lastly, the normal_value function calculates the probability that a random variable x falls within a neighborhood of the value value, assuming that it follows a Normal distribution with parameters mean and sigma

def normal_value(mean, sigma, value, neighborhood):
    # Integrate the Normal density over [value - neighborhood, value + neighborhood]
    pdf = lambda x: 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-((x - mean)**2) / (2 * sigma**2))
    p, abs_err = sct.quad(pdf, value - neighborhood, value + neighborhood)
    return p
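The result of such an integration can be cross-checked against the closed form based on the error function. The sketch below uses only the Python standard library; the function names and the example values (mean 50, sigma 10, a neighborhood of one sigma) are illustrative and not part of lbstatistica.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of the Normal distribution."""
    return (math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))
            / (sigma * math.sqrt(2.0 * math.pi)))

def normal_interval_probability(mean, sigma, value, neighborhood):
    """Closed form of P(value - neighborhood <= X <= value + neighborhood)."""
    root2 = sigma * math.sqrt(2.0)
    upper = math.erf((value + neighborhood - mean) / root2)
    lower = math.erf((value - neighborhood - mean) / root2)
    return 0.5 * (upper - lower)

def midpoint_integral(mean, sigma, value, neighborhood, steps=10000):
    """Numerical integral of the density over the same interval."""
    start = value - neighborhood
    h = 2.0 * neighborhood / steps
    return h * sum(normal_pdf(start + (i + 0.5) * h, mean, sigma)
                   for i in range(steps))

closed_form = normal_interval_probability(50.0, 10.0, 50.0, 10.0)
numerical = midpoint_integral(50.0, 10.0, 50.0, 10.0)
print("closed form:", round(closed_form, 6), " numerical:", round(numerical, 6))
```

Agreement between the two values indicates that the integrand is evaluated at the integration variable and not at a fixed point.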

Belonging to the distribution class are:

• schulz function

• helfrich function

• power_law function

• hypergeom_mul function

To obtain the whole Schultz distribution for a series of values passed in the form of

List_of_value: [value1,value2,value3…valuen]

The schultz function, in distribution class was implemented. (el) is the list of passed values, while (rm) is the mean radius and (z) the polydispersity index

    def schulz(el, z, rm):
        el.sort()
        radii = []
        n = 0
        while n < len(el):
            val = function.schulz(el[n], z, rm)
            radii[n:n] = [(el[n])*((val)*(100))]
            n = n + 1
        return radii

This function returns a list of lists, each containing a calculated value; together they form the whole Schulz distribution:

List_of_result = [[result1],[result2], [result3],….,[resultn]]

To obtain the whole Helfrich distribution from a list of values, the helfrich function of the distribution class must be used. (el) is the list of passed values and (rm) is the mean radius

    def helfrich(el, rm):
        el.sort()
        dis = [[]] * len(el)
        radii = []
        n = 0
        while n < len(el):
            dis[n] = function.helfrich(el[n], rm)
            radii = radii + ([el[n]]*int((dis[n])*10**10))
            n = n+1
        return radii

This function returns a list of lists, each containing a calculated value; together they form the whole Helfrich distribution.

To obtain the whole Multivariate Hypergeometric distribution for a fixed number of successes, we use the hypergeom_mul function, where (el) is a list of given values representing the numbers of draws, (hit) is the number of successes, (n) is the population size and (r) is the number of possible successes.

    def hypergeom_mul(el, hit, n, r):
        dis = [[]] * len(el)
        a = 0
        while a < len(el):
            val = function.hypergeom_mul(el[a], hit, n, r)
            dis[a:a] = [(el[a])*((val)*(100))]
            a = a + 1
        return dis

As for the previous functions, this function returns a list of lists, each containing a calculated value; together they form the whole Multivariate Hypergeometric distribution.


To obtain the whole distribution for a range of passed values, we used the power_law function of the distribution class, where (el) is a list of passed values, and (c) and (a) are the parameters required to calculate the function

    def power_law(el, c, a):
        dis = [[]] * len(el)
        n = 0
        while n < len(el):
            # arguments follow the order of function.power_law(c, a, x)
            val = function.power_law(c, a, el[n])
            dis[n] = (dis[n] + [val])
            n = n + 1
        return dis

As for the previous functions, this function returns a list of lists, each containing a calculated value; together they form the whole Power Law distribution.


Random number generation

The generation of random numbers is a crucial point of this work. Any stochastic simulation performed on a computer has the limitation that it is not actually possible to generate a truly random number: one must use an algorithm, intrinsically deterministic, that mimics a stochastic event. There are many ways to do this, and we chose the Numpy package because of its speed and the quality of the randomness of its results. Basically, random numbers are used to check whether a certain event takes place or not, by comparing the number with the probability that the event occurs. The functions that simulate a random event, such as the formation of a liposome with a certain radius or the inclusion of a certain amount of molecules, are collected in the random class of the lbstatistica library. The random class contains:

• schulz function

• helfrich function

• power_law function

• hypergeom_mul function

• exponential function
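The two ideas stated at the start of this section, that a pseudo-random generator is deterministic and that an event is decided by comparing a random number with its probability, can be illustrated with a minimal sketch. It uses Python's standard random module instead of Numpy, and the name event_occurs and the seeds are arbitrary choices made for the example.

```python
import random


def event_occurs(probability, rng):
    """Return True when a uniform draw in [0, 1) falls below the event probability."""
    return rng.random() < probability


# Two generators started from the same seed produce the same sequence:
# the algorithm is deterministic even though its output looks random.
rng_a = random.Random(12345)
rng_b = random.Random(12345)
draws_a = [rng_a.random() for _ in range(5)]
draws_b = [rng_b.random() for _ in range(5)]
same_sequence = draws_a == draws_b

# Over many trials the observed frequency of the event approaches its probability.
rng = random.Random(2013)
n_trials = 100000
hits = sum(event_occurs(0.3, rng) for _ in range(n_trials))
observed_frequency = hits / n_trials
print(same_sequence, observed_frequency)
```

Fixing the seed is also what makes a stochastic simulation reproducible from one run to the next.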

The schulz function returns nsample random numbers distributed according to the Schulz distribution, given the “polydispersity index” z and the mean radius rm of the distribution. inf and sup are, respectively, the minimum and maximum values that can be generated by the function, while increase is the spacing between the possible values returned by the function. neighborhood is the half-width of the interval used to calculate the integral that gives the probability associated with each point of the distribution.

    def schulz(inf, sup, increase, nsample, z, rm, neighborhood):
        prob = {}
        radii = np.arange(inf, sup, increase)
        for i in radii:
            # the integrand does not depend on x, so the integral equals
            # function.schulz(i, z, rm) times 2*neighborhood
            temp = sct.quad(lambda x: function.schulz(i, z, rm), (i-neighborhood), (i+neighborhood))
            prob[i] = temp[0]
        cont = 0
        ris = []
        norm = 0
        for i in prob:
            norm += prob[i]
        while cont < nsample:
            caso = norm*(np.random.random_sample())
            choiche = 0
            for i in prob:
                choiche = choiche + prob[i]
                if caso < choiche:
                    ris.append(i)
                    break
            cont += 1
        return ris


This function returns a list of elements, each corresponding to a different random radius

List_of_result = [result1,result2,result3,….,resultn]
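The sampling scheme used by this function (assign a weight to each point of a grid, accumulate the weights, then pick the first point whose cumulative weight exceeds a uniform random number) can be sketched in a self-contained way as follows. The sketch uses only the standard library; sample_from_grid and the bell-shaped placeholder density are assumptions introduced for illustration and do not reproduce the Schulz function.

```python
import bisect
import itertools
import math
import random


def sample_from_grid(density, inf, sup, increase, nsample, rng):
    """Draw nsample grid points with probability proportional to density(point)."""
    n_points = int(round((sup - inf) / increase))
    grid = [inf + j * increase for j in range(n_points)]
    weights = [density(x) * increase for x in grid]  # rectangle rule per grid point
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    samples = []
    for _ in range(nsample):
        caso = total * rng.random()
        # first grid point whose cumulative weight is greater than caso
        index = bisect.bisect_right(cumulative, caso)
        samples.append(grid[min(index, n_points - 1)])
    return samples


# Placeholder density centred on rm (NOT the Schulz function).
rm, width = 3.0, 0.5
bell = lambda r: math.exp(-((r - rm) / width) ** 2)
rng = random.Random(42)
samples = sample_from_grid(bell, 1.0, 5.0, 0.01, 20000, rng)
print(sum(samples) / len(samples))
```

Precomputing the cumulative weights once and locating each draw by bisection avoids rescanning the whole grid for every sample, which matters when 100,000 values are requested.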

The helfrich function returns nsample random numbers distributed according to the Helfrich distribution, given the mean radius rm of the distribution. inf and sup are, respectively, the minimum and maximum values that can be generated by the function, while increase is the spacing between the possible values returned by the function. neighborhood is the half-width of the interval used to calculate the integral that gives the probability associated with each point of the distribution.

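    def helfrich(inf, sup, increase, nsample, rm, neighborhood):
        # lines marked (restored) are missing from the listing and follow the schulz function above
        prob = {}
        radii = np.arange(inf, sup, increase)  # (restored)
        for i in radii:
            temp = sct.quad(lambda x: function.helfrich(i, rm), (i-neighborhood), (i+neighborhood))
            prob[i] = temp[0]
        cont = 0
        ris = []
        norm = 0
        for i in prob:
            norm += prob[i]
        while cont < nsample:  # (restored)
            caso = norm*(np.random.random_sample())  # (restored)
            choiche = 0
            for i in prob:
                choiche = choiche + prob[i]  # (restored)
                if caso < choiche:
                    ris.append(i)
                    break
            cont = cont + 1
        return ris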


The power_law function returns, for each element of means, a single random number distributed according to a Power law distribution, given the normalizing constant c and the function degree a. means is a list of values, each representing the amount of a certain type of element.

    def power_law(means, a, c):
        values = []
        for i in means:
            prob = []
            cont = 1
            while cont <= i:
                temp1 = function.power_law(c, a, cont)
                temp2 = function.power_law(c, a, (cont+1))
                prob = prob + [temp1-temp2]
                cont = cont+1
            random = c*(np.random.random_sample())
            cont = len(prob) - 1
            while cont >= 0 and random > prob[cont]:
                cont = cont - 1
            values = values + [(1+cont) * (cont >= 0)]
        return values

The function returns a list of elements, one for each element in means, representing the chosen random numbers.


The hypergeom_mul function (defined in the listing below as random_hypergeom) returns random samples distributed according to a Multivariate Hypergeometric distribution, drawn from a starting set containing different types of elements. types is a list of numbers giving the amount of elements of each type present in the starting set. In the code, sample is the number of elements in each random sample and size is the number of random samples.

    def random_hypergeom(types, sample, size):
        p = len(types)
        n_all = sum(types)
        rvs = np.zeros((size, p), int)
        n_bad = n_all
        n_remain = sample * np.ones(size, int)
        for ii in range(p-1):
            n_good = types[ii]
            n_bad = n_bad - n_good
            rvs_ii = rvs[:,ii]
            mask = n_remain >= 1
            need = mask.sum()
            rvs_ii[mask] = np.random.hypergeometric(n_good, n_bad, n_remain[mask], size=need)
            rvs[:,ii] = rvs_ii
            n_remain = np.maximum(n_remain - rvs_ii, 0)
        return rvs
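For small populations, the same kind of sample can be cross-checked with a much simpler (and much slower) approach: build the population explicitly, draw from it without replacement, and count how many elements of each type were picked. The sketch below does this with the standard library only; draw_types is a hypothetical helper, not part of lbstatistica, and its arguments follow the meaning they have in the listing above (sample elements per draw, size draws).

```python
import random
from collections import Counter


def draw_types(types, sample, size, rng):
    """Return size rows; each row counts, per type, the elements found in a
    sample of `sample` elements drawn without replacement."""
    # explicit population: type index t repeated types[t] times
    population = [t for t, count in enumerate(types) for _ in range(count)]
    results = []
    for _ in range(size):
        counts = Counter(rng.sample(population, sample))
        results.append([counts.get(t, 0) for t in range(len(types))])
    return results


rng = random.Random(7)
rows = draw_types([50, 30, 20], sample=10, size=4, rng=rng)
for row in rows:
    print(row)
```

Each row sums to the sample size and no entry exceeds the amount available for that type, two properties that any multivariate hypergeometric sampler must satisfy.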

The hypergeom_mul function returns an array of lists, where the numbers in each list are the amounts of each type of the starting set contained in the sample.

Array_of_results=([Amount_of_Type_1,Amount_of_type_2,…,Amount_of_type_n],[…])

The exponential function returns nsample random numbers distributed according to an exponential distribution, given the intercept value y, the normalizing constant a and the exponent parameter b. increase is the spacing between the possible values returned by the function. neighborhood is the distance to the left and right borders of the interval used to calculate the integral that gives the probability associated with each point of the distribution.

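    @staticmethod
    def exponential(inf, sup, increase, nsample, y, a, b, neighborhood):
        # lines marked (restored) are missing from the listing and follow the schulz function above
        prob = {}
        radii = np.arange(inf, sup, increase)  # (restored)
        for i in radii:
            temp = sct.quad(lambda x: function.exponential(i, y, a, b), (i-neighborhood), (i+neighborhood))
            prob[i] = temp[0]
        cont = 0
        ris = []
        norm = 0
        for i in prob:
            norm += prob[i]
        while cont < nsample:  # (restored)
            caso = norm*(np.random.random_sample())  # (restored)
            choiche = 0
            for i in prob:
                choiche = choiche + prob[i]  # (restored)
                if caso < choiche:
                    ris.append(i)
                    break
            cont = cont + 1
        return ris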


The computational models

All the functions shown so far were created for a specific purpose. They are tools to be combined into a linked model which, in the end, generates a population of vesicles and their content. How this content is created, and the number and radius of the vesicles, are determined by the investigator through the choice of which functions to use. While the use of the Helfrich and Schulz distributions is mandatory, because literature and experimental practice strongly indicate them, the way the solute distribution is generated is the core of this work. The experimental verification of each hypothesis is long and quite difficult. From a purely numerical point of view, the final results of those experimental studies do not tell which hypothesis is correct and which is not, so a “reverse” method must be adopted. This implies creating several computational models able to discriminate between different experimental cases, by producing results obtained in a well-characterized way, which can be compared with experimental data. Even if the starting hypotheses are different, and so the models are effectively different, it is impossible to know in advance whether the simulations will give distinguishable results and, if they do not, whether this is due to stochastic fluctuations or to other reasons.

Workbench

All the simulations were performed on an AMD A10-5800K APU 3.80 GHz processor with 4.00 GB of installed RAM. Data analysis was performed with Microsoft Excel® and SigmaPlot software.


Experimental distribution

In parallel with this study, a more experimental work was carried out, with the purpose of analyzing the real behavior of solutes during the formation of liposomes and their extrusion, and of describing the size distribution of liposomes before and after extrusion. The collected data were used as a test bed for the robustness of the computational models. Conversely, the computational models were also meant to highlight any anomalous deviation of the size and concentration distributions from the expected ones. This work was performed at the University of Rome 3 and will be the subject of a master's degree thesis.

Molecular solutes

Different populations of GUVs, and hence of VETs, were created, containing different kinds of molecular solutes. These solutes are Pyranine, Calcein, FITC-Dextran and FITC-BSA.

• Pyranine (trisodium 8-hydroxypyrene-1,3,6-trisulfonate; MW 524.37 Da) is a small hydrophilic molecule. It shows a pH-dependent fluorescence (excitation 460 nm; emission 510 nm in 0.1 M Tris pH 8.0).

• Calcein (fluorexon) is a small fluorescent dye (excitation 495 nm; emission 515 nm; MW 622.55 Da).

• FITC (Fluorescein 5-IsoThioCyanate) is a functionalized fluorescein molecule, in which an isothiocyanate group replaces a hydrogen atom on the bottom ring of fluorescein. FITC has a molecular mass of 389.382 Da and a pH-dependent fluorescence (excitation 490 nm; emission 520 nm at pH 8). We used FITC conjugated with BSA or dextran, with several fluorophores attached to a single BSA or dextran molecule.

• FITC-DEX: dextran molecules conjugated with FITC molecules. Dextran is a polysaccharide composed of glucose molecules (such polysaccharides are named glucans) and characterized by a complex, branched organization. Its structure consists of a straight chain of glucose molecules joined by α-1,6 glycosidic linkages, while branches begin from α-1,3 linkages. The molecular mass of dextran varies: there are various commercial dextran populations with different molecular masses. In our work we used 40 kDa dextran labeled with FITC (FITC molecules are randomly conjugated to dextran at a frequency of 0.003 to 0.020 moles of FITC per mole of glucose).

• BSA-FITC: Bovine Serum Albumin (BSA) conjugated with FITC molecules. BSA is a serum albumin protein derived from cows (583 aa; molecular weight 69 kDa), often used as a molecular weight standard. BSA-FITC consists of BSA conjugated with FITC molecules (>7 moles of FITC per mole of BSA).

The different nature of the solutes, especially in terms of size, was used to verify whether partitioning phenomena that deviate from the expected distributions occur in relation to the size of the solute itself. For each molecular solute, the size of GUVs and VETs was measured, as well as the solute concentration in both. A small aliquot of each population was analyzed by confocal microscopy, followed by image analysis to infer solute and size distributions.


Experimental size distribution

The size distributions of VETs and GUVs are reported here for the four populations, each containing a different solute among those described above. The graphs report the relative frequencies of the measured radii on the y axis and the radii (in µm), divided into classes, on the x axis.

Pyranine

[Figure: two histograms of relative frequency against vesicle radius (µm), with x axes spanning 0 to 5 and 0 to 1; recovered panel title: GUV size distribution]

Calcein

[Figure: two histograms of relative frequency against vesicle radius (µm), with x axes spanning 0 to 16 and 0 to 1.2; recovered panel title: GUV size distribution]

FITC-Dex

[Figure: two histograms of relative frequency against vesicle radius (µm), with x axes spanning 0 to 7 and 0 to 1.4; recovered panel title: GUV size distribution]

FITC-BSA

[Figure: two histograms of relative frequency against vesicle radius (µm), with x axes spanning 0 to 10 and 0 to 1.2; recovered panel title: GUV size distribution]

Experimental solutes distributions

Through fluorescence measurements acquired by confocal microscopy, and subsequent analysis of the images by means of dedicated software, we measured the concentration of solutes within the liposomes. This was done for each population of VETs and GUVs and for each kind of solute. The results of these measurements are reported below. The graphs show the number of entrapped molecules on the x axis and the absolute frequency of each occurrence on the y axis.

[Figures: experimental solute distributions for Calcein, FITC-DEX and Pyranine]

Computational strategies

Generation of GUVs theoretical size distribution

The first step is the generation of a population of GUVs, which is accomplished using the helfrich function of the random class. To avoid errors linked to stochastic fluctuations, each generated population is composed of 100,000 elements. All the generated radii are then saved in a .txt file, to be analyzed and compared with experimental data. The model used is reported below.

    import libstatistica as lbs
    import numpy as np

    # step 1: draw 100,000 GUV radii (rm, the mean radius, must be set beforehand)
    lista_GUV = lbs.random.helfrich(1.25e-06, 9.5e-06, 1e-08, 100000, rm, 5e-09)
    raggi_GUV = []

    # step 2: add a uniform random offset smaller than the grid spacing
    for i in lista_GUV:
        temp = np.random.random_sample()*1e-08
        raggi_GUV.append(i+temp)

    # step 3: write one radius per line, using a decimal comma
    lista_GUV = str(raggi_GUV)
    lista_GUV = lista_GUV.replace("[","").replace("]","").replace(", ","\n").replace("(","").replace(")","").replace(",","").replace(".",",")
    f0 = open("GUV .txt", "w")
    f0.write(lista_GUV)
    f0.close()


In the first step 100,000 radii are generated, following a Helfrich distribution of mean radius rm. The radii can take values from 1.25×10⁻⁶ m to 9.5×10⁻⁶ m, in accordance with the definition of GUV size. These numbers are accurate to the second decimal place, so, to avoid errors due to an excessive discretization of the radii, two more decimal places are added to each radius. This is done in step two. Generating random numbers accurate to the fourth decimal place directly proved too expensive in terms of time and computational resources, so we preferred to soften the discretization of the distribution by adding a random float with uniform probability, which increases the precision of each number. The last step saves the obtained data in a .txt file.
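Steps two and three can be illustrated in isolation with a short sketch. It takes a few radii lying on the 1e-08 grid, adds to each a uniform offset smaller than the grid spacing, and formats the result with one value per line and a decimal comma. The helper name add_sub_grid_jitter and the three example radii are made up for the illustration, and the standard random module stands in for Numpy.

```python
import random


def add_sub_grid_jitter(radii, increase, rng):
    """Add to each radius a uniform offset in [0, increase)."""
    return [r + rng.random() * increase for r in radii]


rng = random.Random(1)
grid_radii = [1.25e-06, 2.00e-06, 3.41e-06]  # example values on the 1e-08 grid
jittered = add_sub_grid_jitter(grid_radii, 1e-08, rng)

# One radius per line, with a decimal comma instead of a decimal point.
text = "\n".join(repr(r).replace(".", ",") for r in jittered)
print(text)
```

Because the offset never exceeds the grid spacing, each jittered radius stays inside the class of the grid point it came from, so the shape of the discretized distribution is preserved.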


Generation of VETs theoretical size distribution

To generate a population of VETs, we first generate an initial population of GUVs. This part is similar to what was shown above for the generation of GUV populations. The mean radius used depends on the molecular solute, to stay as close as possible to the experimental data. The entire model used for VET generation is reported below.

    #step 1
    GUV_radius = lbs.random.helfrich(1.25e-06, 5e-06, 1e-07, 9000, rm, 5e-08)

    #step 2
    VET_list = []
    cont = 0
    for i in GUV_radius:
        GUV_surface = lbs.function.sphere_surface(i)
        VET_partial_surface = 0
        # draw VET radii until their total surface reaches the GUV surface
        while VET_partial_surface < GUV_surface:
            temp1 = lbs.random.schulz(1.25e-07, 1e-06, 1e-08, 1, pi, rm, 5e-09)[0]
            VET_list.append(temp1)
            VET_partial_surface += lbs.function.sphere_surface(temp1)
        cont += 1
        print(cont)

    # save one radius per line, using a decimal comma
    risultati = str(VET_list)
    risultati = risultati.replace("[","").replace("]","").replace(':','').replace(", ","\n").replace("{","").replace('}','').replace("(","").replace(")","").replace(",","").replace(".",",")
    f0 = open("VET BSA.txt", "w")
    f0.write(risultati)
    f0.close()

In the first step the GUV population is generated using the rm specific to the molecular solute contained in the liposomes. The second step generates the actual VET population. The first radius of the population generated before is considered, and the surface of the corresponding sphere is calculated. Then one random radius, distributed according to a Schulz distribution of parameters pi and rm, is generated, and the surface of the associated sphere is calculated. Random radii distributed according to the Schulz distribution are generated until the sum of the surfaces associated with these radii reaches the surface associated with the GUV radius. These new radii are the VET radii. This procedure is repeated for each of the GUV radii. As for the GUVs, the last step is storing the results in a .txt file.

What happens to liposomes during extrusion is almost unknown. It is not clear whether the liposomes disintegrate when they encounter the membrane with pores, or whether extrusion entails a modification of the membrane without breaking it. For this reason we decided to use the conservation of the total surface as the only constraint of the simulation.
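The surface-conservation loop can be sketched independently of the library as follows. The function split_by_surface is a hypothetical name, and the uniform sampler is only a placeholder for the random VET radius: it is not the Schulz distribution, and any other sampler returning one radius per call could be passed instead.

```python
import math
import random


def sphere_surface(r):
    """Surface area of a sphere of radius r."""
    return 4 * math.pi * r * r


def split_by_surface(guv_radius, draw_vet_radius):
    """Draw VET radii until their total surface reaches the surface of the GUV."""
    target = sphere_surface(guv_radius)
    vets = []
    partial = 0.0
    while partial < target:
        r = draw_vet_radius()
        vets.append(r)
        partial += sphere_surface(r)
    return vets


rng = random.Random(3)
# Placeholder sampler (uniform between the VET size limits), NOT the Schulz distribution.
draw = lambda: rng.uniform(1.25e-07, 1e-06)
vets = split_by_surface(2.5e-06, draw)
print(len(vets))
```

Note that the loop stops only after the target has been reached, so the total VET surface exceeds the GUV surface by at most the surface of the last vesicle drawn.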

We developed a second model to generate a population of VETs, in which the VET radii follow an exponential distribution. The choice of this distribution is completely empirical, made in an attempt to get a better agreement with experimental data. The structure of this second model is similar to the one shown above but, instead of the schulz function of the random class, it uses the exponential function of the same class to generate the random VET radii in step two.


    #step 1
    GUV_radius = lbs.random.helfrich(1.25e-06, 5e-06, 1e-07, 9000, rm, 5e-08)

    #step 2
    VET_list = []
    cont = 0
    for i in GUV_radius:
        GUV_surface = lbs.function.sphere_surface(i)
        VET_partial_surface = 0
        # draw VET radii until their total surface reaches the GUV surface
        while VET_partial_surface < GUV_surface:
            temp1 = lbs.random.exponential(1.25e-07, 1e-06, 1e-08, 1, y, a, b, 5e-09)[0]
            VET_list.append(temp1)
            VET_partial_surface += lbs.function.sphere_surface(temp1)
        cont += 1
        print(cont)

    # save one radius per line, using a decimal comma
    risultati = str(VET_list)
    risultati = risultati.replace("[","").replace("]","").replace(':','').replace(", ","\n").replace("{","").replace('}','').replace("(","").replace(")","").replace(",","").replace(".",",")
    f0 = open("VET BSA.txt", "w")
    f0.write(risultati)
    f0.close()


Generation of GUVs solutes distributions

The behavior of molecules during entrapment at this size scale is, as we said before, almost unknown. With no indication of a departure from a purely stochastic event, we decided to simulate this process as an event governed by a Poisson distribution. The model employed is shown below.

In the first step the total amount of molecules in the bulk solution is calculated. The initial concentration used for the simulation is the same as in the lab experiments, to allow a direct comparison between the experimental data and the theoretically obtained data. Then, as usual, a random population of GUVs is generated using the helfrich function of the random class in the lbstatistica library, and the last two decimal places of each radius are added by taking a random float with uniform probability. The second step is the actual filling. Each generated GUV radius is considered in turn. First we generate a random “effective concentration”, which is used as the mean value of the Poisson distribution. This is done for two reasons. First, the Numpy library seems unable to accept values as high as the total number of molecules present in the solution as the input argument of the function that generates Poisson-distributed random numbers. Second, it is not reasonable to assume that all the molecules present in the initial solution compete at the same time to be entrapped in the same liposome. It is more reasonable to think that the only molecules involved are those near the region where the liposome is forming, that is, a fraction of the total amount.

Generation of VETs’s solute distribution

Experimental data for the VET size distribution clearly show a deviation from the Schulz distribution. The meaning and the possible causes of this deviation will be discussed in the Results section of this thesis. To prevent this from blocking the analysis of how solutes are distributed during the extrusion of liposomes, we preferred to use the experimental data as a numerical size distribution. Experimental radii are used instead of randomly generated radii, and the model simulates the filling of the experimental VETs with a Poisson distribution. The first step of this model loads the radii from a .txt file and stores them in a list as floats. In the second step we calculate the mean number of molecules per liter, using as mean concentration the one measured experimentally for that kind of solute. This parameter therefore changes for every solute and must be set manually. In the third step the corresponding volume is calculated for each radius. This volume is then used to calculate the mean passed to the Numpy function that returns a Poisson-distributed number with that mean. Finally, all the radii, the number of solute molecules in every single VET and the corresponding concentration are saved in a .txt file to be analyzed. The model is reported below.
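As a rough, independent illustration of the steps just described (not the model used in this work), the sketch below fills a list of radii with Poisson-distributed numbers of molecules. Everything in it is an assumption made for the example: the function names, the radii, the 1 µM concentration, and the Poisson sampler, which is written with the standard library (Knuth's method for small means, a rounded Normal approximation for large ones) in place of the Numpy function.

```python
import math
import random

AVOGADRO = 6.02214076e23  # molecules per mole


def sphere_volume_liters(radius_m):
    """Volume of a sphere in liters, for a radius given in meters."""
    return (4.0 / 3.0) * math.pi * radius_m ** 3 * 1e3  # 1 m^3 = 1e3 L


def poisson_draw(mean, rng):
    """Poisson-distributed integer with the given mean."""
    if mean < 50:
        # Knuth's method: multiply uniform draws until the product drops below exp(-mean)
        limit = math.exp(-mean)
        k = 0
        product = rng.random()
        while product > limit:
            k += 1
            product *= rng.random()
        return k
    # for large means a rounded Normal draw is used as an approximation
    return max(0, int(round(rng.gauss(mean, math.sqrt(mean)))))


def fill_vesicles(radii_m, concentration_molar, rng):
    """Return (radius, molecules, resulting concentration in mol/L) for each vesicle."""
    molecules_per_liter = concentration_molar * AVOGADRO
    rows = []
    for r in radii_m:
        volume = sphere_volume_liters(r)
        n = poisson_draw(molecules_per_liter * volume, rng)
        rows.append((r, n, n / (AVOGADRO * volume)))
    return rows


rng = random.Random(11)
rows = fill_vesicles([1.0e-07, 2.5e-07, 5.0e-07], 1e-06, rng)  # example radii, 1 µM solute
for row in rows:
    print(row)
```

With these numbers a vesicle of radius 1×10⁻⁷ m has a volume of about 4.19×10⁻¹⁸ L and an expected content of about 2.5 molecules, so at this scale the relative fluctuations from one vesicle to the next are large.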


Results

Validation of Helfrich’s and Schulz’s functions

The first step in our analysis was to validate the Helfrich and Schulz functions. Since no certified algorithms computing these two distributions are available online or in the additional packages of common data analysis tools, the validation was numerical. The values of the two functions were calculated independently over a given interval, both by our algorithm and by a function prepared in Microsoft Excel. The graphs below report the functions computed by the two programs. The x axis reports the values for which the function is calculated, and the y axis the value of the function.

i Numerical validation for Helfrich distribution

[Figure: function values (y axis, 0 to 250000) computed for arguments from 3.00E-06 to 6.80E-06 (x axis); series: Microsoft Excel, Lbstatistica]

ii Numerical validation for Schulz distribution

The discrepancy in the right tail of the Schulz distribution between the Excel calculation and the Lbstatistica calculation is due to the different variable types used by the two programs. Excel uses a decimal type, while Python uses a 64-bit float, so the approximations differ. However, the agreement between the two curves shows that there are no errors in how the function is calculated by the Python libraries.

Size distributions analysis

GUVs’s size distribution

Various simulations were carried out to generate the size distribution of GUVs, varying the mean radius of the theoretical distribution in order to test several distributions and find the best match between experimental and theoretical distributions. The best results obtained by our model are reported below.

[Figure: function values (y axis, 0 to 0.000025) computed for arguments from 1.00E-07 to 7.00E-07 (x axis); series: Microsoft Excel, Lbstatistica]

FITC-BSA