• Non ci sono risultati.

In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information

N/A
N/A
Protected

Academic year: 2021

Condividi "In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information"

Copied!
4
0
0

Testo completo

(1)

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

and sharing with colleagues.

Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party

websites are prohibited.

In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information

regarding Elsevier’s archiving and manuscript policies are encouraged to visit:

http://www.elsevier.com/authorsrights

(2)

Author's personal copy

Operations Research Letters 42 (2014) 137–139

Contents lists available atScienceDirect

Operations Research Letters

journal homepage:www.elsevier.com/locate/orl

Ramsey theory and integrality gap for the independent set problem

Robert D. Carr

a

, Giuseppe Lancia

b,

aSandia National Labs, Albuquerque, NM, United States

bD.I.M.I. University of Udine, Italy

a r t i c l e i n f o

Article history:

Received 16 October 2013 Received in revised form 19 January 2014 Accepted 20 January 2014 Available online 31 January 2014

Keywords:

Independent set problem Ramsey theory Integrality gap ILP formulations

a b s t r a c t

We discuss the effectiveness of integer programming for solving large instances of the independent set problem. Typical LP formulations, even strengthened by clique inequalities, yield poor bounds for this problem. We show that a strong bound can be obtained by the use of the so-called rank inequalities, which generalize the clique inequalities. For some problems the clique inequalities imply the rank inequalities, and then a strong bound is guaranteed already by the simpler formulation.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

The maximum independent set (MIS) is one of the classic prob- lems in combinatorial optimization. Both the cardinality version and the weighted version of MIS have been studied. The literature on this problem – or its twin, the Maximum Clique – is vast and dates back to the beginning of the field. Although its definition is nice and simple, this problem is one of the toughest to solve ex- actly. Many papers over the years have dealt with the exact so- lution of the maximum clique/independent set [9,11,3,5]. From a practical point of view, it is very difficult and beyond the reach of the best algorithms to exactly solve instances on dense graphs of more than a couple hundred nodes [9].

The most successful approach to the exact solution of combi- natorial optimization problems, including the MIS, is probably In- teger Linear Programming (ILP). An ILP formulation is as successful as the strength of its LP bound. That is, if the value of the objective function when the variables are allowed to take fractional values is close to the value when the variables are constrained to be in- teger, then the pruning of the search space will be effective. It is often the case that in order to obtain better bounds the formula- tion is reinforced by the use of additional constraints, called cuts, and the resulting approach is known as branch-and-cut. Cuts are constraints that do not eliminate any feasible integer solution, but

Corresponding author. Tel.: +39 0432558454.

E-mail addresses:rdcarr@sandia.gov(R.D. Carr),giuseppe.lancia@uniud.it (G. Lancia).

make the space of fractional solutions smaller, this way decreasing the value of the LP bound.

The maximum independent set has a natural, nice formulation as an integer programming problem. Unfortunately, this formula- tion gives a terribly bad bound, e.g., the bound can be as big as n

/

2 for an instance with the optimal value of 1. The formulation can be strengthened by the use of clique-inequality cuts. These are constraints that say that each clique can have at most one node in common with any independent set. By using some concepts from the Ramsey theory, we will show that even with clique inequali- ties the gap between the LP value and the optimum can be very bad. In this paper we pinpoint the fundamental constraints for the maximum independent set as the ‘‘rank inequalities’’. Their addi- tion guarantees an O

(

log n

)

gap between the LP bound and the op- timum. This does not contradict the known complexity results for MIS (stated in Section2) since, in general, it is NP-hard to decide whether all clique inequalities are satisfied. However, the theory we develop here can also be useful in practice, since there are some fortunate cases in which a formulation implies the rank inequali- ties and hence we know that the bound will be strong. One such example, arising in bioinformatics, concerns the similarity mea- sure of two protein structures, reduced to a suitable max stable set problem [4].

2. The maximum independent set problem

Given an undirected graph G

= (

V

,

E

)

, with

|

V

| =

n, a subset V

V of nodes is an independent, or stable, set if no two nodes in Vare joined by an edge. The maximum independent set problem

http://dx.doi.org/10.1016/j.orl.2014.01.007 0167-6377/©2014 Elsevier B.V. All rights reserved.

(3)

Author's personal copy

138 R.D. Carr, G. Lancia / Operations Research Letters 42 (2014) 137–139

consists in determining an independent set of maximal weight or cardinality. Given a weight function c

= (

cv

)

on the vertices (where cv

:=

1 for each

v

in the cardinality version), the basic ILP formulation associates a binary variable xvto each node

v ∈

V , and reads

max

v∈V

cvxv

|

xu

+

xv

1

∀{

u

, v} ∈

E

,

xv

∈ {

0

,

1

}∀ v ∈

V

.

(1)

The LP relaxation of the cardinality version of(1)has an optimal value

n

/

2, as the value n

/

2 can be achieved by setting xv

=

1

/

2 for all

v ∈

V . Thus, the LP optimum can be O

(

n

)

times larger than the true optimum: e.g. if G is a clique, the optimum is 1. Since any clique Q

V can have at most one node in common with any independent set, the following clique-inequality constraints can be added to(1):

v∈Q

xv

1

Q clique in G

.

(2)

Adding constraints(2)presents two types of difficulties. First, find- ing cliques in a graph is itself a hard problem. Second, there may be exponentially many cliques. We can overcome these difficulties if we have a separation oracle, i.e., a black box that, given a solution to an LP with only some of the inequalities(2), tells us if the solution is in fact feasible for all the inequalities(2), or else returns a vio- lated clique inequality. In the next section we will argue that even in this case we are not guaranteed that the LP relaxation of(1)and (2)will in fact be a good approximation of the optimum.

The complexity theory results for the MIS paint a bleak picture as far as our ability to approximate this problem. Since the maximum independent set in a graph is the maximum clique in the complement graph, the complexity results are the same as those for the maximum clique problem. If P

̸=

NP, then MIS cannot be approximated to within a factor of O

(

nϵ

)

, where

ϵ

is a fixed positive constant defined for MIS [7,2,1]. Under stronger complexity assumptions, MIS cannot be approximated to within a factor of O

(

n0.5ϵ

)

, [8]. The best approximation factor for MIS found so far is a mere O

(

n

(

log log n

)

2

/(

log n

)

3

)

for the cardinality version [6].

3. Ramsey theory and the integrality gap

We just saw that one cannot expect to ever be able to find a rea- sonable approximation for MIS. A tacit polyhedral combinatorics axiom is that if there is a reasonable approximation for MIS, there should be an LP relaxation for MIS which is solvable in polyno- mial time and has a reasonable-size integrality gap. Such an LP relaxation would not allow any of the clique inequalities to be vi- olated by too large a factor. Call the LP relaxation that consists of the clique inequalities for all of the maximal cliques the clique re- laxation. In order for this relaxation to be polynomially solvable, we need an efficient separation algorithm for these exponentially many clique inequalities. Let xbe our current fractional solution.

With the vertices of G weighted by x, this separation algorithm must be able to find a clique of weight more than 1 if such a clique exists. Hence, our separation algorithm must be powerful enough to solve the maximum weighted clique problem. However, maxi- mum weighted clique is as difficult to approximate as MIS.

Suppose we eliminated this obstacle to a small integrality gap by imagining we have a separation oracle that finds a clique of weight more than 1 if such a clique exists. What can we say about the integrality gap? We argue here that the results from a branch of graph theory called the Ramsey theory make it impossible that the gap is a constant even in this case.

The Ramsey theory in part studies the occurrence of large cliques and large independent sets in arbitrary graphs. An idea in the Ramsey theory is that if a graph is large enough and has no large cliques (independent sets), then it must have large independent sets (respectively, cliques). More formally, there is a smallest integer R

(

k

,

l

)

such that every graph with at least R

(

k

,

l

)

vertices has either a clique of k vertices or an independent set of l vertices.

This number is called the

(

k

,

l

)

-th Ramsey number.

Ramsey numbers are notoriously difficult to compute and seem to grow reasonably quickly. We are particularly interested in their growth when the clique number k

=

3, for which we have the following result.

Theorem 1. The integrality gap between MIS and the clique LP relaxation cannot be bounded by any constant.

Proof. It is known that R

(

3

,

l

)

grows asymptotically as l2

/

log l (see [10]), so that liml→∞R(3,l)

l

= ∞

. Let r

N be given. We now produce a graph where the integrality gap exceeds r

/

2. Choose lr

N such thatR(3l,lr)−1

r1

>

r, and let Grbe a graph having R

(

3

,

lr

) −

1 vertices that has no clique of size 3 and no independent set of size lr. A feasible solution to the clique LP relaxation is xi

=

1

/

2 for all i

Vr. The cost of this solution isR(3,l2r)−1. The MIS solution has cost

lr

1. Hence, this integrality gap exceeds r

/

2, as required.



Ramsey theory thus shows that inequalities other than the clique inequalities are needed in an LP relaxation for there to be a constant integrality gap. The analysis above suggests which ad- ditional inequalities are needed: for each H

V , we have the valid inequality

x

(

H

) ≤

r

(

H

),

(3)

where the rank function r

(

H

)

is the maximum size of any indepen- dent set contained in H. These inequalities are called the rank in- equalities of MIS. We call the LP relaxation consisting precisely of all of the rank inequalities the rank LP relaxation.

Theorem 2. The integrality gap between weighted MIS and its rank LP relaxation is at most log n

+

1.

Proof. Denote a maximizer for the rank LP relaxation by x. We will demonstrate that log nx+1 is less than or equal to a convex combination of incidence vectors of independent sets of G. From an averaging argument, one of these independent sets costs at least

c·x

log n+1, from which our integrality gap result follows.

We first find the smallest integer k so that for every vertex

v ∈

V

,

xv

=

lv

k, where lv

N. We will decompose xin a special way into a linear combination of incidence vectors of independent sets with linear multipliers1k. We start by having x

=

y0

+

z0, where y0

:=

xand z0

:=

0. We successively form yi

,

zifor i

=

1

,

2

, . . .

by taking1k times an incidence vector of an independent set away from yi1and adding it to zi1. Hence, yi

+

zi

=

xfor i

=

1

,

2

, . . .

. We terminate our process at an integer L when yL

=

0 and zL

=

x. We thus have zLexpressed as a linear combination of L many independent sets. Define

Vi

:= { v ∈

V

:

yvi1

>

0

}

for i

=

1

,

2

, . . . .

Consider the incidence vector xi of a maximum cardinality independent set in G

[

Vi

]

, so that r

(

Vi

) =

xi

(

Vi

)

. Define

yi

:=

yi1

1

kxi

,

zi

:=

zi1

+

1

kxi for i

=

1

,

2

, . . . .

Because xsatisfies the rank inequalities for the vertex sets Vi, we have that

x

(

Vi

) ≤

xi

(

Vi

)

for i

=

1

,

2

, . . . .

(4)

Author's personal copy

R.D. Carr, G. Lancia / Operations Research Letters 42 (2014) 137–139 139

We can now determine a lower bound on how much yidecreases in each iteration, and thus an upper bound on how many terms are in our linear decomposition of xinto independent sets. Notice that xi

(

V

) =

xi

(

Vi

) ≥

x

(

Vi

) ≥

yi1

(

Vi

) =

yi1

(

V

)

, from which we have

yi

(

V

) =

yi1

(

V

) −

1

kxi

(

V

) =

yi1

(

Vi

) −

1 kxi

(

Vi

)

yi1

(

Vi

) −

1

kx

(

Vi

) ≤

yi1

(

Vi

) −

1 kyi1

(

Vi

)

=

1

1

k

yi1

(

Vi

) =

1

1

k

yi1

(

V

).

We determine inductively that

yi

(

V

) ≤

1

1

k

i

y0

(

V

) =

1

1

k

i

x

(

V

)

1

1

k

i

n

.

Hence, we have

yk

(

V

) ≤

1

1

k

k

n

n e

,

yk log n

(

V

) ≤

elog nn

1

.

After at most k log n iterations have been done, we are very nearly done. It is easy to see that at most k more iterations are needed to reach the integer L satisfying yL

=

0. This means that L

k

(

log n

+

1

)

. Hence, we have

x

log n

+

1

kx

L

=

1

L

L

i=1

xi

.

As previously discussed, by an averaging argument, one of these xi’s will cost at least as much aslog nc·x+1, which establishes our theorem.

Acknowledgments

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the US Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

References

[1] S. Arora, C. Lund, R. Motwani, M. Sudan, M. Szegedy, Proof verification and hardness of approximation problems, in: Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, 1992, pp. 14–23.

[2] S. Arora, S. Safra, Probabilistic checking of proofs: a new characterization of NP, in: Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, 1992, pp. 2–13.

[3]E. Balas, C.S. Yu, Finding a maximum clique in an arbitrary graph, SIAM J.

Comput. 15 (1986) 1054–1068.

[4]A. Caprara, R. Carr, S. Istrail, G. Lancia, B. Walenz, 1001 optimal PDB structure alignments: integer programming methods for finding the maximum contact map overlap, J. Comput. Biol. 11 (2004) 27–52.

[5]R. Carraghan, P.M. Pardalos, An exact algorithm for the maximum clique problem, Oper. Res. Lett. 9 (1990) 375–382.

[6]U. Feige, Approximating maximum clique by removing subgraphs, SIAM J.

Discrete Math. 18 (2004) 219–225.

[7] U. Feige, S. Goldwasser, L. Lovasz, S. Safra, M. Szegedy, Approximating clique is almost NP-complete, in: Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, 1991, pp. 2–12.

[8] J. Hastad, Testing of the long code and hardness for clique, in: Proceedings ACM Symposium Theory of Computing, 1996, pp. 11–19.

[9]D.S. Johnson, M.A. Trick (Eds.), Cliques, Coloring, and Satisfiability, in: Dimacs Series in Discrete Mathematics and Theoretical Computer Science, The American Mathematical Society, 1996.

[10]J.H. Kim, The Ramsey number R(3,t)has order of magnitude t2/log t, Random Struct. Algorithms 7 (1995) 173–207.

[11]G.L. Nemhauser, L.E. Trotter, Vertex packings: structural properties and algorithms, Math. Program. 8 (1975) 232–248.

Riferimenti

Documenti correlati

The study carried for 4 years on 18 olive local cultivars grown in the same orchards has shown that the aromatic quality of virgin olive oil depends on the cultivar (genetic

The radiochemical purity of 99m Tc Tetrafosmin, 99m Tc Exametazime, 99m Tc Sestamibi and 99m Tc Oxidronate was evaluated by different thin layer chromatography systems, followed

The modified Lancasterian representation of the service output proposed here allows the examination of innova- tion in services as a time-saving process resulting from the

pointed out, diverse effects of different alcoholic beverages on blood pressure can be mainly related to their different polyphenol content (Chiva- Blanch et al., 2013b), as shown

Being subjected to high levels of fishing pressure resulted in similar responses in these ecosystems, such as: a simplified trophic web topology (relatively linear); ecosystem

Proposition 1 simply tells us that if the expected behaviour of players in the event of a deviation from an ef ficient strategy profile is described by the kinked social norm, then

The research question is: how is the concept of conditionality, as a means to achieve greater tenants’ responsibilisation, shaping recent approaches towards the allocation

We then evaluated the developmental toxicity following acute (5 days) and chronic (15 days) exposure to TCDD starting from late blastula stage, and by real-time PCR (RT-PCR), the