Euclidean Distance Geometry
Leo Liberti
CNRS LIX Ecole Polytechnique, France
IEOR, Columbia University, November 2015
[L., Lavor: Introduction to Euclidean Distance Geometry, in preparation]
Table of contents
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Applications
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Clock Synchronization
5 7
3
4
Atomic clock (S) 16:27
Chuck Alice
Bob
↓
16:20 16:22 16:24 16:26 16:28 16:30
A C S B
16:21 16:23 16:25 16:27 16:29 16:31
[Singer, 2011]
Sensor network localization
0
29 0 / 2.50781 36
1 / 1.97906
51 2 / 1.13649
52 3 / 2.92955
80 4 / 2.5548 91
5 / 2.4284
175 / 2.49225 177 / 1.4276
87 176 / 2.73147 211 / 0.989917 214 / 2.31786 49
210 / 2.83291 84
213 / 3.02018
66212 / 1.56417
268 / 1.76662 267 / 2.31042
271 / 1.75417 69 269 / 2.0622 73
270 / 3.10479
85 272 / 1.15897
346 / 1.50056
1 18
6 / 1.90126 37
7 / 2.61352 41
8 / 1.6604
45 9 / 1.92913
57 10 / 2.66407
11 / 0.65886158 83 12 / 1.32064
110 / 1.29792 111 / 0.840347
113 / 2.14071 114 / 2.03576 47
112 / 2.5425
215 / 2.12848
217 / 3.0822 218 / 1.98508 216 / 2.70003
229 / 2.97748 231 / 1.63275
232 / 2.32966 230 / 2.92252
249 / 1.49322 250 / 1.4467
251 / 3.11231
289 / 2.5056
81 290 / 2.54013
291 / 1.97931 2
5
13 / 1.62227
14 14 / 2.01015
17 15 / 2.68422
20
16 / 3.05872
23 17 / 2.48332
18 / 1.0575642
64 19 / 2.21356
67
20 / 2.42912
88 21 / 2.34146 93 22 / 1.0662
28 / 2.06104 29 / 1.80061
31 / 1.46243 33 / 1.38178 35 / 1.01186 33
30 / 3.15392 65
32 / 2.19726 70
34 / 3.17108
82 / 2.05285 83 / 1.9929
84 / 2.90088 85 / 0.965501
87 / 1.40579 88 / 2.51699 86 / 3.19003
98 89 / 2.50718
106 / 1.36517 107 / 2.2605
109 / 2.88336
55 108 / 2.81041
119 / 2.25758 123 / 0.666733 126 / 2.79445
122 / 1.26894 124 / 2.45029
71
125 / 2.61515
127 / 2.5523 56
120 / 2.46168 63
121 / 3.15779
143 / 2.62768 145 / 3.01294
146 / 1.62871
89 147 / 3.16128
144 / 2.60777 233 / 1.59139
234 / 3.07459 235 / 1.62072
316 / 1.57213 318 / 2.54084
92 317 / 3.16387
326 / 2.30061 325 / 2.97621
327 / 2.62621
350 / 3.18927
3 23 / 0.844888
26 / 2.43216 24 / 1.43953 25 / 2.52302
27 / 1.2847
329 / 2.97101 331 / 1.47226
330 / 2.81533 79
339 / 3.17026 195 / 3.02256
196 / 2.72116 194 / 3.16431 53 193 / 0.88823
319 / 1.78484 322 / 3.15941 320 / 1.19371
321 / 2.59083 332 / 2.88171 77
333 / 2.5427
6
36 / 2.6581637 / 2.84506 39 / 3.00913 39 38 / 0.977096
40 / 0.147583 90
41 / 2.54617 223 / 1.11811 225 / 1.74291 72
224 / 2.873
257 / 2.65041 7
10 42 / 1.67711
16 43 / 1.28092 30
44 / 2.60399 44
45 / 1.93798 46 / 0.90946
78 47 / 1.95204 86
48 / 3.1702
97 49 / 2.91638
50 / 2.79803 59 / 2.64882 61 / 3.11897
63 / 2.57151 64 / 0.63344
65 / 1.47059 15
58 / 2.53596 21
60 / 2.72874 59 62 / 2.44092
98 / 2.45974 102 / 2.89582 101 / 3.00823
100 / 2.72757 103 / 1.23805
104 / 2.65248
105 / 1.56149
24 99 / 2.76855 182 / 2.80251
183 / 2.17394 178 / 0.684621
184 / 2.04844 186 / 2.36739
179 / 1.75353 181 / 1.27559 62
180 / 2.82612 185 / 2.02673
244 / 2.90515 245 / 2.54461246 / 1.4908
248 / 2.19914
242 / 2.10075 243 / 1.81286 247 / 2.70375
334 / 2.85895
335 / 2.74921 345 / 0.969838
96 344 / 3.01517
8 52 / 1.59789
53 / 3.1109 13
51 / 2.96353
54 / 2.6062 94 55 / 2.44249 81 / 2.44372
31
80 / 3.09041
352 / 3.17363
95 353 / 1.69606 9 56 / 2.23711
57 / 2.87413
95 / 2.43097 97 / 1.79475 90 / 0.575857 91 / 1.84882
60 92 / 3.11441 61 93 / 0.745853
74 94 / 1.08674
96 / 1.91462
132 / 2.48921 134 / 1.67584 128 / 2.41661
129 / 2.54924 130 / 0.857403
131 / 1.03428
133 / 1.33929 295 / 2.78515 296 / 3.03355
297 / 2.79379 292 / 2.23004 293 / 2.59059 76 294 / 2.25052
11 22 66 / 2.54335
25 67 / 1.59039
26 68 / 2.55644
35 69 / 1.22183 43
70 / 1.26323
46 71 / 0.65784172 / 1.45548 73 / 1.45696 74 / 2.43597
99 75 / 0.710271
135 / 1.03939 137 / 2.64809
138 / 2.36917139 / 2.85603 141 / 1.99344 142 / 0.632129
28 136 / 3.03397 140 / 1.63894
150 / 2.05658 151 / 2.68427
152 / 1.33314153 / 1.85365 155 / 1.0403 156 / 1.22973
157 / 2.29037 154 / 2.67744
158 / 3.05556
159 / 2.34031 160 / 1.82595
162 / 2.70381 163 / 2.43938
161 / 1.90326 203 / 0.991465
204 / 1.81584 206 / 2.66247 207 / 2.45539 208 / 2.2551
209 / 1.2466 48
205 / 3.06568
236 / 1.90515238 / 2.5944 239 / 2.72018 240 / 3.16484
241 / 0.720351 237 / 2.87679
252 / 0.86256 254 / 0.843728 255 / 2.43018
256 / 1.26966
253 / 2.83326 262 / 0.880856 263 / 3.06774
264 / 1.88647
261 / 1.9719 348 / 2.26962
349 / 2.1111 351 / 3.02766
12
78 / 2.35652 76 / 0.500095 77 / 3.13454
79 / 2.02458
173 / 2.85379 172 / 2.85207
174 / 2.12157 281 / 1.77315
280 / 2.13908 187 / 0.741716
188 / 1.28288
304 / 2.67349 298 / 3.19922
301 / 3.10155
303 / 1.2464
68 299 / 2.68553 300 / 2.93307
302 / 2.7056482 306 / 3.166
308 / 2.46248 305 / 0.378469
307 / 1.95331 341 / 2.70349340 / 1.87485
354 / 2.04535
149 / 1.71292
32 148 / 2.63438
19 117 / 2.4561
118 / 1.55824
115 / 1.40904 50 116 / 0.648045
259 / 2.59804
260 / 2.60371
258 / 2.04883 265 / 2.53475 266 / 1.20584
285 / 3.04711 284 / 1.28966 286 / 0.444549
287 / 2.49879 283 / 0.739388 282 / 2.4463
288 / 2.40494
312 / 2.0258 313 / 1.03139
314 / 2.65429 315 / 1.70638
34 189 / 2.92806
38
190 / 2.07339
40 191 / 2.80145 192 / 2.35256
324 / 2.63064 323 / 2.58729
27 171 / 2.23482 167 / 2.38609
164 / 1.92141 165 / 2.6582
166 / 2.28031 168 / 2.41683169 / 0.955716
170 / 0.967808 199 / 2.18351
197 / 0.88776
198 / 1.51159 200 / 0.583974
201 / 2.87535 202 / 1.31584 220 / 2.98526
219 / 1.26904 221 / 0.308754
222 / 1.8407
226 / 1.33417227 / 3.09905 228 / 1.31378 328 / 1.65874
338 / 1.28689 337 / 1.83675
75 336 / 3.11488 347 / 3.11676 310 / 2.37803
309 / 1.9722 311 / 1.13749
276 / 3.05065 277 / 2.40579
273 / 2.81885 275 / 3.02541 274 / 2.49444
54 278 / 3.02172
279 / 1.17834 343 / 1.31891
342 / 2.50522
−→
[Yemini, 1978]
Protein conformation from NMR data
1
2
3 4
5 9
6
8 10
7
11
−→
[Crippen & Havel 1988]
Clock synchronization: solutions
|x
A− x
B| = 7
|x
A− x
C| = 3
|x
A− x
S| = 5
|x
B− x
C| = 4
x
S= 16:27.
57
3
4
PSfrag
x
Ax
Bx
Cx
Sx
S= 16:27
x
A= 16:22
x B = 16:15
x C = 16:19
x B = 16:15
x C = 16:25
x B = 16:29
x C = 16:19
x B = 16:29
x C = 16:25
x
A= 16:32
x B = 16:25
x C = 16:29
x B = 16:25
x C = 16:35
x B = 16:39
x C = 16:29
x B = 16:39
x C = 16:35
Definition
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Distance Geometry Problem (DGP)
Given: Determine whether ∃:
– a simple graph G = (V, E) – an edge function d : E → R ≥0 – an integer K ∈ N
a realization x : V → R K s.t.
∀{u, v} ∈ E kx u − x v k 2 = d uv
1 2
3
4
1
√ 3 1 1
1
1 1
2
3
4
Let n = |V |
More applications
• Autonomous underwater vehicles [Bahr et al. 2009]
• Statics of rigid structures [Maxwell 1864]
• Matrix completion [Laurent 2009]
• Statistics [Boer 2013]
• Psychology [Kruskal 1964]
[Liberti et al., SIREV 2014]
Complexity primer
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Definitions
• Decision problem: mathematical YES/NO-type question depending on a parameter vector π
• Instance: same as above with π replaced by given values v
• Certificate: proof that a given answer is true
• P: all decision problems solvable in at most p( |π|) steps where p is a polynomial
• NP: all decision problems with |YES certificate| ≤ p(|π|)
where p is a polynomial
Reductions
• P, Q: decision problems
• If ∃ algorithm A which:
1. reformulates instances ¯ P of P into instances ¯ Q of Q 2. has answer( ¯ P ) = YES iff answer(A( ¯ P )) = YES
3. is polytime in the instance size | ¯ P |
then A is a reduction of P to Q
NP-hardness
• Q is NP-hard if every problem in NP reduces to Q
• Q is NP-complete if it is NP-hard and is in NP
Why does it work?
any P in NP polytime reduction
−−−−−−−−−−−−−−−−−−→ Q: how hard?
• Suppose Q easier than P
• Solve P by reducing to Q in polytime and then solve Q
• Then P as easy as Q, against assumption
• ⇒ Q at least as hard as P
So if Q is NP-hard it is as hard as any problem in NP
⇒ Q is as hard as the hardest problem in NP
NP-hardness proofs
Given a new problem Q, take any known NP-hard problem P and reduce it to Q
Why does it work?
P : NP-hard polytime reduction
−−−−−−−−−−−−−−−−−−→ Q: how hard?
• As before: Suppose . . . (etc.) ⇒ Q at least as hard as P
• Since P is NP-hard, it is hardest in NP, and so is Q
⇒ Q is NP-hard
Complexity of the DGP
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
DGP ∈ NP?
• NP: YES/NO problems with polytime-checkable proofs for YES
• DGP is a YES/NO problem
• DGP 1 ∈ NP, since d uv = |x u − x v | ⇒ (d ∈ Q → x ∈ Q)
• Solutions might involve irrational numbers when K > 1
• Some empirical evidence that DGP 6∈ NP [Beeker et al. 2013]
The DGP is NP-hard
Partition
Given a = (a 1 , . . . , a n ) ∈ N n , ∃ I ⊆ {1, . . . , n} s.t. P
i ∈I a i = P
i 6∈I a i ?
• Reduce (NP-hard) Partition to DGP 1
• a −→ cycle C with V (C) = {1, . . . , n}, E(C) = {{1, 2}, . . . , {n, 1}}
• For i < n let d i,i+1 = a i , and d n,n+1 = d n1 = a n
• E.g. for a = (1, 4, 1, 3, 3), get cycle graph:
1
2 3
4
5 1
4
1
3 3
[Saxe, 1979]
Partition is YES ⇒ DGP 1 is YES
• Given: I ⊂ {1, . . . , n} s.t. P
i ∈I a i = P
i 6∈I a i
• Construct: realization x of C in R 1. x 1 = 0 // start
2. induction step: suppose x i known if i ∈ I
let x i+1 = x i + d i,i+1 // go right
else
x i+1 = x i − d i,i+1 // go left
• Correctness proof: by the same induction
but careful when i = n: have to show x n+1 = x 1
I = {1, 2, 3}
0 1 2 3 4 5 6
x
10 1 2 3 4 5 6 x
1x
21 ∈ I
0 1 2 3 4 5 6
x
1x
2x
31 ∈ I 2 ∈ I
0 1 2 3 4 5 6
x
1x
2x
3x
41 ∈ I 2 ∈ I
3 ∈ I
0 1 2 3 4 5 6
x
1x
2x
5x
3x
41 ∈ I 2 ∈ I
3 ∈ I
4 6∈ I
0 1 2 3 4 5 6
x
6= x
1? x
2x
5x
3x
41 ∈ I 2 ∈ I
3 ∈ I 4 6∈ I
5 6∈ I
Partition is YES ⇒ DGP 1 is YES
(1) = X
i ∈I
(x i+1 − x i ) = X
i ∈I
d i,i+1 =
= X
i ∈I
a i = X
i 6∈I
a i =
= X
i 6∈I
d i,i+1 = X
i 6∈I
(x i − x i+1 ) = (2)
(1) = (2) ⇒ X
i ∈I
(x i+1 − x i ) = X
i 6∈I
(x i − x i+1 ) ⇒ X
i ≤n
(x i+1 − x i ) = 0
⇒ (x n+1 − x n ) + (x n − x n −1 ) + · · · + (x 3 − x 2 ) + (x 2 − x 1 ) = 0
⇒ x n+1 = x 1
Partition is NO ⇒ DGP 1 is NO
• By contradiction: suppose DGP 1 is YES, x realization of C
• F = {{u, v} ∈ E(C) | x u ≤ x v }, E(C)rF = {{u, v} ∈ E(C) | x u > x v }
• Trace x 1 , . . . , x n : follow edges in F ( →) and in E(C) r F (←)
0 1 2 3
-1 -2
-3
x
1x
3x
2x
4x
5X
{u,v}∈F
(x
v− x
u) = X
{u,v}6∈F
(x
u− x
v) X
{u,v}∈F
|x
u− x
v| = X
{u,v}6∈F
|x
u− x
v| X
{u,v}∈F
d
uv= X
{u,v}6∈F
d
uv• Let J = {i < n | {i, i + 1} ∈ F } ∪ {n | {n, 1} ∈ F }
⇒ X
i ∈J
a i = X
i 6∈J
a i
• So J solves Partition instance, contradiction
• ⇒ DGP is NP-hard, DGP 1 is NP-complete
Number of solutions
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
With congruences
• (G, K): DGP instance
• ˜ X ⊆ R Kn : set of solutions
• Congruence: composition of translations, rotations, reflections
• C = set of congruences in R K
• x ∼ y means ∃ρ ∈ C (y = ρx):
distances in x are preserved in y through ρ
• ⇒ if | ˜ X | > 0, | ˜ X | = 2 ℵ 0
Modulo congruences
• Congruence is an equivalence relation ∼ on ˜ X (reflexive, symmetric, transitive)
∼
• Partitions ˜ X into equivalence classes
• X = ˜ X/ ∼: sets of representatives of equivalence classes
• Focus on |X| rather than | ˜ X |
Cardinality of X
• infeasible ⇔ |X| = 0
• rigid graph ⇔ |X| < ℵ 0
• globally rigid graph ⇔ |X| = 1
• flexible graph ⇔ |X| = 2 ℵ 0
• |X| = ℵ 0 : impossible by Milnor’s theorem
Milnor’s theorem implies |X| 6= ℵ 0
• System S of polynomial equations of degree 2
∀i ≤ m p i (x 1 , . . . , x nK ) = 0
• Let X be the set of x ∈ R nK satisfying S
• Number of connected components of X is O(3 nK ) [Milnor 1964]
• If |X| is countable then G cannot be flexible
⇒ incongruent elements of X are separate connected components
⇒ by Milnor’s theorem, there’s finitely many of them
Examples
V
1= {1, 2, 3}
E
1= {{u, v} | u < v}
d
1= 1
x
1x
2x
3ρ congruence in R
2⇒ ρx valid realization
|X| = 1
V
2= V
1∪ {4}
E
2= E
1∪ {{1, 4}, {2, 4}}
d
2= 1 ∧ d
14= √ 2
x
1x
2x
3x
4ρ reflects x
4wrt x
1, x
2⇒ ρx valid realization
|X| = 2 ( , )
V
3= V
2E
3= {{u, u + 1}|u ≤ 3} ∪ {1, 4}
d
1= 1
x
1x
2x
3x
4ρ rotates x
2x
3, x
1x
4by θ
⇒ ρx valid realization
|X| is uncountable
( , , , , . . .)
Mathematical optimization formulations
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
System of quadratic constraints
∀{u, v} ∈ E kx u − x v k 2 = d 2 uv
• Around 10 vertices
• Computationally useless
Quadratic objective
x min ∈R nK
X
{u,v}∈E
( kx u − x v k 2 − d 2 uv ) 2
• Globally optimal value zero iff x is a realization of G
• sBB: 10-100 vertices, exact solutions
• heuristics: 100-1000 vertices, poor quality
[Lavor et al., 2006]
Convexity and concavity
max
x ∈R nK
X
{u,v}∈E
kx u − x v k 2
∀{u, v} ∈ E kx u − x v k 2 ≤ d 2 uv
• Convex constraints, concave objective
• Computationally no better than “quadratic objective”
Pointwise reformulation
x max ∈R nK
X
{u,v}∈E,k≤K
θ uvk (x uk − x vk )
∀{u, v} ∈ E kx u − x v k 2 ≤ d 2 uv
• Convex subproblem in stochastic iterative heuristics
“guess θ and solve”
• 100-1000 vertices, good quality
[L. IOS14/MAGO14(slides)]
SDP formulation
X min 0
X
{u,v}∈E
(X uu + X vv − 2X uv )
∀{u, v} ∈ E X uu + X vv − 2X uv ≥ d 2 uv
• Similar to those of Ye, Wolkowicz — works better for proteins
• 100 vertices, good quality
[D’Ambrosio et al., in progress]
Realizing complete graphs
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Cliques
2-clique
1 2
3-clique
1 2
3
4-clique
1 2
3
4
(K + 1)-clique = K-clique ⊕ a vertex
Given a realization of the K-clique, find the position of the vertex
Trilateration
1
2
3
1 1
2
−→
1 2 3
0 1 2
Example: realize triangle on a line
• From kx 3 − x 1 k = 2 and kx 3 − x 2 k = 1 get
x 2 3 − 2x 1 x 3 + x 2 1 = 4 (1) x 2 3 − 2x 2 x 3 + x 2 2 = 1. (2)
• (2) − (1) yields
2x 3 (x 1 − x 2 ) = x 2 1 − x 2 2 − 3
⇒ 2x 3 = 4,
• Hence x 3 = 2
Realizing a (K + 1)-clique in R K −1
• Apply trilateration inductively on K
assume x 1 , . . . , x K ∈ R K −1 known, compute y = x K+1
• K quadratic eqns (∀j ≤ K ky − x j k 2 = d 2 j,K+1 ) in K − 1 vars
kyk
2− 2x
1· y + kx
1k
2= d
21,K+1[1]
... ...
kyk
2− 2x
K· y + kx
Kk
2= d
2K,K+1[K]
• Form system ∀j ≤ K − 1 ([j] − [K])
2(x
1− x
K) · y = kx
1k
2− kx
Kk
2− d
21,K+1+ d
2K,K+1[1] − [K]
... ...
2(x
K−1− x
K) · y = kx
K−1k
2− kx
Kk
2− d
2K−1,K+1+ d
2K,K+1[K − 1] − [K]
• This is a (K − 1) × (K − 1) linear system Ay = b Solve to find y
[Dong, Wu 2002]
“Solve”?
1. What if A is singular?
2. Or: A nonsingular but instance is NO
Singularity: rkA = K − 2
One row x j − x K of A depends on the others
K = 2 triangle in R
1x
1− x
2= 0
x
1= x
2x
3? x
3?
K = 3 4-clique in R
2x
1, x
2, x
3on a line
x
1x
2x
3x
4? x
4?
K = 4 5-clique in R
3x
1, . . . , x
4in a plane x
1x
2x
3x
4x
5? x
5?
Trend continues: rk A = K − 2 ⇒ |X| = 2 (see later)
Singularity: rkA = K − 3
Two rows x j − x k depend on the others
K = 3 4-clique in R
2x
1= x
2= x
3x
1= x
2= x
3x
4?
K = 4 5-clique in R
3x
1, . . . , x
4on a line
x
1x
2x
3x
4x
5?
Trend continues: [Hendrickson, 1992]
Thm. 5.8. If a graph G is connected, flexible and has more than K vertices, X
contains almost always a submanifold diffeomorphic to a circle
Hendrickson’s theorem also applies to non-cliques
Nonsingular matrix A with NO instance
• Infeasible quadratic system ∀j ≤ K kx K+1 − x j k 2 = d 2 j,K+1
• Take differences, get nonsingular A and value for x K+1
• . . . but it’s wrong!
Shit happens!
Every time you solve the linear system Ay = b
check feasibility with quadratic system
Algorithm for realizing complete graphs in R K
• Assume:
(i) G = (V, E) complete (ii) |V | = n ≥ K + 2
(iii) we know x 1 , . . . , x K+1
• Increase K: we know how to realize x K+2 in R K
• Use this inductively for each i ∈ {K + 2, . . . , n}
Algorithm for realizing complete graphs in R K
// realize next vertex iteratively for i ∈ {K + 2, . . . , n} do
// use (K + 1) immediate adjacent predecessors to compute x
iif rkA = K then
x
i= A
−1b // A, b defined as above else
x
i= ∞ // A singular, mark ∞ and exit break
end if
// check that x
iis feasible w.r.t. other distances for {j ∈ N(i) | j < i} do
if kx
i− x
jk 6= d
ijthen
// if not, mark infeasible and exit loop x
i= ∅
break end if end for
if x
i= ∅ then break
end if
end for
return x
Complexity of Alg. 1
• Outer loop: O(n)
• Rank and inverse of A: O(K 3 )
• Inner loop: O(n)
• Get O(n 2 K 3 )
• But in most applications K is fixed
• Get O(n 2 )
But how do we find the realization of the first K + 1 vertices?
Realizing (K + 1)-cliques in R K
• Realizing (K + 1)-cliques in R K −1 yields “flat simplices”
(e.g. triangles on lines)
• Use “natural” embedding dimension R K
• Same reasoning as above:
get system Ay = b where y = x K+1 and A j = 2(x j − x K )
• But now A is (K − 1) × K
• Same as previous case with A singular
Almost square
How can you solve the following system Ay = b:
a 11 a 12 . . . a 1K ... ... . . . ...
a K −1,1 a K −1,2 . . . a K −1,K
y 1 ...
y K
=
b 1 ...
b K −1
where A has one more column than rows and rank K − 1?
Basics and nonbasics
• Since rk A = K − 1, ∃ K − 1 linearly independent columns
• B: set of their indices
• N : index of remaining column
• B: (K − 1) × (K − 1) square matrix of columns in B
• ⇒ B is nonsingular
• Can partition columns as A = (B |N)
Column j corresponds to variable y j
• Variables y B are called basic variables
• Variable y N is called nonbasic variable
The dictionary
(B |N)y = b
⇒ By B + N y N = b
⇒ y B = B −1 b − B −1 N y N
Basics expressed in function of nonbasic
One quadratic equation
• From value of y N , can use dictionary to get y B
• Use one quadratic equation
1. Pick any h ∈ {1, . . . , K − 1}, equation is kx h − yk 2 2 = d 2 hK 2. y = (y B |y N ) ⊤
3. Replace y B with B −1 b − B −1 N y N in equation
4. Solve resulting quadratic equation in one variable y N 5. Get 0,1 or 2 values for y N
6. ⇒ Get 0,1 or 2 positions for x K+1
What if B −1 N is zero?
• y B = B −1 b − B −1 N y N reduces to y B = B −1 b
• Use one quadratic equation
1. Pick any h ∈ {1, . . . , K − 1}, equation is kx h − yk 2 2 = d 2 hK 2. y = (y B |y N ) ⊤
3. Replace y B with B −1 b in equation
4. Solve resulting quadratic equation in one variable y N 5. Get 0,1 or 2 values for y N
6. ⇒ Get 0,1 or 2 positions for x K+1
The difference
• B −1 N 6= 0: y N −−−−−−−→ y dictionary B
• Different values y N + 6= y N − −→ y + , y − with different components
• B −1 N = 0: y B quadratic eqn.
−−−−−−−−−−→ y N
• Even if y N + 6= y N − , K − 1 components of y + , y − are equal
aff(x 1 , . . . , x K −1 ) = {y ∈ R K | y N = 0 }
The case of no solutions
• No realizations exist for this (K + 1)-clique in R K
• DGP instance is NO
The case of one solution
• Assume for simplicity: N = K, h = 1, B −1 N 6= 0 Then kx h − yk 2 = d 2 h,K+1 becomes:
λy
K2− 2µy
K+ ν = 0, where
λ = 1 + X
ℓ,j<K
β
ℓj2a
2jKµ = x
1K+ X
ℓ,j<K
β
ℓja
jK(β
ℓjb
ℓ− x
1ℓ)
ν = X
ℓ,j<K
β
ℓjb
ℓ(β
ℓjb
ℓ− 2x
1ℓ) + kx
1k
2− d
21,K+1• (Exactly one solution for y K ) ⇔ µ 2 = λν, not a tautology
• The set of all (K + 1)-clique DGP instances in R K s.t. µ 2 = λν has Lebesgue measure 0
• Ignore them, they happen with probability ∗ 0!
∗
Assuming continuous distributions over the reals. For floating point number, who knows? . . .
but we’ll ignore these instances anyhow
Discriminant > 0, = 0
4
4 4
1
3 1
2
3
1
2
3
2
The case of two solutions
• K spheres S K 1 −1 , . . . , S K K −1 in R K
centered at x
1, . . . , x
Kwith radii d
1,K+1, . . . , d
K,K+1• x K+1 must be at the intersection of S K 1 −1 , . . . , S K K −1
• If T j S K j −1 6= ∅, then | T j S K j −1 | = 2 in general
• will not mention “probability 0” or “in general” anymore
[Coope 2000]
Mirror images
• Let x + = {x 1 , . . . , x K , x + K+1 }, x − = {x 1 , . . . , x K , x − K+1 } assume dim aff(x 1 , . . . , x K ) = K − 1 (†)
• Theorem
x + , x − are reflections w.r.t. hyperplane defined by x 1 , . . . , x K
• Proof
1. x + , x − congruent by construction
2. ∀i ≤ K x i ∈ x + ∩ x − → x + , x − not translations
3. |x + ∩ x − | = K < |x + | = |x − | → x + , x − not rotations by ( †)
4. ⇒ must be reflections
Algorithm for realizing (K + 1)-cliques in R K
// realize 1 at the origin x 1 = (0, . . . , 0)
// realize next vertex iteratively for ℓ ∈ {2, . . . , K + 1} do
// at most two positions in R ℓ −1 for vertex ℓ S = T
i<ℓ
S ℓ i −2 if S = ∅ then
// warn: infeasible return ∅
end if
// arbitrarily choose one of the two points choose any x ℓ ∈ S
end for
// return feasible realization
return x
Complexity of Alg. 2
• Outer loop: O(K)
• Gaussian elimination on A: O(K 3 )
• Some messing about to obtain x + K+1 , x − K+1 : +O(K 2 )
• Get O(K 4 )
• But in most applications K is fixed
• Get O(1)
Back to complete graphs
• Alg. 2: realize 1, . . . , K + 1 in R K : O(1)
• Alg. 1: Realize K + 2, . . . , n: O(n 2 )
• ⇒ O(n 2 )
• What about |X|?
– Alg. 1 is deterministic: one solution from x 1 , . . . , x K+1 – Alg. 2 is stochastic: pick one of two values K times
⇒ |X| = 2 K
Let’s look at sparser graphs
K-laterative graphs
• In Alg. 1 we only need each v > K + 1 to have K + 1 adjacent predecessors in order to find a unique solution for x v
• Determination of x v from K+1 adjacent predecessors: K-lateration
• K-laterative graph:
(i) has a vertex order ensuring this property
(ii) the initial K + 1 vertices induce a (K + 1)-clique the order is called K-lateration order
• Alg. 1 realizes all K-laterative graphs
The DGP restricted to K-laterative graphs in R K is tractable
[Eren et al. 2004]
The story so far
• Lots of nice applications
• DGP is NP-hard
• May have 0, 1, finitely many or 2 ℵ 0 solutions modulo congruences
• Continuous optimization techniques don’t scale well
• Using K + 1 adjacent predecessors, realize K-laterative graphs in R K in polytime
• Do we need K + 1 adjacent predecessors, or can we do with
less?
The Branch-and-Prune algorithm
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP
10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
Fewer adjacent predecessors
• Alg. 2 only needs K adjacent predecessor
• Extend to n vertices: (K − 1)-laterative graphs
• Can we realize (K − 1)-laterative graphs in R K ?
• A small case: graph consisting of two K + 1 cliques
1 1 1
5 5
5 2 2
2
3 3
3 4 4
4
Take a closer look. . .
5
5 2 2
3 3
4 4
• Realization of a K + 1 clique in R K knowing x 1 , . . . , x K
• We know how to do that!
• Consistent with 2 solutions for x 5 , reflected across plane through
x 2 , x 3 , x 4
Discretization and pruning edges
• (K − 1)-laterative graph G = (V, E):
∀v > K ∃U v ⊂ V (|U v | = K ∧ ∀u ∈ U v (u < v) ∧ {u, v} ∈ E)
• Discretization edges:
E D = {{u, v} ∈ E | u, v ≤ K}
| {z }
initial clique
∪ {{u, v} ∈ E | v > K ∧ u ∈ U v }
| {z }
vertex order
• Pruning edges E P = E r E D
10
1
2
3 4
5 6
7 8
9
11
12 13
14 15
16
17
18 19
Role of discretization edges
Missing discretization edge
⇒ non-rigid structure
⇒ X uncountable
Else: X finite
Role of pruning edges
No pruning edges: 8 incongruent realizations in R 2
1
2
3
4
5 1
2
3
4
5 1
2
3 4
5
1
2
3 4
5
1
2 3
4 5
1
2 3
4
5
1
2 3
4 5
1
2 3
4
5
Role of pruning edges
Pruning edge {1, 4}: only 4 realizations remain valid
1
2
3
4
5 1
2
3
4
5 1
2
3 4
5
1
2
3 4
5
1
2 3
4 5
1
2 3
4
5
1
2 3
4 5
1
2 3
4
5
Motivation
Protein backbones
• Total order < on V
• Covalent bond distances: {u − 1, u} ∈ E
• Covalent bond angles: {u − 2, u} ∈ E
• NMR experiments: {u − 3, u} ∈ E
(and other edges {u, v} with v − u > 3)
Generalize “3” to K
[Lavor et al., COAP 2012]
K DMDGP graphs
K = 2 K = 3
1
2
3
4
5
Generalization of protein backbone order:
v > K is adjacent to K immediate predecessors v − 1, . . . , v − K
K DMDGP: Discretizable Molecular Distance Geometry Problem
The Branch-and-Prune (BP) algorithm
BP(v, ¯ x, X):
1. Given v > K, realization ¯ x = (x 1 , . . . , x v −1 ) 2. Compute S = T
u ∈U v
S K u −1
3. For each x v ∈ S s.t. ∀{u, v} ∈ E P (u < v → kx u − x v k = d uv ) (a) let x = (¯ x, x v )
(b) if v = n add x to X, else call BP(v + 1, x, X)
• Recursive: starts with BP(K + 1, (x
1, . . . , x
K), ∅)
• All realizations in X are incongruent
∗• Can be easily modified to find only p solutions for given p
• Applies to all (K − 1)-laterative graphs in R
K• Specialize to KDMDGP graph by setting U
v= {v − 1, . . . , v − K}
∗
with probability 1, and aside from one reflection at v = K + 1
[L. et al. ITOR 2008]
Complexity of BP
• Most operations are O(K h ) for some fixed h ⇒ O(1)
• Distance check at Step 3: O(n)
• Recursion on at most 2 branches at each call: binary tree
• Only recurse when v > K, v < n: 2 n −K nodes
• Overall O(n2 n −K ) = O(2 n )
Worst-case exponential behaviour
Hardness of K DMDGP
• The K DMDGP is NP-hard for each K
– every DGP instance is also DMDGP if K = 1
– reduction from Partition can be extended to any K
• (K − 1)-lateration graphs are NP-hard by inclusion
• No polytime algorithm unless P=NP
Trilaterative graphs in R K are complexitywise borderline at K
Correctness
Thm.
When BP terminates, X contains every incongruent realization of G
Proof.
• Let ¯ y be any realization of G
• Since G has an initial K-clique, can rotate/translate/reflect ¯ y to y[K] = x[K] for all x ∈ X
• BP exhaustively constructs every extension of x[K] which is feasible with all distances, so y ∈ X
for a realization y, y[h] = (y
1, . . . , y
h) is the initial segment of y
Two examples
Empirical observations
• Fast: up to 10k vertices in a few seconds on 2010 hardware
• Precise: errors in range O(10 −9 )-O(10 −12 )
• Number of solutions always a power of 2:
obvious if E P = ∅, but otherwise mysterious
• Linear-time behaviour on proteins:
this really shouldn’t happen
Symmetry in the K DMDGP
1. Applications 2. Definition
3. Complexity primer
4. Complexity of the DGP 5. Number of solutions
6. Mathematical optimization formulations 7. Realizing complete graphs
8. The Branch-and-Prune algorithm 9. Symmetry in the K DMDGP 10. Tractability of protein instances 11. Finding vertex orders
12. Approximate realizations
[L. et al. DAM 2014]
Partial reflections
• For each v > K, let
g v (x) = (x 1 , . . . , x v −1 , R v x (x v ), . . . , R x v (x n )) be the partial reflection of x w.r.t. v
x
v−3x
v−2x
v−1• Note: the g v ’s are idempotent operators
• G D = (V, E D ): subgraph of G given by discretization edges
• ∀v > K reflection R x v gives a binary choice in general ∗
• X D ⊂ R nK contains 2 n −K incongruent realizations of G D
∗
subsequent results hold “with probability 1”
Discretization group
• G D = hg v | v > Ki: the discretization group of G w.r.t. K subgroup of a Cartesian product of reflection groups
• An element g ∈ G D has the form N
v>K g v a v , where a v ∈ {0, 1}
• Action of G D on X D : g(x) = g a K+1
K+1 ◦ · · · ◦ g n a n (x)
Commutativity of partial reflections
Lemma A G D is Abelian
Proof Assume K < u < v. Then
g u g v (x) = g u (x 1 , . . . , x v −1 , R x v (x v ), . . . , R v x (x n ))
= (x 1 . . . , x u −1 , R u g
v (x) (x u ), . . . , R u g
v (x) R x v (x v ), . . . , R u g
v (x) R x v (x n ))
= (x 1 . . . , x u −1 , R u x (x u ), . . . , R v g u (x) R u x (x v ) , . . . , R v g u (x) R x u (x n ))
= g v (x 1 , . . . , x u −1 , R x u (x u ), . . . , R u x (x n ))
= g v g u (x)
where equality of these terms holds by a Technical Lemma (next slide)
[L. et al. 2013]
Commutativity of partial reflections
Technical Lemma
(Proof sketch for K = 2) Let y ⊥Aff(x v −1 , . . . , x v −K ) and ρ y = R v x
ρ z
ρ y t
O
z y ρ z y
ρ y t
ρ z t ρ ρ z y
ρ z ρ y t = ρ ρ z y ρ z t
One realization generates all others
Lemma B The action of G D on X D is transitive
K = 2 x
y g
g g
54
3