EUROGRAPHICS 2010 / T. Akenine-Möller and M. Zwicker
(Guest Editors)

Volume 29 (2010), Number 2

Puzzle-like Collage

Stas Goferman Ayellet Tal Lihi Zelnik-Manor

Technion – Israel Institute of Technology

Abstract
Collages have been a common form of artistic expression since their first appearance in China around 200 BC.
Recently, with the advance of digital cameras and digital image editing tools, collages have gained popularity also
as a summarization tool. This paper proposes an approach for automating collage construction, which is based
on assembling regions of interest of arbitrary shape in a puzzle-like manner. We show that this approach produces
collages that are informative, compact, and eye-pleasing. This is obtained by following artistic principles and
assembling the extracted cutouts such that their shapes complete each other.

1. Introduction

A collage is a work of the visual arts, made from an as-
semblage of different forms, thus creating a new whole, of-
ten having a purposeful incongruity. This paper focuses on
photo-collages, which assemble a collection of photographs
by cutting and joining them together. A photo-collage can
be used for art [Ade89] as well as for summarizing a photo
collection, such as a news event, a family occasion, or a con-
cept (e.g., the Beatles’ album cover of “Sgt. Pepper’s Lonely
Hearts Club Band”).

Manually creating a collage is a difficult and time-
consuming task, since the pieces should be nicely cut and
matched. Therefore, automation could be a welcomed tool.
Prior work on automating collage creation extracts rectan-
gular salient regions and assembles them in various fash-
ions [RBHB06,WSQ∗06,BCG∗07,Pic]. This produces beau-
tiful collages, however, since the extracted regions are rect-
angular, the variety of possible compositions is limited and
uninteresting regions are included. In [RBHB06] the rectan-
gular shapes are modified by graph cuts and alpha blending.
This creates nicer transitions between images, however, the
uninteresting regions (typically from the background) can-
not be eliminated.

This approach to assemblage, while informative, does not
match in spirit the way in which many artists construct col-
lages. Artists extract the expressive regions of interest, which
can be of arbitrary shape, as noted by Henri Matisse: “The
paper cutouts allow me to draw with color”. This approach
is expressed in numerous artistic collages, for instance see

the pioneering works of “Just What Is It that Makes Today’s
Homes So Different, So Appealing?” by Richard Hamil-
ton and the “Dada Siegt” by Raoul Hausmann. The critical
boundaries of the important information are considered sig-
nificant and are thus maintained. This artistic form of collage
has gained popularity also among amateurs as can be seen
by the hundreds of collage groups on Flickr and hundreds of
thousands users in Polyvore [Pol], a web-based application
where collages are manually assembled by users.

We propose a method for automating collage creation, which
is inspired by artistic collage work and glues mostly the pur-
poseful cutouts (see Figure 1). The fundamental difference
between prior work and ours is that we compose a puzzle-
like collage of arbitrary shaped images rather than rectangu-
lar ones. A user-study shows that this creates collages that
are often considered more appealing. Moreover, this allows
us to generate space-efficient collages, which are useful for
summarization of image data sets.

The main contribution of this work is a complete system
for image collage whose key idea is to assemble arbitrary-
shaped cutouts of interesting regions. This requires solv-
ing two challenges: extraction of non-rectangular regions-
of-interest, and assembly of arbitrary shapes.

For the former we propose an algorithm that extracts non-
rectangular regions that coincide with the meaningful infor-
mation – object boundaries and salient background features
(Section 4). Image segmentation is a well studied and dif-
ficult problem. Here, we do not claim to solve it but rather
we propose an effective solution for our specific application.

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.
Published by Blackwell Publishing, 9600 Garsington Road, Oxford OX4 2DQ, UK and
350 Main Street, Malden, MA 02148, USA.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

(a) Our puzzle-like collage (b) AutoCollage [RBHB06]

Figure 1: Dark zoo collages created from 9 images. Note how the assembly of the animals in our collage is like a puzzle. This
is both more aesthetic as our user-study shows and uses the canvas more efficiently (hence most of the animals appear larger).

Unlike segmentation, the region-of-interest may include sev-
eral objects and parts of the background to convey the con-
text. Moreover, an important observation is that a perfectly
accurate extraction is not compulsory due to the following
three reasons. First, even when some ROIs are not perfect,
their overall shape is typically close to the correct one and a
placement can be computed. Second, the overlaps between
the ROIs in the collage often conceal the inaccuracies. Fi-
nally, any remaining visible imperfections can then be man-
ually corrected using, for example, Soft Scissors [WAC07].

For the assembly we propose an algorithm that composes
the ROIs while finding a good balance between compact-
ness and informativeness (Section 5). Since the shapes are
non-rectangular, this resembles puzzle-solving. The shapes,
however, cannot perfectly match, as assumed in a standard
puzzle, and some overlap is allowed.

2. Related work

Techniques of collage were first used at the time of the in-
vention of paper in China around 200 BC. Since then, these
techniques have been used in various forms in other cultures.
In spite of its early creation, the term “collage” was coined
at the beginning of the 20th century, by both Georges Braque
and Pablo Picasso. These were the times when the use of col-
lages made a dramatic appearance among oil paintings and
became a distinctive part of modern art.

Methods for automatic creation of photo-collages were pro-
posed only recently. For example, [ADA∗04] make collages
of aligned nearly-identical images, while [Atk04] orga-
nize whole pictures on a page and [GC04] assemble faces.
The works most related to ours are [RBHB06, WSQ∗06,

BCG∗07, Pic]. [RBHB06] define a method, AutoCollage,
for constructing a seamless collage from an image set. In
this work, rectangular salient image regions are stitched to-
gether seamlessly using edge-sensitive blending. In picture
collage [WSQ∗06], a 2D spatial arrangement of rectangu-
lar images is optimized in order to maximize the visibility
of the salient regions. [BCG∗07] propose an improvement
to picture collage, by exploiting semantic and high-level in-
formation in saliency computation and a genetic algorithm
to positioning. Google’s Picasa [Pic] features automatic col-
lage generation of whole (or cropped) images, supporting
different styles of compositions. We propose to use ROI im-
ages of arbitrary shapes, rather than rectangular ones.

Most previous work on ROI extraction was limited to de-
tecting rectangular ROIs by finding a rectangular bound-
ing box of the salient pixels [RKKB05, RBHB06, WSQ∗06,
BCG∗07]. Usually such ROIs include many non-salient pix-
els. In [HXM∗04], more space-efficient ROIs are created by
using the convex hull of the salient points. This reduces
the number of non-salient pixels, but is still not accurate
enough. Methods for image segmentation [RKB04,LSTS04,
LRAL08] aim at accurately extracting foreground pixels by
cutting along image contrast boundaries. In images where
the region-of-interest corresponds to a foreground region,
such segmentation-based methods could suffice for ROI ex-
traction. However, in many images parts of the background
are essential for capturing the message they convey. Our ap-
proach combines saliency and contrast information, thus ad-
dressing both accuracy and informativeness.

Constructing a collage from image fragments of irregular
shapes resembles assembling a 2D puzzle. The jigsaw puz-

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

(a) Input
(b) ROIs

(c) Final collage

Figure 2: Our collage construction framework applied to an image collection describing highlights of the 2008 Olympic games.

zle problem is often approached in two stages (e.g. [RB80,
KK01, SKKC00]). First, local shape matching finds pairs of
fragments that fit perfectly. Then, a global solution is ob-
tained by resolving ambiguities. Collages differ from puz-
zles in that the fragments typically do not match perfectly,
and they are allowed to overlap each other. Therefore, our
assembly algorithm aims at finding an informative, yet com-
pact composition, rather than a unique and exact assembly.

3. Framework

Given a collection of images, we wish to construct an infor-
mative, compact, and visually appealing collage. Figure 2
illustrates the steps of our algorithm on a set of images de-
scribing events from the 2008 Olympic games. Given the im-
ages in Figures 2(a), we first extract the regions-of-interest
(ROI), as shown in Figures 2(b). Note the non-rectangular
ROIs that accurately cut out the salient regions. Figure 2(c)
illustrates the final assembly, in which the runner and the for-
ward spinning diver fit in the crevices created by the back-
flipping diver’s arched back and spread arms, respectively.

To construct informative collages, most of the ROIs should
remain visible after assembly. For many images, the ROI
consists of foreground pixels alone. This is typical for im-
ages with a shallow depth-of-field or images having a blank
background (see Figure 3(top)). For images like these, stan-
dard image segmentation could suffice. However, for many
images, both professional and non-professional, the infor-
mative regions consist of both foreground as well as back-
ground pixels that provide the context. For instance, in Fig-
ure 3(bottom), a meaningful ROI should include the car, the
fire and part of the surrounding, and the kids as well as part
of the bed they are sitting on. The proposed ROI extraction
algorithm incorporates saliency and edge-based information,
thus it can handle both types of images (Figure 3(d,g)).

To generate a collage the user can choose to click a button

(a) Input (b) Polyvore (c) GrabCut (d) Ours

(e) Input (f) GrabCut (g) Ours

Figure 3: ROI extraction. GrabCut is a semi-automatic seg-
mentation algorithm, capable of extracting ROIs in images
with a simple background like those on the top. It is less
successful for complex images like those on the bottom. The
result by Polyvore is fully automatic but less successful even
for simple images. Our method can handle both cases.

and get a fully automatic composition. Alternatively, the user
can do the following to better control the final result:

• Set the desired aspect ratio of the generated collage.
• Tune a parameter controlling the amount of allowed over-

lap between images.

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

• Attach importance weights to images, which will influ-
ence their size in the final collage.

• Manually fix the position of some images.
• "Glue" images together to be treated as one.
When the user selects the automatic mode, the images are
randomly scaled (within a limited range) and default overlap
and aspect ratio are applied. The assembly algorithm gener-
ates a puzzle-like collage by fitting the pieces. Most of the
collages in this paper were created fully automatically, ex-
cept for setting the aspect ratio. We mention explicitly when
user interaction was applied.

4. Region-Of-Interest (ROI) extraction

To obtain informative collages, our goal is to extract from
each image a region of interest (ROI) of an arbitrary shape,
which takes a binary decision at each pixel and labels it as
either "interesting" or "not interesting". To achieve this goal
we define the following desired properties of an ROI:

1. The ROI should enclose most of the salient pixels.

2. The boundary of the ROI should coincide well with the
natural edges of the image.

3. The boundary curve should enable visually appealing
compositions (e.g., jagged boundaries are undesired).

These requirements emphasize the difference between ROI
extraction and foreground-background segmentation. Image
segmentation aims at accurately extracting the foreground
objects, satisfying only Requirement 2. Conversely, ROIs
should include most of the informative regions (Require-
ment 1) and enable pretty compositions later on (Require-
ment 3). In other words, ROIs should include pieces of the
background, when these are helpful for understanding the
picture. Note that not all the background should be included
– only enough of it for conveying the context.

Our algorithm consists of four steps, which incorporate
saliency and edge information, in order to comply with the
requirements, as illustrated in Figure 4. First, a saliency map
is computed and an initial curve is constructed from it. This
curve is propagated towards the natural image edges and
smoothed based on the saliency.

Saliency computation: We first compute the saliency of
each image (Figure 4(b)). Though saliency can be com-
puted using a variety of previously proposed algorithms
[IK01, HZ07, LSZ∗07], we apply our own implementation.
For each pixel we consider the patch surrounding it and find
the patches in the image that are most similar to it. Salient
pixels are those that do not have similar patches elsewhere
in the image. The result of this stage is a saliency map where
each pixel i is assigned a saliency value Si, which indicates
the dissimilarity of the pixel to its K nearest neighbors:

Si = 1−ex p
{
−

1
K

K

∑
k=1

d
(

pi, qk
)}

(a) Input (b) Saliency map

(c) Initial ROI (d) After curve (e) Final ROI
propagation

Figure 4: The 4 steps of our ROI extraction algorithm. The
initial ROI contains the salient pixels, but is not visually ap-
pealing. After curve propagation, the ROI boundary coin-
cides with the image edges, however, it is noticeably jagged.
The final ROI is smooth in regions where the saliency is low
(e.g., above the head), but coincides well with more salient
image edges (e.g., knees and arms).

Here d
(

pi, qk
)

is the appearance difference between patches
pi and qk . This is enhanced by a standard face detec-
tion [VJ01].

Curve initialization: Curve initialization aims at satisfying
the first requirement. Initialized by an empty ROI, pixels are
added to it sequentially, starting from the highest saliency
values to the lowest. This continues until the total saliency of
the included pixels reaches 90% of the total image saliency.
This results in a binary mask of one or more connected com-
ponents of "interesting" regions. The boundary of the largest
connected component serves as an initial contour C(t = 0)
for the next step (Figure 4(c)).

Curve propagation: To satisfy Requirement 2, the initial
curve is propagated towards the image edges, while keep-
ing in mind the saliency map. We base our approach on a
geodesic active contours model [CKS97] and modify it to
incorporate saliency. Here the curve C is represented implic-
itly via a function φ , by C(t) = {(x, y)|φ (t, x, y) = 0}, and
the evolution of the curve is given by the zero-level curve at
time t of the function φ (t, x, y):

∂ φ

∂ t
= λ|∇φ|G(φ ) + γ|∇φ|H(φ ),

G(φ ) = div
(

g(|∇u0|)
∇φ

|∇φ|

)
+ ν g(|∇u0|),

H(φ ) = hκ(φ ). (1)

In Equation (1) u0 is the lightness channel, g(.) is an edge
indicator function, h is a saliency indicator function and ν is
a positive constant pushing the curve inwards. The curvature

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

Figure 5: Examples of our ROIs. Our approach can handle single objects as well as complex scenes. Irrelevant background is
excluded while parts of informative background are kept.

of the level-set function is defined by κ = div(∇φ /|∇φ|).
Note that setting γ = 0 results in a geodesic model, where the
zero-level curve moves in the normal direction with speed G
and stops on the desired boundary, where g vanishes. Setting
λ = 0, we get a saliency-based evolution in the normal direc-
tion with speed H, where the curve stops on salient regions.

The importance of our saliency term H is twofold. First, it
accelerates the curve evolution in non-salient regions. This
is especially important when the curve encounters locally
strong, but non-salient edges, which occur at many back-
ground non-salient pixels. Second, it slows down the evo-
lution in salient regions.

We set g(∇u0) = 1/(1 + |∇Gσ ∗u0|2), where Gσ ∗u0 is a
smoothed version of u0. Gσ is the Gaussian kernel with
standard deviation 1.5. The saliency indicator function is se-
lected by h = ex p{−Ŝ2/σ 2s }, where Ŝ is the saliency map
and σ 2s is its variance.

The level-set evolution is implemented using a numerical
scheme of [LXGF05], which eliminates the need of re-
initialization of φ , where the zero-level curve at t = 0 is
the curve from the previous step. In our implementation, we
used λ = 3, γ = 5, ν = 1.

The evolution continues until either a maximal number of
steps (1000) is reached (i.e., we cannot go too far from the
initial curve) or the sum of the image saliency values in-
side the curve drops below a certain level (50% of the total
saliency).

Curve visual enhancement: As can be seen in Figure 4(d),
the propagated curve bounds the region of interest. However,
the curve itself might be still jagged, since it was pushed to
pass through the image edges (which may be non-smooth).
To satisfy the third requirement on ROIs we next further
smooth the curve in accordance with the saliency, so that
later on it can be nicely matched to other shapes in the col-
lage (see the result in Figure 4(e)).

This is done by applying the level set-evolution with saliency
term only (λ = 0), and enforcing the same stopping crite-
ria. Note that in this formulation, the curve’s curvature is
smoothed while its length barely changes since its propaga-

tion is stopped by high-saliency values. Moreover, the evo-
lution is stronger where the saliency is low, as intuitively it
should be.

Results: Figure 5 presents several extracted ROIs. Our ROIs
are of smooth irregular shapes that bound nicely the re-
gion of interest. When the boundary between foreground and
background is clear, our approach cuts out the foreground
accurately, as can be seen in the pictures of the blue-dressed
Italian and the excited American speaker. In more complex
images, such as the cheering kids and the weight-lifter, the
ROI includes parts of the background, as expected.

5. Collage assembly

The last step of our framework is the assembly of the ROIs.
Our assembly algorithm expects as input a set of n images,
together with their ROIs and saliency maps. A set of addi-
tional, however optional, parameters such as the desired as-
pect ratio of the final collage, the amount of allowed over-
lap, preferences of ROIs (importance weights between 0 and
1), and constraints on their location lets the user control the
creativeness of the final result. We next describe the fully
automatic solution and how user input is incorporated.

Our goal is to generate collages that are: (1) Informative:
The collage should include as much as possible from every
image’s region of interest. (2) Compact: The collage should
utilize the canvas efficiently, subject to a desired aspect ratio.

To achieve this, we first formulate a cost function whose
minimization would produce collages adhering to these
properties. We then propose an algorithm that aims at mini-
mizing the cost function in two steps: initial placement and
local refinement. In the first step, the ROIs are scaled ran-
domly (or according to importance weights when these were
provided) and then efficiently packed in a puzzle-like man-
ner, subject to a desired aspect ratio. To further improve col-
lage properties, we apply local refinements in the second
step, by allowing the following ROI transformations: scal-
ing, shifting, rotating, and changing layering position.

Composition cost: Given a placement of ROIs and their lay-
ering, we define the composition cost C as a combination of

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

informativeness Cin f o and compactness Ccompact cost func-
tions subject to a desired aspect ratio δAR :

C = (α ·Cin f o + (1−α)·Ccompact )δAR . (2)

where 0 < α < 1 controls the amount of overlap.

Overlap: The two cost terms Cin f o and Ccompact aim at oppo-
site goals – a highly informative layout is not compact and
vice versa. The tradeoff between them is controlled by the
parameter α . Setting α to a low value could lead to a cre-
ative composition with high overlap, while informative im-
age parts might be occluded. High values of α could elimi-
nate occlusions at the cost of losing compactness. A typical
value of α = 0.5 is a good compromise between the two cost
terms.

Aspect ratio: Let AR be a user-desired aspect ratio and ar be
a current aspect ratio. To penalize placements of an unde-
sired aspect ratio we define the penalty term δAR as:

δAR = ex p{−
1

2σ 2AR
(AR−ar)2}, (3)

where we set σAR = 0.2.

Informativeness cost: Given a placement of k ROIs and their
layering, we define the informativeness cost as the sum of
occluded salient regions normalized by the total amount of
all salient regions. Let S̃i be the saliency map of image i
normalized by its ROI area, Ri be its ROI, and Rocci be the
occluded part of Ri. Then,

Cin f o =
∑⋃k

i=1 R
occ
i

S̃i

∑⋃k
i=1 Ri

S̃i
. (4)

Compactness cost: The compactness of a given configura-
tion of ROIs can be measured by the amount of empty space
captured in its axis-aligned bounding rectangle. Let bound
be the area of the bounding rectangle. We wish to minimize

Crect = 1−
⋃k

i=1 Ri

bound(
⋃k

i=1 Ri)
. (5)

This guarantees compact layouts, although it might be insuf-
ficient for matching protrusions and depressions of the ROIs
(Figure 6(a)). Therefore, we also minimize the empty space
between the ROIs. Let conv be the area of the convex hull of
the union of ROIs, we minimize the empty space in conv:

Cconv = 1−
⋃k

i=1 Ri

conv(
⋃k

i=1 Ri)
. (6)

Minimizing each term alone does not suffice, since min-
imizing only Cconv could result in diagonal or elongated
shapes which are unappealing (see Figure 6(e)) and mini-
mizing only Crect might create rectangular, yet less puzzle-
like, compositions (Figure 6(d)) . Therefore, we incorporate
both terms and define the compactness cost as:

Ccompact = Crect ·Cωcconv, (7)

(a) First ROI (b) BB & CH (c) Second ROI

(d) Minimizing Crect (e) Minimizing Cconv (f) Combined

Figure 6: The ROI of the current assembly (a) needs to be
combined with a new ROI (c). The bounding rectangle of
the current assembly is marked in gray and its convex hull
marked in yellow (b). Minimizing only Crect (d) can place
the strawberry at the top left corner since it does not change
the bounding rectangle. But, this increases Cconv (in red).
Minimizing only Cconv (e) places the strawberry at the top
right. The increase in Cconv (red) is small but the bounding
rectangle is now much larger (marked in green). The combi-
nation (f), which minimized Ccompact , places the strawberry
at the bottom right. This way the bounding rectangle it not
increased at all and Cconv is increased only a bit and signifi-
cantly less than when minimizing only Crect .

where in our experiments we set ωc = 0.5.

Initial Placement: In search for an assemblage algorithm
that minimizes the defined cost we turned to the puzzle-
solving literature. Their solutions, however, were found in-
adequate, since in our case the shapes do not perfectly match
as they do in puzzles. A more fruitful avenue to follow
was to consider the fundamental problem of 2D bin pack-
ing. Our assembly problem can be viewed as a generaliza-
tion of 2D bin packing, where in our case the parts are
not constrained to be rectangles and overlaps are allowed.
2D bin packing has been shown to be NP-hard [Sle80],
nevertheless, there exists a variety of approximation strate-
gies [LMM02]. We draw inspiration from the general strat-
egy proposed by [Joh73], in which a best-first decreasing
approach is proposed.

The best-first strategy is well suited to our task both since it
provides a near-optimal solution to the bin-packing problem,
and due to its visual implications. Inspired by artistic collage
principles we define the “best” ROI to place as the largest
(i.e. the most important) one. Placing ROIs in a decreasing
order of importance (and size) usually results in nicer com-
positions, where larger ROIs are condensed together near the
center of the image. To realize this approach, the ROIs are
first scaled, then placed in a “best-fit” manner, and finally
layered. We elaborate below.

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

(a) Before (b) After

Figure 7: Local refinement

First, the ROIs are scaled. In spirit of art, collages created
from images of different sizes are usually more visually in-
teresting. When no input is provided by the user we ran-
domly scale the ROIs such that the smallest ROI is at most
4 times smaller than the largest one (we never enlarge an
ROI to maintain high quality). The creative user can manu-
ally set importance weights, wi, to preferred (or all) images,
which serve as scaling weights instead of the random scal-
ing. For example, to create desired compositions we man-
ually assigned high weights to the runner in Figure 2, the
theme images in the fashion collages of Figure 8 and the
portrait of Obama and 2008 logo in Figure 11. The rest of
the images in these collages were scaled randomly.

Then, the ROIs are sorted by their area. After placing the
ROIs the user positioned manually, if any, we start from the
largest and position the ROIs one-by-one. At each iteration
we consider the ROIs already placed as a single ROI. This
reduces the problem to placing one new ROI with respect
to one other. We compute the composition cost function of
Eq. (2), for a subset of possible placements and select the one
yielding minimal cost. We limit the computation of compo-
sition cost function to placements satisfying the following
three conditions: (1) The intersection between the ROIs is
not empty. (2) The aspect ratio of the composition must be
within +/-0.5 from the desired one. (3) Placements are cho-
sen on a sparse grid (10x10 pixels).

Finally, recalling that the merged ROI is a union of the al-
ready placed ROIs, there are multiple layering options for
the new ROI. After a placement has been selected, we layer
the new ROI such that the informativeness cost function,
Cin f o is minimized.

Local refinement: We further refine the assembly via
a random sampling process, which improves the collage
compactness, informativeness and its visual appearance.
Our method is inspired by a Markov chain Monte Carlo
(MCMC) methodology and is based on a hit-and-run algo-
rithm [SC91]. We adopt an effective random sampling pro-
cess that reduces the composition cost function by applying
random transformations (scale, rotation and translation) to
the ROIs and by changing their layering positions.

At each time step we choose uniformly one of the ROIs to

Figure 8: Our view of fashion. Different random initializa-
tions produce different collages.

Figure 9: A collage of the wedding of Kate and Ian (Thanks
to Rosie Parsons for letting us use these beautiful pictures).
A few manual corrections to ROIs were applied at the final
phase. That took less than 10 minutes.

be translated by −→r pixels, rotated by θ degrees, and scaled
by a factor s. These are sampled from normal distributions:
r ∼N(0, 30), θ ∼N(0, 5), s∼N(1, 0.2). With probability 0.5
we also change the ROI’s layer by uniformly sampling a
new layer. We consider only samples where s ∈ [0.5, 2] and
θ ∈ [−30, 30] and accept only those that reduce the compo-
sition cost function Eq. (2). The sampling is stopped when a
cumulative acceptance rate of 5% is reached. Figure 7 illus-
trates the result.

6. Results

Our proposed algorithm deals well with a broad class of im-
ages, taken by professionals and amateurs alike. Figures 1–
2,8–13 illustrate some of our results. Most of the collages
were created using the default parameters. The user needs
only give the requested aspect ratio for each collage. Since
the images are randomly scaled, we created a few different
collages for each set of images (as illustrated in Figure 8) and

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

Figure 10: Parents love their children. The collages of Shelly (left) and the boys (right) are great gifts for their parents. Our
survey (Figure 14) shows most people prefer these over the corresponding AutoCollage results.

Figure 11: A collage of 47 images, created by our system, summarizing news highlights of 2008.

selected the one we liked the most. Figure 9 shows another
collage of images taken by a professional photographer.

Figure 10 shows two collages of everyday images. They con-
sist of photographs of children taken mostly by their parents.
These collages represent the sort of appealing summaries
people would like to create of their home photos. To make
the collage more interesting, we added to one of them im-
ages of Bratz dolls.

Figure 11 shows a collage that summarizes some of the
important events of the year 2008 (the US elections, the
Olympic games, the economic crisis, some disasters, etc.).
This collage is comprised mostly of images of complex
scenes with busy, yet informative backgrounds. Despite its
unusual busyness (47 images, 47 events), most of the col-
lage pixels are informative and the composition stays visu-
ally pleasing and interesting. Our user study (details reported
below) showed that about 70% of people preferred this col-
lage to the alternative. In this collage, we manually selected

the Obama images as the most important ones, together with
the 2008 text-image.

Limitations: Our system is not perfect, due to its automatic
nature. The assembly algorithm always produces an accept-
able composition, however at times the user might find the
result not pleasing enough, as some of our evaluators found
Figure 12(a). This is easily solved by generating a different
collage from the same images (Figure 12(b)). The random-
ness in the system allows to easily create a number of col-
lages, letting the user select the preferred one (this approach
was adopted from AutoCollage and Picasa).

Running time: Our algorithm is implemented in MATLAB
and runs on a 2.4GHz, dual core desktop computer. ROI ex-
traction takes about 90 seconds per image (500 pixels larger
dimension). The initial assemblage takes about 5 seconds to
place a pair of images and the refinement (if applied) takes a
few minutes.

User study: Since beauty is subjective we evaluated our col-

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

(a) a less favored collage (b) a nicer collage generated with a single mouse click

Figure 12: Limitations

(a) Our collage (b) AutoCollage

Figure 13: (a) Our idea of an “advertisement” for the Olympic games. (b) AutoCollage produces a less compact collage of the
same images. The survey of Figure 14 shows ∼50% preference to our collage (a) over the AutoCollage result (b).

lages via a user study. We published on the web 10 pairs of
collages, one created by our system and the other by Auto-
Collage [RBHB06] (in a random order). For each pair of col-
lages, the participants had to mark the one they liked more
or select “can’t decide”.

Figure 13 shows one such example – a collage of the 2008
Olympic games. It demonstrates how the shapes complete
each other, just like a puzzle. Conversely, the collage of Au-
toCollage (Figure 13(b)) places the rectangular regions sur-
rounding the athletes almost on a grid. Our result is more
compact due to the elimination of the pixels that do not be-
long to the ROI and the puzzle-like placement.

160 unbiased people, who were unfamiliar with any of the
approaches, participated in the survey. In 8 out of 10 cases
our collages were favored. For instance, 103 people liked our
collage better and 4 could not decide for the collages in Fig-
ure 13. In one case (Figure 12(a)) the AutoCollage compo-
sition was significantly preferred (58% vs. 42%), and in one
other (“‘trip”) they had roughly the same score. Figure 14
shows the results; the complete survey (which includes col-
lages that are omitted from the paper) is presented in the
supplementary material.

We conclude that our puzzle-like collages would be a wel-
comed addition to the type of collages one can create today
with existing technology.

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.


S. Goferman, A. Tal, L. Zelnik-Manor / Puzzle-like Collage

Figure 14: Survey results

7. Conclusion

This paper presented a framework for producing informative
and pretty collages of exact cutouts of interesting regions in a
puzzle-like manner. The paper makes several contributions.
First, a novel region-of-interest (ROI) extraction algorithm
is presented. It is shown to extract non-rectangular regions
that coincide with the meaningful objects and background
boundaries. Second, the paper describes a composition al-
gorithm that places these non-rectangular pieces together.
Finally, a system is presented, which can generate collages
from everyday as well as professional images.

Our results and user study show that assembling non-
rectangular shapes produces compact and pretty summaries
of image sets. We have created collages of a variety of dif-
ferent scenarios, including sports event, news, and family
photos, which demonstrate the general applicability of the
proposed method.

Acknowledgements: We thank the authors of the following
software that we’ve used, for making them public: Google’s
Picasa, Dirk B. Walther’s Saliency toolbox, Xiaodi Hou’s
spectral residual code, and the trial version of Microsoft’s
AutoCollage 2008. Thanks to Nicolas Evariste for letting us
use his Zoo beautiful images. This work was supported by
the Fund for Promotion of Research at the Technion,.

References

[ADA∗04] AGARWALA A., DONTCHEVA M., AGRAWALA M.,
DRUCKER S., COLBURN A., CURLESS B., SALESIN D., CO-
HEN M.: Interactive digital photomontage. ACM Transactions
on Graphics 23, 3 (2004), 294–302.

[Ade89] ADES D.: Photomontage (World of Art series). Thames
& Hudson, 1989.

[Atk04] ATKINS C.: Adaptive photo collection page layout. In
ICIP (2004).

[BCG∗07] BATTIATO S., CIOCCA G., GASPARINI F., PUGLISI
G., SCHETTINI R.: Smart photo sticking. In Adaptive Multime-
dia Retrieval (2007), IEEE.

[CKS97] CASELLES V., KIMMEL R., SAPIRO G.: Geodesic ac-
tive contours. Int. J. of Computer Vision 22, 1 (1997), 61–79.

[GC04] GIRGENSOHN A., CHIU P.: Stained glass photo collages.
In UIST (2004), pp. 13–14.

[HXM∗04] HU Y., XIE X., MA W., CHIA L., RAJAN D.: Salient
Region Detection Using Weighted Feature Maps Based on the
Human Visual Attention Model. Lecture Notes in Computer Sci-
ence (2004), 993–1000.

[HZ07] HOU X., ZHANG L.: Saliency detection: A spectral resid-
ual approach. In CVPR (2007), pp. 1–8.

[IK01] ITTI L., KOCH C.: Computational modelling of visual
attention. Nature Reviews Neuroscience 2, 3 (2001), 194–204.

[Joh73] JOHNSON D.: Near-optimal bin packing algorithms. PhD
thesis, Massachusetts Institute of Technology, 1973.

[KK01] KONG W., KIMIA B.: On Solving 2D and 3D Puzzles
Using Curve Matching. In CVPR (2001).

[LMM02] LODI A., MARTELLO S., MONACI M.: Two-
dimensional packing problems: A survey. European Journal of
Operational Research 141, 2 (2002), 241–252.

[LRAL08] LEVIN A., RAV-ACHA A., LISCHINSKI D.: Spectral
matting. IEEE Trans Pattern Anal Mach Intell 30, 10 (2008),
1699–712.

[LSTS04] LI Y., SUN J., TANG C., SHUM H.: Lazy snapping.
ACM Trans. Graph. 23, 3 (2004), 303–308.

[LSZ∗07] LIU T., SUN J., ZHENG N., TANG X., SHUM H.:
Learning to Detect A Salient Object. In CVPR (2007).

[LXGF05] LI C., XU C., GUI C., FOX M.: Level Set Evolu-
tion without Re-Initialization: A New Variational Formulation.
In CVPR (2005), p. 430.

[Pic] PICASA:. http://picasa.google.com.

[Pol] POLYVORE:. http://www.polyvore.com.

[RB80] RADACK G., BADLER N.: Jigsaw Puzzle Matching Us-
ing a Boundary-centered Polar Encoding. PhD thesis, Graduate
School of Arts and Sciences, University of Pennsylvania, 1980.

[RBHB06] ROTHER C., BORDEAUX L., HAMADI Y., BLAKE
A.: Autocollage. ACM Trans. Graph. 25, 3 (2006), 847–852.

[RKB04] ROTHER C., KOLMOGOROV V., BLAKE A.: "Grab-
Cut": interactive foreground extraction using iterated graph cuts.
ACM Trans. on Graphics 23, 3 (2004), 309–314.

[RKKB05] ROTHER C., KUMAR S., KOLMOGOROV V., BLAKE
A.: Digital Tapestry. In CVPR (2005), p. 589.

[SC91] SCHMEISER B., CHEN M.: General hit-and-run Monte
Carlo sampling for evaluating multidimensional integrals. Tech.
rep., School of Industrial Eng., Purdue U., 1991.

[SKKC00] SEBASTIAN T., KLEIN P., KIMIA B., CRISCO J.:
Constructing 2D Curve Atlases. In IEEE Workshop on Mathe-
matical Methods in Biomedical Image Analysis (2000), pp. 70–
77.

[Sle80] SLEATOR D.: A 2.5 Times Optimal Algorithm for Pack-
ing in Two Dimensions. Information Processing Letters 10, 1
(1980), 37–40.

[VJ01] VIOLA P., JONES M.: Rapid Object Detection Using a
Boosted Cascade of Simple Features. In CVPR (2001).

[WAC07] WANG J., AGRAWALA M., COHEN M.: Soft scissors:
an interactive tool for realtime high quality matting. ACM Trans.
Graph. 26, 3 (2007).

[WSQ∗06] WANG J., SUN J., QUAN L., TANG X., SHUM H.:
Picture collage. In CVPR (2006), pp. 347–354.

c© 2010 The Author(s)
Journal compilation c©2010 The Eurographics Association and Blackwell Publishing Ltd.