Vis Comput
DOI 10.1007/s00371-013-0786-4

ORIGINAL ARTICLE

Ray geometry in non-pinhole cameras: a survey

Jinwei Ye · Jingyi Yu

© Springer-Verlag Berlin Heidelberg 2013

Abstract  A pinhole camera collects rays passing through a common 3D point and its image resembles what would be seen by human eyes. In contrast, a non-pinhole (multi-perspective) camera combines rays collected by different viewpoints. Despite their incongruity of view, their images are able to preserve spatial coherence and can depict, within a single context, details of a scene that are simultaneously inaccessible from a single view, yet easily interpretable by a viewer. In this paper, we thoroughly discuss the design, modeling, and implementation of a broad class of non-pinhole cameras and their applications in computer graphics and vision. These include mathematical (conceptual) camera models such as the General Linear Cameras and real non-pinhole cameras such as catadioptric cameras and projectors. A unique component of this paper is a ray geometry analysis that uniformly models these non-pinhole cameras as manifolds of rays and ray constraints. We also model the thin lens as a ray transform and study how ray geometry is changed by the thin lens for studying distortions and defocusing. We hope to provide mathematical fundamentals to satisfy computer vision researchers as well as tools and algorithms to aid computer graphics and optical engineering researchers.

Keywords  Camera models · Ray geometry · Thin lens · Catadioptric imaging · Computer vision · Computer graphics · Computational photography

J. Ye (✉) · J. Yu
University of Delaware, Newark, USA
e-mail: jye@cis.udel.edu

1 Introduction

A pinhole camera collects rays passing through a common 3D point, which is called the Center-of-Projection (CoP). Conceptually, it can be effectively viewed as a light-proof box with a small hole in one side, through which light from a scene passes and projects an inverted image on the opposite side of the box, as shown in Fig. 1. The history of pinhole cameras can be traced back to Mo Jing, a Mohist philosopher in the fifth century BC in China, who described a similar design using a closed room and a hole in the wall. In the 10th century, the Persian scientist Ibn al-Haytham (Alhazen) wrote about naturally occurring rudimentary pinhole cameras. In 1822, Niépce managed to take the first photograph using the pinhole camera obscura via lithography. Today, the pinhole camera serves as the most common workhorse for general imaging applications.

The imaging quality of a pinhole camera relies heavily on choosing a properly sized pinhole: a small pinhole produces a sharp image but the image will be dimmer due to insufficient light, whereas a large pinhole generates brighter but blurrier images. To address this issue, lenses have been used for converging light. The goal is to replace the pure pinhole model with a pinhole-like optical model that can admit more light while maintaining image sharpness. For example, a thin, convex lens can be placed at the pinhole position with a focal length equal to the distance to the film plane in order to take pictures of distant objects. This emulates opening up the pinhole significantly. We refer to this thin lens-based pinhole approximation as pinhole optics.

In computer vision and graphics, pinhole cameras are the dominating imaging model for two main reasons. First, pinhole geometry is rather simple.
Each pinhole camera can be uniquely defined by only three parameters (the position of the CoP in 3D). The pinhole imaging process can be decomposed into two parts: projecting the scene geometry into rays and mapping the rays onto the image plane; the two can be uniformly described by the classic 3 × 4 pinhole camera matrix [17]. Under homogeneous coordinates, the imaging process is linear. Second, in bright light, the human eye acts as a virtual pinhole camera where the observed images exhibit all the characteristics of a pinhole image, e.g., points map to points, lines map to lines, parallel lines converge at a vanishing point, etc. Pinhole cameras are therefore also referred to as perspective cameras in the graphics and vision literature.

Fig. 1 (a) A pinhole camera collects rays passing through a common 3D point (the CoP). (b) An illustration of the pinhole obscura

The pinhole imaging model, however, is rare in insect eyes. Compound eyes, which may consist of thousands of individual photoreceptor units or ommatidia, are much more common. The image perceived is a combination of inputs from the numerous ommatidia (individual "eye units"), which are located on a convex surface, thus pointing in slightly different directions. Compound eyes hence possess a very large view angle and greatly help detect fast movement. Notice that rays collected by a compound eye no longer follow pinhole geometry. Rather, they follow multi-viewpoint or multi-perspective imaging geometry.

The idea of a non-pinhole imaging model has been widely adopted in art: artists, architects, and engineers regularly draw using non-pinhole projections. Despite their incongruity of views, effective non-pinhole images are still able to preserve spatial coherence. Pre-Renaissance and post-impressionist artists frequently used non-pinhole models to depict more than can be seen from any specific viewpoint. For example, the cubism of Picasso and Matisse [40] can depict, within a single context, details of a scene that are simultaneously inaccessible from a single view, yet easily interpretable by a viewer. The goal of this survey is to carry out a comprehensive review of non-pinhole imaging models and their applications in computer graphics and vision.

Scope  On the theory front, this survey presents a unique approach to systematically study non-pinhole imaging models in the ray space. Specifically, we parameterize rays in a 4D ray space using the Two-Plane Parametrization (2PP) [23, 43] and then study the geometric ray structures of non-pinhole cameras in the ray space. We show that common non-perspective phenomena such as reflections, refractions, and defocus blurs can all be viewed as ray geometry transformations. Further, commonly used non-pinhole cameras can be effectively modeled as special (planar) 2D manifolds in the ray space. The ray manifold model also provides feasible solutions for the forward projection problem, i.e., how to find the projection from a 3D point to its corresponding pixel in a non-pinhole imaging system.

On the application side, we showcase a broad range of non-pinhole imaging systems. In computer vision, we discuss state-of-the-art solutions that apply non-pinhole cameras to stereo matching, multi-view reconstruction, shape-from-distortion, etc.
In computational photography, we discuss emerging solutions that use non-pinhole camera models for designing catadioptric cameras and projectors to acquire/project with a much wider Field-of-View (FoV), as well as various light field camera designs that directly acquire the 4D ray space in a single image. In computer graphics, we demonstrate using non-pinhole camera models for generating panoramas, creating cubism styles, rendering caustics, producing faux-animations from still-life scenes, rendering beyond occlusions, etc.

This survey is closely related to recent surveys on multi-perspective modeling and rendering [59] and computational photography [39]. Yu et al. [59] provide a general overview of multi-perspective cameras, whereas we provide a comprehensive ray-space mathematical model for a broader class of non-pinhole cameras. Raskar et al. [39] focus mostly on computational photography, whereas we discuss the use of conceptual and real non-pinhole cameras for applications in computer vision and computer graphics. Further, our unified ray geometry analysis may fundamentally change people's view on cameras and projectors.

2 Pinhole optics

Pinhole cameras predate modern history. Geometrically, a pinhole camera collects rays passing through the CoP. Each pinhole camera, therefore, can be uniquely defined by only three parameters (the position of the CoP). The pinhole imaging process can be decomposed into two parts: projecting the scene geometry into rays and mapping the rays onto the image plane. We refer to the first part as projection and the second as collineation. It has been shown that the projection and collineation can be uniformly described by the classic 3 × 4 pinhole camera matrix [17], which combines six extrinsic and five intrinsic camera parameters into a single operator that maps homogeneous 3D points to a 2D image plane. These mappings are unique up to a scale factor, and similar models can be applied to describe orthographic cameras. In this section, we revisit the pinhole imaging process via a ray-space analysis.

2.1 Pinhole in ray space

2.1.1 Ray space

We use the Two-Plane Parametrization (2PP) that is widely used in light field [23] and lumigraph [6, 15] rendering for representing rays, as shown in Fig. 2(a). Under 2PP, a ray in free space is defined by its intersections with two parallel planes (Πuv and Πst). Usually, Πuv is chosen as the aperture plane (z = 0) whose origin is the origin of the coordinate system. Πst is placed at z = 1 and chosen to be the default image plane. All rays that are not parallel to Πuv and Πst will intersect the two planes at [u, v, 0] and [s, t, 1], respectively, and we use [u, v, s, t] to parameterize each ray.

2.1.2 Pinhole ray geometry

Let us consider the pinhole model in ray space. By definition, all rays in a pinhole camera pass through a common 3D point, i.e., the CoP Ċ = [Cx, Cy, Cz]. For each ray r = [u, v, s, t], there exists some λ that satisfies

  λ[s, t, 1] + (1 − λ)[u, v, 0] = [Cx, Cy, Cz]   (1)

We have λ = Cz and

  Cz·s + (1 − Cz)·u = Cx
  Cz·t + (1 − Cz)·v = Cy   (2)

This indicates that rays in a pinhole camera obey two linear constraints, one in s and u and the other in t and v. We call them the pinhole ray constraints.

If we divide both sides of Eq. (2) by Cz and let Ċ go to infinity, we have

  s − u = dx
  t − v = dy   (3)

i.e., all rays have the identical direction [dx, dy, 1] and the camera degenerates to an orthographic camera. We call Eq. (3) the orthographic ray constraints; they are also linear, in s − u and t − v.
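These constraints are straightforward to verify numerically. The following minimal Python sketch is our own illustration (function names and the tolerance are arbitrary, not part of the original formulation); it builds the 2PP coordinates of a ray through a given CoP and checks the pinhole ray constraints of Eq. (2):

```python
import numpy as np

def ray_through_cop(cop, pixel_st):
    """Build 2PP coordinates [u, v, s, t] of the ray that passes through
    the CoP [Cx, Cy, Cz] and crosses the image plane z = 1 at [s, t, 1]."""
    Cx, Cy, Cz = cop
    s, t = pixel_st
    d = np.array([Cx - s, Cy - t, Cz - 1.0])  # direction from [s, t, 1] to the CoP
    lam = -1.0 / d[2]                         # parameter that reaches z = 0 (needs Cz != 1)
    return np.array([s + lam * d[0], t + lam * d[1], s, t])

def satisfies_pinhole_constraints(ray, cop, eps=1e-9):
    """Check the two linear pinhole ray constraints of Eq. (2)."""
    u, v, s, t = ray
    Cx, Cy, Cz = cop
    return (abs(Cz * s + (1 - Cz) * u - Cx) < eps and
            abs(Cz * t + (1 - Cz) * v - Cy) < eps)

cop = (0.5, -0.2, 2.0)
assert satisfies_pinhole_constraints(ray_through_cop(cop, (0.3, 0.7)), cop)
```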
The pinhole and orthographic models have many nice properties that are useful in computer graphics and computer vision applications. For instance, all lines in the scene are projected to lines on the image. Similarly, a triangle in 3D space is projected as a triangle on the pinhole or orthographic image. Thus, by representing the scene geometry using triangles, one can efficiently render the entire scene by projecting the triangles onto the image plane and then rasterizing the triangles in image space.

2.1.3 Slit ray geometry

More general ray configurations, however, do not pass through a common 3D point. To relax the pinhole constraints, we study rays that pass through a line or slit l. We consider the following two cases:

(1) If l is parallel to Πuv and Πst, we can represent it with a point Ṗ = [Px, Py, Pz] on l and its direction [dx, dy, 0]. If a ray r = [u, v, s, t] intersects l, there exist some λ1 and λ2 that satisfy

  λ1[s, t, 1] + (1 − λ1)[u, v, 0] = [Px, Py, Pz] + λ2[dx, dy, 0]   (4)

It is easy to see that λ1 = Pz and we can obtain a linear constraint in [u, v, s, t] as

  ((1 − Pz)/dx)·u − ((1 − Pz)/dy)·v + (Pz/dx)·s − (Pz/dy)·t + Py/dy − Px/dx = 0   (5)

Yu and McMillan [57] show that this is equivalent to a general linear constraint Au + Bv + Cs + Dt + E = 0 with A/B = C/D. We call it the parallel slit constraint.

(2) If l is not parallel to Πuv and Πst, it can be directly parameterized by a ray under 2PP as [u0, v0, s0, t0]. All rays r = [u, v, s, t] that intersect l should satisfy

  λ1[s, t, 1] + (1 − λ1)[u, v, 0] = λ2[s0, t0, 1] + (1 − λ2)[u0, v0, 0]   (6)

We have λ1 = λ2 and

  (s − s0)/(t − t0) = (u − u0)/(v − v0)   (7)

This is a bilinear constraint that we call the non-parallel slit constraint. In the following sections, we will use the parallel and non-parallel slit constraints to model a broad class of non-pinhole imaging systems.

Fig. 2 (a) Two-Plane Parametrization: a ray is parameterized by its intersections with two parallel planes Πuv (z = 0) and Πst (z = 1). (b) Ray transformation after passing through a thin lens: the thin lens works as a shearing operator

2.2 The thin lens operator

Recall that practical pinhole cameras are constructed by using a thin lens in order to collect more light. Although real lenses are typically a complex assembly of multiple lenses, they can still be effectively modeled using the Thin Lens Equation:

  1/a + 1/b = 1/f   (8)

where a is the object distance, b is the image distance, and f is the thin lens focal length.

The thin lens can be viewed as a workhorse that maps each incident ray r = [u, v, s, t] approaching the lens to the exit ray r′ = [u′, v′, s′, t′] towards the sensor. Ng [31] and Ding et al. [13] separately derived the Thin Lens Operator (TLO) to show how rays are transformed after passing through a thin lens. By choosing the aperture plane as Πuv at z = 0 and the image sensor plane as Πst at z = 1, we have u′ = u, v′ = v. Using Eq. (8), it can be shown that the thin lens operator L transforms the ray coordinates as

  [u′, v′, s′, t′] = L([u, v, s, t]) = [u, v, s − u/f, t − v/f]   (9)

This reveals that the thin lens L behaves as a linear, or more precisely, a shear operator on rays, as shown in Fig. 2(b).
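As a quick illustration of Eq. (9), the following sketch (ours, using NumPy) implements the TLO and verifies that it acts linearly on ray coordinates, the property that will be exploited in Sect. 4.1:

```python
import numpy as np

def thin_lens_operator(ray, f):
    """Eq. (9): map an incident ray [u, v, s, t] (Pi_uv at the aperture,
    z = 0; Pi_st at the sensor, z = 1) to the exit ray; a shear in the
    (u, s) and (v, t) coordinates."""
    u, v, s, t = ray
    return np.array([u, v, s - u / f, t - v / f])

# The operator is linear: L(a*r1 + b*r2) == a*L(r1) + b*L(r2).
r1, r2 = np.array([0.1, 0.0, 0.4, 0.2]), np.array([0.0, 0.3, 0.1, 0.5])
a, b, f = 0.3, 0.7, 0.5
assert np.allclose(thin_lens_operator(a * r1 + b * r2, f),
                   a * thin_lens_operator(r1, f) + b * thin_lens_operator(r2, f))
```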
For a toy case study, let us investigate how a thin lens transforms a set of incident rays that follow pinhole geometry. Assume the incident rays originate from the CoP Ċ = [Cx, Cy, Cz]. By applying the TLO (Eq. (9)) to the pinhole constraints (Eq. (2)), we obtain a new pair of constraints for the exiting rays [u′, v′, s′, t′]:

  Cz·s′ + (1 − Cz + Cz/f)·u′ = Cx
  Cz·t′ + (1 − Cz + Cz/f)·v′ = Cy   (10)

If Ċ does not lie on the focal plane ΠL− of the lens on the world side (Cz ≠ −f), then Eq. (10) can be rewritten as

  (f·Cz/(f + Cz))·s′ + (1 − f·Cz/(f + Cz))·u′ = f·Cx/(f + Cz)
  (f·Cz/(f + Cz))·t′ + (1 − f·Cz/(f + Cz))·v′ = f·Cy/(f + Cz)   (11)

Therefore, the exiting rays follow a new set of pinhole constraints with the new CoP at (f/(f + Cz))·[Cx, Cy, Cz]. If Ċ lies on ΠL− (Cz = −f), then Eq. (10) degenerates to the orthographic constraints:

  s′ − u′ = −Cx/f
  t′ − v′ = −Cy/f   (12)

In this case, all exiting rays correspond to an orthographic camera with direction [−Cx/f, −Cy/f, 1].

The results derived above are well known, as they can be directly viewed as the image of a 3D point through the thin lens. Nevertheless, for more complex cases when the incident rays do not follow pinhole geometry, the TLO analysis is crucial for modeling the exit ray geometry [13]. This case study also reveals that all rays emitting from a 3D scene point Ċ will generally converge at a different 3D point Ċ′ through the thin lens. The cone of rays passing through Ċ′ will therefore spread onto a disk of pixels on the sensor. This process is commonly described using the Point Spread Function (PSF), i.e., the mapping from a 3D point to a disk of pixels. As shown in Fig. 3, assuming that the sensor moves Δz away from z = C′z and the lens has a circular aperture with diameter D, the PSF is a disk of size Dp = D·|Δz|/C′z.

Fig. 3 The thin lens maps a 3D scene point Ċ to Ċ′. By moving the sensor Δz away from z = C′z, Ċ′ spreads onto a PSF with size Dp

3 Non-pinhole imaging models

More general camera models do not follow pinhole camera geometry, i.e., not all rays collected by the camera need to pass through a common point. Such cameras are often referred to as non-pinhole cameras. In contrast to pinhole and orthographic cameras, which can be uniformly described using the 3 × 4 camera matrix, non-pinhole camera models are defined less precisely. In practice, many non-pinhole camera models are defined by constructions. By this we mean that a system or process is described for generating each specific class, but there is not always a closed-form expression for the projection transformation. In this section, we apply the pinhole constraints and the slit constraints (parallel and non-parallel) to study the ray properties of various non-pinhole camera models.

3.1 Classical non-pinhole cameras

Pushbroom cameras, consisting of a linear sensor, are routinely used in satellite imagery [16]. The pushbroom sensor is mounted on a moving rail, and as the platform moves, the view plane sweeps out a volume of space and forms a pushbroom image on the sensor. Rays collected by a pushbroom camera should satisfy two constraints: (1) the slit constraint, where the slit is the motion path of the pushbroom sensor; and (2) all the sweeping rays are parallel to some plane that is perpendicular to the slit. Assume the common slit is parallel to Πuv and Πst and parameterize it with a point [x0, y0, z0] on the slit and the slit's direction [dx, dy, 0]; the two constraints for all rays [u, v, s, t] captured by a pushbroom camera can be formulated as

  ((1 − z0)/dx)·u − ((1 − z0)/dy)·v + (z0/dx)·s − (z0/dy)·t + y0/dy − x0/dx = 0
  [s − u, t − v, 1] · [dx, dy, 0]^T = 0   (13)

where the first constraint is the parallel slit constraint and the second corresponds to the parallel sweeping planes; both are linear.
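For concreteness, here is a small sketch (ours; it assumes dx, dy ≠ 0, matching the divided form of Eq. (13)) that tests whether a ray belongs to such a pushbroom camera:

```python
import numpy as np

def in_pushbroom(ray, slit_point, slit_dir, eps=1e-9):
    """Test the two pushbroom ray constraints of Eq. (13). The slit is
    parallel to the 2PP planes, given by a point [x0, y0, z0] and a
    direction [dx, dy]; this form assumes dx, dy != 0."""
    u, v, s, t = ray
    x0, y0, z0 = slit_point
    dx, dy = slit_dir
    # (1) parallel slit constraint: the ray crosses the slit line
    c1 = ((1 - z0) / dx) * u - ((1 - z0) / dy) * v \
         + (z0 / dx) * s - (z0 / dy) * t + y0 / dy - x0 / dx
    # (2) sweeping-plane constraint: the ray direction [s - u, t - v, 1]
    #     is perpendicular to the slit direction [dx, dy, 0]
    c2 = (s - u) * dx + (t - v) * dy
    return abs(c1) < eps and abs(c2) < eps

# A ray through the slit point [0.3, 0.3, 0.5], perpendicular to the slit:
assert in_pushbroom([0.1, 0.5, 0.5, 0.1], [0.0, 0.0, 0.5], [1.0, 1.0])
```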
In practice, a pushbroom image can be synthesized by moving a perspective camera along a linear path and assembling the same column of each perspective image, as shown in Fig. 4(a) and (b).

Another popular class of non-pinhole cameras are the XSlit cameras. An XSlit camera has two oblique (neither parallel nor coplanar) slits in 3D space. The camera collects rays that simultaneously pass through the two slits and projects them onto an image plane. If we choose the parametrization plane parallel to both slits, rays in an XSlit camera will then satisfy two parallel slit constraints, i.e., two linear constraints. Similar to pushbroom images, XSlit images can also be synthesized using images captured by a moving pinhole camera. Zomet et al. [60] generated XSlit images by stitching linearly varying columns across a row of pinhole images, as shown in Fig. 4(c) and (d).

Fig. 4 Pushbroom and XSlit. (a) The stationary column sampling routine for synthesizing a pushbroom panorama (b). (c) The linearly varying column sampling routine for synthesizing an XSlit panorama (d) (courtesy of Steve Seitz)

3.2 General non-pinhole cameras

The analysis above reveals that pinhole, orthographic, pushbroom, and XSlit cameras all correspond to 2D manifolds in the ray space (since they are subject to two linear ray constraints). This is not surprising, as a general imaging process entails mapping 3D geometry onto a 2D manifold of rays, i.e., each pixel [x, y] maps to a ray. Therefore, a general non-pinhole camera can be viewed as a 2D ray manifold Σ:

  Σ(x, y) = [u(x, y), v(x, y), s(x, y), t(x, y)]   (14)

To analyze the ray geometry, one can then approximate the local behavior of the rays by computing the tangent plane about any specified ray r. The tangent plane can be expressed by two spanning vectors d1 and d2 obtained by taking the partial derivatives of [u, v, s, t]:

  d1 = [ux, vx, sx, tx],  d2 = [uy, vy, sy, ty]   (15)

This is analogous to modeling a curved 2D surface using local tangent planes. A local ray tangent plane can hence be modeled by three generator rays: r, r + d1, and r + d2.

3.2.1 General linear cameras (GLC)

To study the ray geometry of local ray tangent planes, Yu and McMillan [55] developed a new camera model called the General Linear Camera (GLC). GLCs are 2D planar ray manifolds which can apparently describe the traditional pinhole, orthographic, pushbroom, and XSlit cameras. A GLC is defined as the affine combination of three generator rays ri = [ui, vi, si, ti], i = 1, 2, 3:

  r = α[u1, v1, s1, t1] + β[u2, v2, s2, t2] + (1 − α − β)[u3, v3, s3, t3]   (16)

For example, in the ray tangent plane analysis, the three ray generators are chosen as r, r + d1, and r + d2.
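The affine combination of Eq. (16) and the tangent-plane construction of Eqs. (14)-(15) translate directly into code. The sketch below is our own illustration (the finite-difference step h is an arbitrary choice):

```python
import numpy as np

def glc_ray(generators, alpha, beta):
    """Eq. (16): a GLC is the set of affine combinations of three
    generator rays in [u, v, s, t] coordinates."""
    r1, r2, r3 = (np.asarray(g, dtype=float) for g in generators)
    return alpha * r1 + beta * r2 + (1 - alpha - beta) * r3

def tangent_glc(sigma, x, y, h=1e-4):
    """Generators of the local tangent GLC of a ray manifold
    sigma(x, y) -> [u, v, s, t] (Eqs. (14)-(15)), via finite differences."""
    r = sigma(x, y)
    d1 = (sigma(x + h, y) - r) / h
    d2 = (sigma(x, y + h) - r) / h
    return r, r + d1, r + d2
```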
To determine the type of the non-pinhole camera for any GLC specification, they further derived a ray characteristic equation that computes how many singularities (lines or points) all rays in the GLC can pass through:

  | λ·s1 + (1 − λ)·u1   λ·t1 + (1 − λ)·v1   1 |
  | λ·s2 + (1 − λ)·u2   λ·t2 + (1 − λ)·v2   1 | = 0   (17)
  | λ·s3 + (1 − λ)·u3   λ·t3 + (1 − λ)·v3   1 |

Equation (17) yields a quadratic equation of the form Aλ² + Bλ + C = 0, where

  A = | s1 − u1   t1 − v1   1 |        C = | u1   v1   1 |
      | s2 − u2   t2 − v2   1 |            | u2   v2   1 |
      | s3 − u3   t3 − v3   1 |            | u3   v3   1 |

  B = | s1 − u1   v1   1 |   | t1 − v1   u1   1 |
      | s2 − u2   v2   1 | − | t2 − v2   u2   1 |   (18)
      | s3 − u3   v3   1 |   | t3 − v3   u3   1 |

An edge-parallel condition is defined to check if all three pairs of corresponding edges of the u-v and s-t triangles formed by the generator rays are parallel:

  (si − sj)/(ti − tj) = (ui − uj)/(vi − vj),  i, j = 1, 2, 3 and i ≠ j   (19)

Given three generator rays, the GLC type can be determined by the A coefficient, the discriminant Δ = B² − 4AC of the characteristic equation, and the edge-parallel condition, as shown in Table 1.

Table 1 Characterizing general linear cameras by the characteristic equation

  Characteristic equation | 2 solutions | 1 solution       | 0 solutions      | ∞ solutions
  A ≠ 0                   | XSlit       | Pencil/Pinhole^a | Bilinear         | ∅
  A = 0                   | ∅           | Pushbroom        | Twisted/Ortho.^a | EPI

  ^a A GLC satisfying the edge-parallel condition is pinhole (A ≠ 0) or orthographic (A = 0)

Yu and McMillan [55] have shown that there are precisely eight types of GLC, as shown in Fig. 5: in a pinhole camera, all rays pass through a single point; in an orthographic camera, all rays are parallel; in a pushbroom camera [16], all rays lie on a set of parallel planes and pass through a line; in an XSlit camera [60], all rays pass through two non-coplanar lines; in a pencil camera, all coplanar rays originate from a point on a line and lie on a specific plane through the line; in a twisted orthographic camera, all rays lie on parallel twisted planes and no rays intersect; in a bilinear camera [32], no two rays are coplanar and no two rays intersect; and in an EPI camera, all rays lie on a 2D plane.

Fig. 5 General Linear Camera Models. (a) A pinhole camera. (b) An orthographic camera. (c) A pushbroom. (d) An XSlit camera. (e) A pencil camera. (f) A twisted orthographic camera. (g) A bilinear camera. (h) An EPI camera. See Sect. 3.2.1 for detailed discussions on each GLC
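The classification rules of Table 1 can be implemented mechanically from Eqs. (17)-(18). The following sketch is our own illustration; it omits the edge-parallel test of Eq. (19), so the two degenerate labels stay ambiguous:

```python
import numpy as np

def classify_glc(r1, r2, r3, eps=1e-9):
    """Classify a GLC from its generator rays via the characteristic
    equation A*lam^2 + B*lam + C = 0 (Eqs. (17)-(18)) and Table 1. The
    edge-parallel test (Eq. (19)) is omitted, so the degenerate labels
    'Pencil/Pinhole' and 'Twisted/Ortho.' are left unresolved."""
    (u1, v1, s1, t1), (u2, v2, s2, t2), (u3, v3, s3, t3) = r1, r2, r3
    A = np.linalg.det([[s1 - u1, t1 - v1, 1],
                       [s2 - u2, t2 - v2, 1],
                       [s3 - u3, t3 - v3, 1]])
    C = np.linalg.det([[u1, v1, 1], [u2, v2, 1], [u3, v3, 1]])
    B = np.linalg.det([[s1 - u1, v1, 1], [s2 - u2, v2, 1], [s3 - u3, v3, 1]]) \
      - np.linalg.det([[t1 - v1, u1, 1], [t2 - v2, u2, 1], [t3 - v3, u3, 1]])
    disc = B * B - 4 * A * C
    if abs(A) > eps:                       # row "A != 0" of Table 1
        if disc > eps:
            return 'XSlit'
        return 'Bilinear' if disc < -eps else 'Pencil/Pinhole'
    if abs(B) > eps:                       # row "A = 0": a single root
        return 'Pushbroom'
    return 'Twisted/Ortho.' if abs(C) > eps else 'EPI'

# Three rays through the common CoP [0, 0, 2]: a double root, i.e., the
# pinhole family.
print(classify_glc([0.2, 0, 0.1, 0], [0, 0.2, 0, 0.1], [-0.2, -0.2, -0.1, -0.1]))
```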
To find the projection of a 3D point in a GLC, one can combine the GLC constraints with the pinhole constraints. For example, consider an XSlit camera that obeys two parallel slit constraints (Eq. (5)), as derived in Sect. 2.1.3. Rays passing through the 3D point obey another two pinhole linear constraints (Eq. (2)). We can therefore uniquely determine the ray in the XSlit camera that passes through the 3D point. To calculate the projection of a 3D line in the XSlit camera, one can compute the projection of each point on the line. Ding et al. [12] show that line projections can only be lines or conics, as shown in Fig. 6. The complete classification of conics that can be observed by each type of GLC is enumerated in Table 2.

Fig. 6 Curved line images on specular window surfaces: the two images of 0.5 m × 1 m near-flat windows are captured from 15 m away. Images of straight lines far away form interesting conic patterns

Table 2 Conic types observed in general linear cameras

  GLC type    | Pinhole | Ortho. | XSlit      | Pushbroom  | Pencil   | Twisted  | Bilinear
  Determinant | Δ = 0   | Δ = 0  | Δ > 0      | Δ > 0      | Δ = 0    | Δ = 0    | Δ < 0
  Conic type  | Line    | Line   | Hyperbolae | Hyperbolae | Parabola | Parabola | Ellipse

3.2.2 Case study 1: reflection on curved mirrors

To demonstrate how to use the GLC analysis to model general non-pinhole cameras, let us look at a special non-pinhole camera, a catadioptric camera, which combines a pinhole camera and a curved mirror. Given the camera position and the mirror surface, we can map each reflected ray into the ray space as [u, v, s, t]. Assuming the 3D surface is of the form z(x, y), the reflected ray can be computed via the reflection constraint:

  r = i − 2(n̂ · i)n̂   (20)

where r = [rx, ry, rz] is the reflected ray, i is the incident ray, and n̂ is the unit normal obtained by normalizing [−zx, −zy, 1]. If we choose Πuv (z = 0) to contain the surface point (x, y, z(x, y)) and be tangential to the surface, and set Πst as z = 1, we obtain the [u, v, s, t] coordinates of the reflected ray as

  [u, v, s, t] = [x − z·rx/rz, y − z·ry/rz, x − (z − 1)·rx/rz, y − (z − 1)·ry/rz]   (21)

All variables r, z, u, v, s, and t are functions of x and y; hence, the set of reflection rays from the 3D surface forms a 2D parametric ray manifold in x and y. We can then use the tangent GLC analysis (Sect. 3.2.1) to determine the type of the local non-pinhole camera model. Using this approach, Yu and McMillan [56] have shown that all local reflections observed by a pinhole or an orthographic camera can only be XSlit, pushbroom, pinhole, or orthographic, all of which are special cases of XSlit. The two slits correspond to the two reflection caustic surfaces and provide a special set of rulings on the surface. These rulings determine which rays lie on the local tangent GLC and the local distortions seen in the reflection.
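Equations (20) and (21) suggest a simple recipe for mapping reflections into ray space. The sketch below is our own illustration and assumes the surface height z and its gradients zx, zy are supplied at the reflection point:

```python
import numpy as np

def reflection_ray_2pp(p, i, zx, zy):
    """Eqs. (20)-(21): map the ray reflected at surface point
    p = (x, y, z), with height-field gradients zx, zy, into [u, v, s, t]
    (Pi_uv at z = 0, Pi_st at z = 1)."""
    x, y, z = p
    n = np.array([-zx, -zy, 1.0])
    n /= np.linalg.norm(n)                              # unit surface normal
    r = np.asarray(i, float) - 2.0 * np.dot(n, i) * n   # Eq. (20)
    rx, ry, rz = r
    return np.array([x - z * rx / rz,       y - z * ry / rz,
                     x - (z - 1) * rx / rz, y - (z - 1) * ry / rz])

# Flat mirror at z = 0: a ray arriving straight down bounces straight up.
print(reflection_ray_2pp((0.2, 0.3, 0.0), [0.0, 0.0, -1.0], 0.0, 0.0))
```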
3.2.3 Case study 2: 3D surfaces

It is also possible to convert a 3D surface into a 2D ray manifold. Yu et al. [58] proposed a normal-ray model to represent surfaces, which locally parameterizes the surface about its normal based on a focal surface approximation, as shown in Fig. 7(a)-(d). Given a smooth surface S(x, y), at each vertex Ṗ we orient the local frame to align z = 0 with the tangent plane at Ṗ. We further assume Ṗ is the origin of the z = 0 plane and set Πuv and Πst at z = 0 and z = 1, respectively. Under this parametrization, normal rays can be mapped as n = [u, v, s, t]. The tangent plane can then be represented by a GLC with three rays: n, n + nx, and n + ny.

By using the GLC analysis, one can compute the two slits for each normal-ray GLC from the characteristic equation. Yu et al. [58] have shown that the two slits are perpendicular to each other and rule the focal surfaces. Swept by the loci of the principal curvatures' radii, the focal surfaces encapsulate many useful geometric properties of the corresponding actual surface. For example, normals of the actual surface are tangent to both focal surfaces, and the normal of each focal surface indicates a principal direction of the corresponding point on the original surface. In fact, each slit is tangential to its corresponding focal surface. Since the two focal surfaces are perpendicular to each other, one slit is parallel to the normal of the focal plane that the other slit corresponds to. Therefore, the two slits give us the principal directions of the original surface. Moreover, the depths of the slits/focal surfaces, computed as the roots of the characteristic equation, represent the principal curvatures of the surface. In Fig. 7(e) and (f), we show two results of estimating the mean curvature and the min principal curvature using the normal-ray model, compared with the Voronoi-edge algorithm [7, 28].

Fig. 7 Estimating focal meshes using the normal ray model. (a) We orient the local frame to align Πuv (z = 0) with the surface tangent plane at Ṗ. (b) We choose the second plane Πst to be z = 1. Each neighboring normal ray can be parameterized as [u, v, s, t]. (c) Focal surfaces (curve) formed by the foci of the normal rays of a parabolic surface. (d) Neighboring normal rays are constrained by the slits (red) that rule the focal surfaces (blue). (e) The color-coded mean curvature image illustrates that the normal ray model (right) is less sensitive to mesh connectivity than [28] (left), especially shown on the wings of the gargoyle model. (f) Left: the estimated min principal curvature direction using the normal ray model. Right: comparison of the normal ray model with [7] on different parts of the model

4 Non-pinhole cameras through the thin lens

Next, we review how ray geometry is transformed by the thin lens. A typical example is reflections observed by a camera with a wide aperture.

4.1 GLC through a thin lens

Ding et al. [13] studied the transformation of GLCs through a thin lens. Recall that a GLC is defined by the affine combination of three generator rays and that the thin lens operator L(◦) is a linear operator; therefore, the affinity of the GLC is preserved under the thin lens operator, i.e.,

  L(r) = L(αr1 + βr2 + (1 − α − β)r3) = αL(r1) + βL(r2) + (1 − α − β)L(r3)   (22)

Equation (22) reveals that the exit rays form a new GLC whose three generator rays are L(r1), L(r2), and L(r3).
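Since the TLO acts ray by ray, transforming a GLC through a thin lens amounts to transforming its three generators, as the following sketch (ours) shows; the exit camera type can then be determined by re-running the characteristic-equation classifier of Sect. 3.2.1, or read off Table 3 below:

```python
import numpy as np

def thin_lens_operator(ray, f):
    u, v, s, t = ray
    return np.array([u, v, s - u / f, t - v / f])   # Eq. (9)

def glc_through_thin_lens(generators, f):
    """Eq. (22): the TLO is linear, so the exit GLC is spanned by the
    transformed generator rays L(r1), L(r2), L(r3)."""
    return [thin_lens_operator(np.asarray(g, float), f) for g in generators]
```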
4.2 Slit-direction duality

To determine the type of the exit GLC transformed by the thin lens, it is important to investigate the duality between slits and directions, which can be derived by applying the TLO (Eq. (9)) to the slit ray constraints (parallel or non-parallel). If the slit does not lie on the focal plane ΠL− (z = −f) of the lens on the world side, we consider the incident rays in the following two cases:

(1) The slit is parallel to ΠL− and parameterized by a point [Px, Py, Pz] and direction [dx, dy, 0]. Since Pz ≠ −f, one can combine the parallel slit ray constraint (Eq. (5)) with the TLO as

  ((1 − P′z)/dx)·u′ − ((1 − P′z)/dy)·v′ + (P′z/dx)·s′ − (P′z/dy)·t′ + P′y/dy − P′x/dx = 0   (23)

where [P′x, P′y, P′z] = (f/(f + Pz))·[Px, Py, Pz]. Therefore, all exit rays pass through a new slit parameterized by the point [P′x, P′y, P′z] and the direction [dx, dy, 0].

(2) The slit is not parallel to ΠL−. In this case, one can apply the TLO to the bilinear non-parallel slit ray constraint (Eq. (7)) to obtain a new constraint for [u′, v′, s′, t′] as

  (u′ − u0)/(v′ − v0) = (s′ − s0 + u0/f)/(t′ − t0 + v0/f)   (24)

which is also a bilinear constraint, indicating that all exit rays pass through another slit [u0, v0, s0 − u0/f, t0 − v0/f].

These derivations show that all rays that pass through a slit not lying on ΠL− will be mapped to exit rays through a new slit on the other side of the lens. We call this the Slit-Slit Duality (see Fig. 8(a)).

If the slit lies on ΠL− (Pz = −f), the linear slit constraint can be reformulated as follows by applying the TLO:

  [s′ − u′, t′ − v′, 1] · [−f·dy, f·dx, Py·dx − Px·dy]^T = 0   (25)

This reveals that all exit rays are orthogonal to the vector n = [−f·dy, f·dx, Py·dx − Px·dy], which is the normal direction of the plane formed by the slit and the lens optical center. Therefore, for rays passing through a slit lying on ΠL−, the exit rays correspond to directions. We call this the Slit-Direction Duality, as shown in Fig. 8(b).

As a reciprocity of the analysis for incident rays through a slit lying on ΠL−, for rays parallel to some plane Π through the lens optical center, all exit rays will pass through a slit parallel to the 2PP planes and lying on the focal plane ΠL+ of the lens on the sensor side. Furthermore, the slit can be found by intersecting Π with ΠL+. This is called the Direction-Slit Duality. The complete GLC transformations are listed in Table 3.

Fig. 8 Slit-Direction Duality. (a) A pushbroom camera with a slit at the focal length transforms to another pushbroom camera with a different slit at the focal length. (b) A pencil camera transforms to a twisted orthographic camera

Table 3 General linear camera transformations through a thin lens

  Incident GLC                                                          | Exit GLC
  XSlit: all rays pass through two slits l1 and l2                      | l1 or l2 lies on ΠL−: Pushbroom; neither slit lies on ΠL−: XSlit
  Pushbroom: rays parallel to some plane Π and passing through a slit l | l lies on ΠL−: Pushbroom; l does not lie on ΠL−: XSlit
  Pinhole: all rays pass through the CoP Ċ                              | Ċ lies on ΠL−: Orthographic; Ċ does not lie on ΠL−: Pinhole
  Pencil: rays on a set of non-parallel planes that share a line l      | l lies on ΠL−: Twisted Orthographic; l does not lie on ΠL−: Pencil
  Bilinear                                                              | Bilinear
  Orthographic                                                          | Pinhole
  Twisted Orthographic                                                  | Pencil

4.3 Case study 3: defocus analysis in catadioptric cameras

Based on the GLC-TLO analysis, Ding et al. [13] showcased using the theory for characterizing and compensating catadioptric defocusing. They use the Ray Spread Function (RSF) to describe how a general set of incident rays spreads onto pixels on the sensor. The classical PSF is a special case of the RSF in which the incident rays are from a pinhole camera.

Assume a scene point Ṗ and a curved mirror surface z(x, y). The RSF of Ṗ is formed by rays emitted from Ṗ, reflected off the mirror, then transmitted through the lens, and finally received by the sensor, as shown in Fig. 9. They then formulate the RSF by composing the thin lens operator L(◦), the aperture operator A(◦), and the reflection operator R(z(x, y), Ṗ). Notice that the order between L(◦) and A(◦) is interchangeable:

  RSF(Ṗ) := K(s, t) = L(A(R(z(x, y), Ṗ))) = A(L(R(z(x, y), Ṗ))) = A(L(u(s, t), v(s, t)))
           = nil                    if G(u(s, t), v(s, t)) > 0
           = L(u(s, t), v(s, t))    if G(u(s, t), v(s, t)) ≤ 0   (26)

where u(s, t) and v(s, t) are determined by the mirror geometry, and G(u, v) = u² + v² − (D/2)² corresponds to a circular aperture of diameter D.

Fig. 9 The formation of the RSF in a catadioptric imaging system: light from a scene point is reflected off the mirror, truncated by the thin lens aperture, and finally received by the sensor, forming the RSF

Using the reflection analysis in Sect. 3.2.2, one can decompose each local reflection patch as an XSlit camera. It is particularly useful to analyze the RSF of an XSlit GLC. According to the GLC-TLO transformation, the exit GLC is also an XSlit, with two slits l1 and l2 lying on z = λ1 and z = λ2, respectively. To simplify the analysis, let us consider the special case when the two slits are orthogonal to each other. One can further rotate the coordinate system such that the slit directions are aligned with the u and v axes. The two slit constraints can then be rewritten as

  (1 − λ1)·u′ + λ1·s′ = 0,  (1 − λ2)·v′ + λ2·t′ = 0  ⟹  u′ = s′/(1 − 1/λ1),  v′ = t′/(1 − 1/λ2)   (27)

Substituting Eq. (27) into the aperture constraint G(u, v), we have

  (s′/(1/λ1 − 1))² + (t′/(1/λ2 − 1))² ≤ (D/2)²   (28)

Equation (28) indicates that the RSF of such a GLC is of elliptical shape, with the major and minor radii of the ellipse given by |1/λ1 − 1|·D/2 and |1/λ2 − 1|·D/2.
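A small sketch (ours) of the resulting radii; note how a slit at λ = 1 collapses one axis of the ellipse, matching the degenerate cases enumerated next:

```python
def rsf_radii(lam1, lam2, D):
    """Radii of the elliptical RSF of an XSlit GLC with orthogonal slits
    at depths lam1 and lam2, for a circular aperture of diameter D
    (Eq. (28))."""
    return abs(1.0 / lam1 - 1.0) * D / 2.0, abs(1.0 / lam2 - 1.0) * D / 2.0

# lam1 = 1 collapses one axis: the RSF degenerates into a line segment.
print(rsf_radii(1.0, 2.0, D=0.05))   # -> (0.0, 0.0125)
```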
The specific shape and orientation of the ellipse vary with the depths λ1 and λ2 of the two slits. The various cases for different λ1 and λ2 combinations are enumerated below and shown in Fig. 10:

1. When 2λ1λ2/(λ1 + λ2) > 1, the major radius is |1/λ2 − 1|·D/2, and the major axis is parallel to the direction of slit l1.
2. When λ1 = 1, the minor radius |1/λ1 − 1|·D/2 = 0, i.e., the RSF degenerates into a line segment parallel to slit l1, and the length of the line segment is |1/λ2 − 1|·D/2.
3. When 2λ1λ2/(λ1 + λ2) = 1, the RSF shape degenerates into a circular disk, and the radius of the circle is |(λ1 − λ2)/(λ1 + λ2)|·D/2.
4. When 2λ1λ2/(λ1 + λ2) < 1, the major radius is |1/λ1 − 1|·D/2, and the major axis is parallel to the direction of slit l2.
5. When λ2 = 1, the minor radius |1/λ2 − 1|·D/2 = 0, i.e., the RSF degenerates into a line segment parallel to slit l2, and the length of the segment is |1/λ1 − 1|·D/2.

The analysis reveals that the RSF caused by a 3D point in a catadioptric mirror can only be an ellipse, a circle, or a line segment. Furthermore, the shape of the RSF depends on the location of the scene point, the size of the aperture, and the camera's focus setting.

Fig. 10 Various RSF shapes: the RSF shape depends on the distance w between the aperture and the sensor planes

5 Applications of synthetic non-pinhole cameras

Although many of the non-pinhole cameras discussed above are synthetic models, they have broad applications in graphics and vision.

5.1 Synthesizing panoramas

A non-pinhole camera can combine patches from multiple pinhole cameras into a single image to overcome the FoV limits. In image-based rendering, pushbroom and XSlit panoramas can be synthesized by translating a pinhole camera along the image plane and then stitching specific columns from each perspective image. A pushbroom image assembles the same column [41], whereas an XSlit linearly varies the column [60]. The synthesized panoramas can exhibit image distortions such as apparent stretching and shrinking, and even duplicated projections of a single point [44, 56]. To alleviate the distortions, Agarwala et al. [3] constructed panoramas using arbitrarily shaped regions of the source images taken by a pinhole camera moving along a straight path, instead of selecting simple strips. The region shape in each perspective image is carefully chosen using Markov Random Field (MRF) optimization based on various properties that the panorama desires. Instead of translating the camera planarly, Shum and Szeliski [42] created panoramas on a cylindrical manifold by panning a pinhole camera around its optical center. They project the perspective images onto a common cylinder to compose the final panorama. Peleg et al. [35] proposed a mosaicing method for more general camera motions. They first determine the projection manifold according to the camera motion and then warp the source images onto the manifold to stitch the panorama.

Non-pinhole camera models are also widely used for creating computer-generated panoramas. The 1940 Disney animation Pinocchio [36] opens with a virtual camera flying over a small village. Instead of traditional panning, the camera rotates at the same time, creating an astonishing 3D effect via 2D painting. In fact, the shot was made by drawing a panoramic view with "warped perspective", as shown in Fig. 11, and then showing only a small clip at a time.

Fig. 11 Multi-perspective panorama from Disney's Pinocchio (courtesy of Disney)
Wood et al. [53] proposed to create a similar cel animation effect from 3D models. They combined elements of multiple pinhole strips into a single image using a semi-automatic image registration process. Their method relies on optimization techniques, as well as optical flow and blending transitions between views.

Popescu et al. [37] proposed the graph camera for generating a single panoramic image that simultaneously captures/renders regions of interest of a 3D scene from different perspectives. Conceptually, the graph camera is a combination of different pinhole cameras that sample the scene. A non-perspective panorama can then be generated by elaborately stitching the boundaries of multiple pinhole images. Viewing continuity with minimum redundancy is achieved through a sequence of pinhole frustum bending, splitting, and merging operations. The panoramic rendering can then be used in 3D scene exploration, summarization, and visualization.

5.2 Non-photorealistic rendering

Renderings from multiple viewpoints can be combined in ways other than panoramas. By making subtle changes in viewing direction across the imaging plane, it is possible to depict more of a scene than could be seen from a single point of view. Such images differ from panoramas in that they are intended to be viewed as a whole. Neo-cubism is an example.

Many of the works of Picasso are examples of such non-perspective images. Figure 12(a) and (b) compare one of Picasso's paintings with an image synthesized using the GLC framework [20]. Starting from a simple layout, it achieves similar multi-perspective effects. It is also possible to use multi-perspective rendering to create fake or faux-animations from still-life scenes. This is particularly useful for animating image-based models. Figure 12(c) shows three frames from a synthesized animation, each of which corresponds to a multi-perspective image rendered from a 3D light field. Zomet et al. [60] used a similar approach by using a single XSlit camera to achieve rotation effects.

Fig. 12 Non-Perspective Images. (a) Nusch Eluard by Pablo Picasso. (b) A multi-perspective image rendered using the GLC framework [20]. (c) Extracted images from a faux-animation generated by [20]. The source images were acquired by rotating a ceramic figure on a turntable. Multi-perspective renderings were used to turn the head and hind quarters of the figure in a fake image-based animation

Mei et al. [27] defined an occlusion camera that can sample visible surfaces as well as occluded ones in the reference view, to allow re-rendering new views with correct occlusions. Their occlusion camera bends the rays towards a central axis (the pole) to sample the hidden surfaces in the reference view. A 3D radial distortion centered at the pole allows the occlusion camera to see around occluders along the pole. Such distortion pulls out hidden samples according to their depth: the larger the depth, the larger the displacement of the sample, as illustrated by the sketch below. Therefore, samples that are on the same ray in a conventional perspective camera are separated to different locations in the distorted occlusion camera image according to their depth. In this way, hidden samples that are close to the silhouette become visible in the occlusion camera reference image.
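Here is a schematic sketch of this idea (ours; the distortion gain k and the linear depth weighting are illustrative choices, not the actual formulation of [27]):

```python
import numpy as np

def occlusion_camera_warp(xy, depth, pole, k=0.1):
    """Push a sample radially away from the pole by an amount that grows
    with its depth, so points that share a perspective ray separate in
    the warped image. (k is an illustrative gain, not a parameter from
    [27].)"""
    xy, pole = np.asarray(xy, float), np.asarray(pole, float)
    return pole + (xy - pole) * (1.0 + k * depth)

# Two samples on the same perspective ray end up at different positions:
print(occlusion_camera_warp([1.0, 0.0], depth=1.0, pole=[0.0, 0.0]))  # [1.1, 0.]
print(occlusion_camera_warp([1.0, 0.0], depth=5.0, pole=[0.0, 0.0]))  # [1.5, 0.]
```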
Hsu et al. [18] recently proposed a multi-scale rendering framework that can render objects smoothly at multiple levels of detail in a single image. They set up a sequence of pinhole cameras to render objects at different scales of interest and use a user-specified mask to determine the regions to be displayed in each view. The final multi-scale image is rendered by reprojecting the images of the multi-scale cameras onto the one with the largest scale and using Bézier-curve-based non-linear ray casting to ensure coherent transitions between the scale views. Their technique can achieve focus-plus-context visualization and is useful in scientific visualization and artistic rendering.

5.3 Stereo and 3D reconstruction

Traditional stereo matching algorithms for pinhole cameras have also been extended to non-pinhole geometry. Seitz [41] and Pajdla [33] independently studied all possible non-pinhole camera pairs that can have epipolar geometry. Their work suggests that only three varieties of epipolar geometry exist: planes, hyperboloids, and hyperbolic paraboloids, all corresponding to doubly ruled surfaces. Peleg et al. [34] stitched the same column of images from a rotating pinhole camera to form a circular pushbroom. They then fused two oblique circular pushbrooms to synthesize a stereo panorama. Feldman et al. [14] proved that a pair of XSlit cameras can have valid epipolar geometry if they share a slit or the slits intersect in four pairwise distinct points.

However, Seitz's and Pajdla's results also reveal that very few varieties of multi-perspective stereo pairs exist. Ding and Yu [8] introduced a new near-stereo model, which they call epsilon stereo pairs. An epsilon stereo pair consists of two non-pinhole images with a slight vertical parallax. They have shown that many non-pinhole camera pairs that do not satisfy the stereo constraint can still form epsilon stereo pairs. They then introduced a new ray-space warping algorithm to minimize stereo inconsistencies in an epsilon pair using non-pinhole collineations (homographies), which makes the epsilon stereo model a promising tool for synthesizing close-to-stereo fusions from many non-stereo pairs, as shown in Fig. 13. Most recently, Kim et al. [21] presented a method for generating near-stereoscopic views by cutting the light field. They compute the stereoscopy as the optimal cut through the light field under a depth budget, a maximum disparity gradient, and a desired stereoscopic baseline.

Fig. 13 Epsilon stereo matching on two XSlit cameras. From top to bottom: (a) shows one of the two XSlit images; (b) shows the ground truth depth map; (c) shows the recovered disparity map by treating the two images as a stereo pair and applying the graph cut algorithm; (d) shows the horizontal disparity map recovered by the epsilon stereo mapping algorithm

A special class of non-pinhole cameras are reflective and refractive surfaces. One can then view the surface reconstruction problem as a camera calibration problem. Ding et al. [9, 12] proposed a shape-from-distortion framework for recovering specular (reflective/refractive) surfaces by analyzing the local reflection GLCs and curved line images. In [9], they focused on recovering a special type of surface: near-flat surfaces such as windows and relatively flat water surfaces. Such surfaces are difficult to model because lower-order surface attributes provide little information. They divide the specular surface into piecewise triangles and estimate each local reflection GLC for recovering high-order surface properties such as curvatures.
In [12], the authors have further shown how to analyze the curving of lines to recover the GLC parameters and then the surface attributes.

6 Real non-pinhole imaging systems

In the previous sections, we discussed many conceptual non-pinhole camera models. In this section, we discuss a number of real non-pinhole cameras that are constructed by modifying a commodity camera with special optical units.

6.1 General catadioptric cameras

As mentioned above, the most commonly used class of "real" non-pinhole/multi-perspective cameras are catadioptric cameras. These cameras put a regular pinhole camera in front of a curved mirror for acquiring images with a much wider FoV, as shown in Fig. 14(a). A large FoV benefits many applications such as video surveillance, autonomous navigation, obstacle avoidance, and panoramic image acquisition.

Fig. 14 (a) A typical catadioptric image with wide FoV. (b) Forward projection: given a scene point P, the mirror surface, and the camera, find its projection in the viewing camera after reflection. It is crucial to find the reflection point Q on the mirror surface

The core problem in catadioptric cameras is to solve for the forward projection, i.e., given the viewing camera (a pinhole), the curved mirror, and a 3D scene point, how to find the projection of the point in the viewing camera, as shown in Fig. 14(b). To resolve this problem, it is crucial to find the reflection point on the mirror in order to trace the ray path from the scene point to the CoP of the pinhole camera. This is a classical inverse problem, and for complex catadioptric systems with multiple viewpoints, a closed-form solution does not exist. We review recent attempts to address the forward projection problem using ray geometry analysis.

Centric catadioptric cameras  The simplest catadioptric cameras are designed to maintain a single viewpoint, i.e., all the projection rays intersect at one common point (the effective viewpoint), in order to generate perspectively correct images from sections of the acquired image. Such systems are commonly referred to as centric catadioptric cameras. Since all projection rays from scene points form a single pinhole camera about the effective viewpoint before reflection, we can easily resolve the forward projection problem by projecting the 3D point in the virtual pinhole camera.

Nayar and Baker [5, 30] analyzed all possible classes of centric catadioptric systems. They derived a fixed viewpoint constraint requiring that all projection rays passing through the effective pinhole of the camera (after reflection) would have passed through the effective viewpoint before being reflected by the mirror surface. Since the mirror is rotationally symmetric, one can then consider this problem in 2D by taking a slice across the central axis. Assuming that the effective viewpoint is at the origin [0, 0], the effective pinhole is at [0, c], and the mirror surface is of the form z(r) = z(x, y), where r = sqrt(x² + y²), the constraint can then be written as a quadratic first-order ordinary differential equation:

  r(c − 2z)(dz/dr)² − 2(r² + cz − z²)(dz/dr) + r(2z − c) = 0   (29)

The solution to Eq. (29) reveals that only 3D mirrors swept by conic sections around the central axis can satisfy the fixed viewpoint constraint and therefore maintain a single viewpoint.
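Equation (29) is easy to check numerically. The sketch below (ours) verifies that a hyperboloid with foci at the effective viewpoint and the effective pinhole satisfies the fixed viewpoint constraint:

```python
import numpy as np

def fixed_viewpoint_residual(z, dz, r, c):
    """Left-hand side of Eq. (29); zero means the mirror profile z(r)
    keeps a single effective viewpoint."""
    return r * (c - 2 * z) * dz**2 - 2 * (r**2 + c * z - z**2) * dz + r * (2 * z - c)

# Hyperbola with foci at the viewpoint (origin) and the pinhole (0, c):
# z(r) = c/2 + a*sqrt(1 + r^2/b^2), with a^2 + b^2 = (c/2)^2.
c, a = 2.0, 0.6
b = np.sqrt((c / 2) ** 2 - a ** 2)
r = np.linspace(0.0, 2.0, 50)
z = c / 2 + a * np.sqrt(1 + r**2 / b**2)
dz = a * (r / b**2) / np.sqrt(1 + r**2 / b**2)
print(np.abs(fixed_viewpoint_residual(z, dz, r, c)).max())   # ~1e-15
```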
They have further shown two practical setups of centric catadioptric cameras: (1) positioning a pinhole camera at the focal point of a hyperboloidal mirror; and (2) orienting an orthographic camera (realized by using a tele-lens) towards the rotational axis of a paraboloidal mirror. Both designs, however, require highly accurate alignment and precise assembly of the optical components.

Non-centric catadioptric cameras  Relaxing the single viewpoint constraint allows more general but non-centric catadioptric cameras. In a non-centric catadioptric camera, the loci of virtual viewpoints form the caustic surfaces of the mirror. The centric catadioptric camera is a special case whose caustic is a point. Swaminathan et al. [44-46] proposed to use the envelope of the reflection rays for computing the caustic surface. Yu and McMillan [56] instead decompose the mirror surface into piecewise triangle patches and model each reflection patch as a GLC, as shown in Sect. 3.2.2. Recall that local reflection ray geometry observed by a pinhole or an orthographic camera can only be one of four types of GLC: XSlit, pushbroom, pinhole, or orthographic, all of which can be viewed as special cases of XSlit cameras: when the two slits intersect, the XSlit transforms into a pinhole camera; when one of the slits goes to infinity, the XSlit transforms into a pushbroom; and when both slits go to infinity, it transforms into an orthographic camera.

6.2 Solutions to the forward projection problem

GLC approximation  The key advantage of using this GLC approximation is that it provides a closed-form solution to the forward projection problem: one can decompose the mirror into piecewise GLCs, project the 3D point into each GLC, and verify whether the projection lies inside the GLC [57]. The result is of course an approximation of the real solution, and the accuracy depends on the fineness of the triangulation. To improve both efficiency and accuracy, they have further developed a dynamic tessellation scheme similar to the Level-of-Detail (LoD) technique in computer graphics. They first tessellate the reflection surface using a coarse set of GLCs and then perform standard 1-to-4 subdivisions and store the subdivision in a quad tree, as shown in Fig. 15. To forward project a 3D point Ṗ to the camera, they start from the top-level GLCs and compute the image of Ṗ's projection. They then determine which GLC contains the final projection and repeat the search on its children GLCs. The search stops when it reaches a leaf node. The detailed forward projection steps are shown in Algorithm 6.1.

Fig. 15 Solving for forward projection using GLC decomposition. (a) We can decompose a curved mirror image using piecewise GLCs. (b) A multi-resolution hierarchy can be created for querying the image of a 3D point

Algorithm 6.1: Mirror_GLCForwardProjection(glc, Ṗ)

  procedure GetRay(glc, Ṗ)
      p[u, v] ← glc.Project(Ṗ)
      if p[u, v] ∉ glc.triangle then
          return false
      if isLeaf(glc) then
          return p[u, v]
      else
          repeat
              x ← glc.subGLCs.getNext()
              q[u1, v1] ← x.Project(Ṗ)
          until q[u1, v1] ∈ x.triangle
          return GetRay(x, Ṗ)

Axial cameras  The forward projection problem can also be addressed using special catadioptric cameras such as the axial camera. The axial camera is an intermediate class of cameras that lies between the centric and non-centric ones.
In an axial camera, all the projection rays are constrained to pass through a common axis but not a common 3D point. One such model is a rotationally symmetric mirror with a pinhole camera viewing from its rotation axis, as shown in Fig. 16(a).

Axial cameras are easier to construct than the centric catadioptric ones. For example, in a centric hyperbolic catadioptric camera, the optical center of the viewing camera has to be placed precisely at the mirror's focus, whereas in an axial camera the optical center can be placed anywhere on the mirror axis to satisfy the axial geometry. The fact that all reflection rays pass through the rotation axis reveals that the local GLCs map all reflection patches to a group of XSlit cameras that share a common slit, i.e., the rotation axis. Ramalingam et al. [38] proposed a generic calibration algorithm for axial cameras by computing the projection ray for each pixel constrained by the mirror axis. Agrawal et al. [4] further provided an analytical solution of forward projection for axial cameras. Given the viewpoint and a mirror, they compute the light path from a scene point to the viewing camera by solving a closed-form, high-order forward projection equation. Conceptually, this can be done by exhaustively computing the projection for each centric ring of the virtual camera. For a spherical mirror, they derived that the projection equation reduces to 4th degree. This closed-form solution can be used to effectively compute the epipolar geometry to accelerate catadioptric stereo matching and to compose multiple axial camera images into a perspective one [47].

Another special class of axial cameras is the radial camera proposed by Kuthirummal and Nayar [22]. Their goal is to strategically capture the scene from multiple viewpoints within a single image. A radial camera consists of a conventional camera looking through a hollow, rotationally symmetric mirror polished on the inside, as shown in Fig. 16(b). The FoV of the camera is folded inwards and consequently the scene is captured both directly and from virtual viewpoints after reflection by the mirror, as shown in Fig. 16(c). By using a single camera, the radiometric properties are the same across all views. Therefore, no synchronization or calibration is required. The radial imaging system can also be viewed as a special axial camera that has a circular locus of virtual viewpoints. Similar to the regular axial camera, a closed-form solution can be derived for computing the forward projection. Further, this camera has the same epipolar geometry as the cyclographs [41] and therefore can be effectively used for omni-directional 3D reconstruction, acquiring 3D textures, and sampling and estimating surface reflectance properties such as the Bidirectional Reflectance Distribution Function (BRDF).

Fig. 16 Two examples of axial cameras. (a) A rotationally symmetric mirror with a viewing camera lying on the rotation axis. (b) A radial catadioptric camera can capture the same 3D point from different perspectives in a single image. (c) A multi-perspective image captured by (b) (courtesy of Shree Nayar)

6.3 Catadioptric projectors

Finally, one can replace the viewing pinhole camera with a projector.
Ding et al. [10] proposed the catadioptric projector by combining a digital commodity projector with specially shaped reflectors to achieve an unprecedented level of flexibility in aspect ratio, size, and FoV, as shown in Fig. 17. Their system assumes unknown reflector geometry and does not require accurate alignment between the projector and the optical units. They then use the inverse light transport technique to correct geometric distortions and scattering.

Fig. 17 Top: A panoramic catadioptric projector system that combines a regular projector with a curved plastic mirror. Bottom: The final projection uses the projector's full resolution (1024 × 768) and is displayed on a 16 m × 3 m wall

The main difference between the catadioptric camera and the catadioptric projector is that the camera uses a near-zero aperture whereas the projector requires a wide aperture to achieve bright projections. However, the wide aperture may cause severe defocus blurs. Due to the non-pinhole nature of the reflection rays, the defocus blurs are much more complicated, e.g., the blur kernels are spatially varying and non-circularly shaped. Therefore, traditional image preconditioning algorithms are not directly applicable.

The analysis in Sect. 4.3 shows that the catadioptric defocus blur can range from an ellipse to a line segment, depending on the aperture setting and the projector focal depth. To compensate for defocus blurs, Ding et al. [10] adopt a hardware solution: they change the shape of the aperture to reduce the average size of the defocus blur kernel. Conceptually, one can use a very small aperture to emulate pinhole-type projection. However, small apertures block a large amount of light and produce dark projections. Their solution is therefore to find an appropriate aperture shape that can effectively reduce the blurs without sacrificing the brightness of the projection. In their approach, they first estimate the blur kernel by projecting a dotted pattern onto the wall and fitting an ellipse to each captured dot. They then compute the average major and minor radii across all dots as a′ and b′. Using the analysis in Sect. 4.3, they prove that the major and minor radii a and b of the optimal aperture should produce a circularly shaped defocus kernel, and they have shown that the optimal aperture should be an ellipse with a = (D/2)·sqrt(a′/b′) and b = (D/2)·sqrt(b′/a′), where D is the diameter of the actual aperture.
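This aperture-shaping rule translates directly into code; the sketch below (ours) simply restates the closed form above:

```python
import numpy as np

def optimal_aperture(a_blur, b_blur, D):
    """Elliptical aperture radii that make the average defocus kernel
    circular, given the measured average major/minor blur radii a', b'
    and the original circular aperture diameter D."""
    return (D / 2.0) * np.sqrt(a_blur / b_blur), (D / 2.0) * np.sqrt(b_blur / a_blur)
```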
6.4 4D ray sampler: light field cameras

The most general non-pinhole camera should be able to sample the complete 4D ray space and then reconfigure the rays at will. This requires generalized optics that treats each optical element as a 4D ray-bender that modifies the rays in a light field [2, 24, 31]. The collected ray bundles can then be regrouped into separate measurements of the plenoptic function [1, 26]. The most straightforward scheme is to move a camera along a 2D path to sample the 4D ray space [19, 23]. Although this method is simple and easy to implement, it is only suitable for acquiring static scenes. Wilburn et al. [52] instead built a camera array to capture the light field. Constructing such a light field camera array, however, is extremely time- and effort-consuming and requires a substantial amount of engineering. The latest developments are single-camera light field designs.

Fig. 18 Different light field camera designs. Lenslet-based light field cameras place the lenslet array near the image plane of the main lens. (a) In Lytro [25], the sensor is located at the focal plane of the microlenses. (b) In Lumsdaine et al. [24], the microlenses focus on the sensor to trade angular resolution for spatial resolution. (c) The heterodyne light field camera puts a narrowband 2D cosine mask near the sensor (courtesy of Ramesh Raskar). (d) The catadioptric light field camera uses a view camera facing an array of mirrors (courtesy of Yuichi Taguchi)

Lenslet-based light field camera
Recent advances in optics manufacturing have enabled the light field to be captured with a single camera in one shot. Ng [31] designed a hand-held plenoptic camera that records the light field in a single shot by placing a lenslet array in front of the camera sensor to separate converging rays. Each microlens focuses on the main aperture plane: since the main lens is several orders of magnitude larger than a lenslet, it can be treated as being at infinity relative to the lenslets, and the sensor is therefore simply placed at the focal plane of the lenslet array. In Ng's design, the F-numbers of the main lens and each microlens are matched to avoid cross-talk between microlens images. By parameterizing the in-lens light field with a 2PP of Π_uv at the main aperture and Π_st at the lenslet array, the acquired ray space is uniformly sampled. This design has led to the commercial light field camera Lytro [25], as shown in Fig. 18(a).

Lumsdaine et al. [24] introduced a slightly different design that focuses the lenslet array on a virtual plane inside the camera. In this case, each microlens image captures more spatial samples but fewer angular samples of the focused virtual plane. This design can produce higher-resolution results when focusing near the sampled image plane. However, the lower angular resolution leads to more severe ringing artifacts in out-of-focus regions, as shown in Fig. 18(b).

Mask-based light field camera
Instead of using a lenslet array to separate light arriving at the same pixel from different directions, Veeraraghavan et al. [49] used a non-refractive patterned attenuation mask to modulate the light field in the frequency domain. Placed on the light path between the lens and the sensor, the mask attenuates light from different directions accordingly, as shown in Fig. 18(c). Viewed in the frequency domain, this process heterodynes the incoming light field. The attenuation pattern needs to be invertible to ensure that demodulation can be performed. To recover the light field, they first transform the captured 2D image into the Fourier domain and then rearrange the tiles of the 2D Fourier transform into a 4D array. Finally, the light field of the scene is computed by taking the inverse 4D Fourier transform. Further, the mask can be inserted at different locations along the optical path to achieve dynamic frequency modulation. However, the mask partially blocks the incoming light and greatly reduces light efficiency.
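The Fourier-domain recovery pipeline is compact enough to sketch directly. The fragment below assumes an idealized cosine mask that creates an n × n grid of spectral replicas, and it ignores normalization and mask-specific tile ordering; it only shows the rearrange-and-invert structure (2D FFT, tile the spectrum into 4D, inverse 4D FFT) described above, not the exact implementation of [49].

```python
import numpy as np

def recover_light_field(img, n_ang):
    """Sketch of heterodyne light field recovery (after Veeraraghavan
    et al. [49]).

    img: captured 2D sensor image whose sides are multiples of n_ang;
    n_ang: number of angular samples per dimension created by the mask.
    Returns a 4D array indexed (ang_y, ang_x, spatial_y, spatial_x)."""
    H, W = img.shape
    h, w = H // n_ang, W // n_ang
    # 1. Centered 2D spectrum of the captured image.
    F = np.fft.fftshift(np.fft.fft2(img))
    # 2. The mask has replicated the light field spectrum into an
    #    n_ang x n_ang grid of tiles; cut them apart and stack as 4D.
    tiles = F.reshape(n_ang, h, n_ang, w).transpose(0, 2, 1, 3)
    # 3. Inverse 4D Fourier transform back to the ray domain.
    lf = np.fft.ifftn(np.fft.ifftshift(tiles))
    return np.real(lf)

# Example: a 512x512 capture with a mask giving 4x4 angular samples
# lf = recover_light_field(captured_image, n_ang=4)  # shape (4, 4, 128, 128)
```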
Mirror-based light field camera
It is also possible to acquire the light field using a catadioptric mirror array, as shown in Fig. 18(d). Unger et al. [48] combined a high-resolution tele-lens camera with an array of spherical mirrors to capture the incident light field. Using a mirror array instead of a lenslet array has several advantages: it avoids the chromatic aberrations caused by refraction, it does not require elaborate calibration between the lenslet array and the sensor, it captures images with a wide FoV, and it is less expensive and reconfigurable. The disadvantages are two-fold. First, each mirror image is non-pinhole and therefore requires forward projection to associate the reflection rays with 3D points. Second, the sampled light field is non-uniform.

Two notable examples of such systems are the spherical mirror arrays of Ding et al. [11] and Taguchi et al. [47]. In [11], the authors applied the GLC-based forward projection (Sect. 6.2) to multi-view space carving for reconstructing the 3D scene. Taguchi et al. [47] developed both a mirror array and a refractive sphere array and applied the axial camera formulation (Sect. 6.2) to compute the closed-form forward projection. They demonstrated various applications including distortion correction and light field rendering.

Light field probes
Analogous to the catadioptric camera vs. catadioptric projector pairing, the dual of the light field camera is the light field probe, i.e., a light field camera whose sensor is replaced with a projector. Real light field probes have been implemented using a backlight, a diffuser, a pattern, and a single lenslet or a lenslet array, as shown in Fig. 19. Similar to Lytro, the pattern is placed at the focal plane of the lenslet array so as to simulate an array of projectors projecting towards infinity. The light field probe is essentially a multi-view display. It is also particularly useful for acquiring transparent surfaces. Notice that the light field probe enables direct estimation of ray–ray correspondences, since the viewing camera can associate each pixel with a ray.

Fig. 19 Light field probes. (a) A light field probe combines a lenslet array and a special projection pattern. (b) Similar to Lytro, each lenslet acts as a view-dependent pixel

Ye et al. [54] used a single-lens probe (a Bokode [29]) for recovering dynamic fluid surfaces. They presented a feature matching algorithm based on the Active Appearance Model (AAM) to robustly establish ray–ray correspondences. The ray–ray correspondences then directly provide the surface normals, and they derive a new angular-domain surface integration scheme to recover the surface from the normal field. Wetzstein et al. [50, 51] also used the light field probe for reconstructing complex transparent objects. They encode both spatial and angular information in a specially designed color pattern: gradients of different color channels (red and blue) encode the 2D incident ray direction, and the green channel encodes the 1D (vertical) spatial location on the pattern. The second (horizontal) spatial location can be recovered through geometric constraints. Their approach achieves highly accurate ray–ray correspondences for reconstructing the surface normals of complex static objects.
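The core geometric step in these probe-based methods, turning a ray–ray correspondence into a surface normal, follows from the vector form of Snell's law and can be sketched in a few lines. The function name and sign convention below are illustrative; the full pipelines in [50, 51, 54] add robust matching and surface integration on top of this step.

```python
import numpy as np

def normal_from_ray_pair(d_in, d_out, n1=1.0, n2=1.33):
    """Recover a surface normal from a single refractive ray-ray
    correspondence, assuming one refraction with known indices n1 -> n2
    (defaults: air to water, as for a fluid surface).

    d_in, d_out: direction vectors of the incident and refracted rays,
    both pointing along the direction of light propagation. The vector
    form of Snell's law implies that n2*d_out - n1*d_in is parallel to
    the surface normal, since the tangential components cancel."""
    d_in = np.asarray(d_in, float)
    d_out = np.asarray(d_out, float)
    d_in /= np.linalg.norm(d_in)
    d_out /= np.linalg.norm(d_out)
    n = n2 * d_out - n1 * d_in
    n /= np.linalg.norm(n)
    # This convention yields the normal pointing into the second medium;
    # negate it if the outward normal is desired.
    return n

# Example: a ray going straight down, bent at a slightly tilted water surface
# n = normal_from_ray_pair([0, 0, -1], [0.05, 0, -0.999])
```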
7 Future directions

There are several directions for future research related to non-pinhole cameras.

The ray geometry theory may lead to new acquisition devices for many image-based rendering (IBR) and computational photography applications. For example, it would be useful to design specially curved mirrors that efficiently capture the light field. The pinhole-based and mirror-based light field cameras provide one way to sample the ray space, but the current spherical mirror arrays suffer from artifacts such as non-uniform sampling and image distortions. Specially shaped mirrors may be able to sample the ray space more evenly, e.g., via a different type of ray subspace (such as GLC-type mirrors).

In addition to non-pinhole cameras, one can potentially develop non-pinhole light sources by replacing the viewing camera in a catadioptric system with a point light source. Previous image-based relighting, surface reflectance sampling, and shape recovery algorithms are restricted by the geometric constraints of the light source. By strategically devising different types of lighting, one can improve how the radiance leaving a surface is measured and sampled. Moreover, a non-pinhole light source will cast specially shaped shadows; in particular, the shadow of a 3D line segment can be a curve under a non-pinhole light source. This may lead to new shape-from-shadow algorithms that determine the depth of an object by analyzing the shape of the shadows at its silhouettes.

Finally, it is possible to develop a new theoretical framework based on computational and differential geometry to characterize and catalog the structures of ray space. For example, it would be highly useful to model algebraic ray subspaces (e.g., ray simplices) and analyze how ray geometries relate to specific types of non-pinhole distortions. Further, by correlating the geometric attributes of the reflector/refractor surface with these distortions, one can explore novel shape-from-caustics, shape-from-distortion, or depth-from-defocus algorithms for recovering highly complex specular surfaces.

References

1. Adelson, E.H., Bergen, J.R.: The plenoptic function and the elements of early vision. In: Computational Models of Visual Processing, pp. 3–20 (1991)
2. Adelson, E., Wang, J.: Single lens stereo with a plenoptic camera. IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 99–106 (1992)
3. Agarwala, A., Agrawala, M., Cohen, M., Salesin, D., Szeliski, R.: Photographing long scenes with multi-viewpoint panoramas. In: ACM SIGGRAPH, pp. 853–861 (2006)
4. Agrawal, A., Taguchi, Y., Ramalingam, S.: Analytical forward projection for axial non-central dioptric and catadioptric cameras. In: Proceedings of the 11th European Conference on Computer Vision, pp. 129–143 (2010)
5. Baker, S., Nayar, S.K.: A theory of single-viewpoint catadioptric image formation. Int. J. Comput. Vis. 35(2), 1–22 (1999)
6. Buehler, C., Bosse, M., McMillan, L., Gortler, S., Cohen, M.: Unstructured lumigraph rendering. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '01, pp. 425–432 (2001)
7. Cohen-Steiner, D., Morvan, J.-M.: Restricted Delaunay triangulations and normal cycle. In: Proceedings of the Nineteenth Annual Symposium on Computational Geometry, pp. 312–321 (2003)
8. Ding, Y., Yu, J.: Epsilon stereo pairs. In: Proceedings of the British Machine Vision Conference (2007)
9. Ding, Y., Yu, J.: Recovering shape characteristics on near-flat specular surfaces. In: Computer Vision and Pattern Recognition (2008)
10. Ding, Y., Xiao, J., Tan, K.-H., Yu, J.: Catadioptric projectors. In: Computer Vision and Pattern Recognition, pp. 2528–2535 (2009)
11. Ding, Y., Yu, J., Sturm, P.: Multiperspective stereo matching and volumetric reconstruction. In: Proceedings of the 12th IEEE International Conference on Computer Vision, pp. 1827–1834 (2009)
12. Ding, Y., Yu, J., Sturm, P.: Recovering specular surfaces using curved line images. In: Computer Vision and Pattern Recognition, pp. 2326–2333 (2009)
13. Ding, Y., Xiao, J., Yu, J.: A theory of multi-perspective defocusing. In: Computer Vision and Pattern Recognition (2011)
14. Feldman, D., Pajdla, T., Weinshall, D.: On the epipolar geometry of the crossed-slits projection. In: Proceedings of the Ninth IEEE International Conference on Computer Vision (2003)
15. Gortler, S.J., Grzeszczuk, R., Szeliski, R., Cohen, M.F.: The lumigraph. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '96, pp. 43–54 (1996)
16. Gupta, R., Hartley, R.I.: Linear pushbroom cameras. IEEE Trans. Pattern Anal. Mach. Intell. 19(9), 963–975 (1997)
17. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2003)
18. Hsu, W.-H., Ma, K.-L., Correa, C.: A rendering framework for multiscale views of 3D models. In: Proceedings of the SIGGRAPH Asia Conference, pp. 131:1–131:10 (2011)
19. Isaksen, A., McMillan, L., Gortler, S.J.: Dynamically reparameterized light fields. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp. 297–306 (2000)
20. Yu, J., McMillan, L.: A framework for multiperspective rendering. In: Proceedings of Rendering Techniques, Eurographics Symposium on Rendering (2004)
21. Kim, C., Hornung, A., Heinzle, S., Matusik, W., Gross, M.: Multi-perspective stereoscopy from light fields. ACM Trans. Graph. 30(6), 190:1–190:10 (2011)
22. Kuthirummal, S., Nayar, S.K.: Multiview radial catadioptric imaging for scene capture. In: ACM SIGGRAPH 2006, pp. 916–923 (2006)
23. Levoy, M., Hanrahan, P.: Light field rendering. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '96, pp. 31–42 (1996)
24. Lumsdaine, A., Georgiev, T.: The focused plenoptic camera. In: Proceedings of the IEEE International Conference on Computational Photography, pp. 1–8 (2009)
25. Lytro. www.lytro.com
26. McMillan, L., Bishop, G.: Plenoptic modeling: an image-based rendering system. In: Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp. 39–46 (1995)
27. Mei, C., Popescu, V., Sacks, E.: The occlusion camera. Comput. Graph. Forum 24, 335–342 (2005)
28. Meyer, M., Desbrun, M., Schröder, P., Barr, A.H.: Discrete differential-geometry operators for triangulated 2-manifolds. In: Proc. Visualization and Mathematics, pp. 35–57 (2002)
29. Mohan, A., Woo, G., Hiura, S., Smithwick, Q., Raskar, R.: Bokode: imperceptible visual tags for camera based interaction from a distance. In: ACM SIGGRAPH (2009)
30. Nayar, S.: Catadioptric omnidirectional camera. In: Computer Vision and Pattern Recognition, pp. 482–488 (1997)
31. Ng, R.: Fourier slice photography. In: ACM SIGGRAPH 2005 Papers, pp. 735–744 (2005)
32. Pajdla, T.: Stereo with oblique cameras. In: IEEE Workshop on Stereo and Multi-Baseline Vision, pp. 85–91 (2001)
33. Pajdla, T.: Geometry of two-slit camera. Research Report CTU-CMP-2002-02
34. Peleg, S., Ben-Ezra, M.: Stereo panorama with a single camera. In: Computer Vision and Pattern Recognition (1999)
35. Peleg, S., Rousso, B., Rav-Acha, A., Zomet, A.: Mosaicing on adaptive manifolds. IEEE Trans. Pattern Anal. Mach. Intell. 22(10), 1144–1154 (2000)
36. Walt Disney Productions: Pinocchio (1940). Movie
37. Popescu, V., Rosen, P., Adamo-Villani, N.: The graph camera. In: ACM SIGGRAPH Asia (2009)
38. Ramalingam, S., Sturm, P., Lodha, S.K.: Theory and calibration for axial cameras. In: Asian Conference on Computer Vision, vol. 1, pp. 704–713 (2006)
39. Raskar, R., Tumblin, J., Mohan, A., Agrawal, A., Li, Y.: Computational photography, pp. 1–20. Eurographics Association (2006)
40. Rucker, R.: The Fourth Dimension: Toward a Geometry of Higher Reality. Houghton Mifflin, Boston (1984)
41. Seitz, S.M., Kim, J.: The space of all stereo images. Int. J. Comput. Vis. 48(1), 21–38 (2002)
42. Shum, H.-Y., Szeliski, R.: Construction of panoramic image mosaics with global and local alignment. Int. J. Comput. Vis. 48, 151–152 (2002)
43. Soler, C., Subr, K., Durand, F., Holzschuch, N., Sillion, F.: Fourier depth of field. ACM Trans. Graph. 28, 1–12 (2009)
44. Swaminathan, R., Grossberg, M., Nayar, S.: Caustics of catadioptric cameras. In: Proceedings of the Eighth IEEE International Conference on Computer Vision, vol. 2, pp. 2–9 (2001)
45. Swaminathan, R., Grossberg, M.D., Nayar, S.K.: A perspective on distortions. In: Computer Vision and Pattern Recognition, pp. 594–601 (2003)
46. Swaminathan, R., Grossberg, M.D., Nayar, S.K.: Non-single viewpoint catadioptric cameras: geometry and analysis. Int. J. Comput. Vis. 66(3), 211–229 (2006)
47. Taguchi, Y., Agrawal, A., Veeraraghavan, A., Ramalingam, S., Raskar, R.: Axial-cones: modeling spherical catadioptric cameras for wide-angle light field rendering. In: ACM SIGGRAPH Asia, pp. 172:1–172:8 (2010)
48. Unger, J., Wenger, A., Hawkins, T., Gardner, A., Debevec, P.: Capturing and rendering with incident light fields. In: EGRW, pp. 141–149 (2003)
49. Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., Tumblin, J.: Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing. In: ACM SIGGRAPH (2007)
50. Wetzstein, G., Raskar, R., Heidrich, W.: Hand-held schlieren photography with light field probes. In: Proceedings of the IEEE International Conference on Computational Photography, pp. 1–8 (2011)
51. Wetzstein, G., Roodnick, D., Raskar, R., Heidrich, W.: Refractive shape from light field distortion. In: Proceedings of the 13th IEEE International Conference on Computer Vision (2011)
52. Wilburn, B., Joshi, N., Vaish, V., Talvala, E.-V., Antunez, E., Barth, A., Adams, A., Horowitz, M., Levoy, M.: High performance imaging using large camera arrays. In: ACM SIGGRAPH, pp. 765–776 (2005)
53. Wood, D.N., Finkelstein, A., Hughes, J.F., Thayer, C.E., Salesin, D.H.: Multiperspective panoramas for cel animation. In: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp. 243–250 (1997)
54. Ye, J., Ji, Y., Li, F., Yu, J.: Angular domain reconstruction of dynamic 3D fluid surfaces. In: Computer Vision and Pattern Recognition (2012)
55. Yu, J., McMillan, L.: General linear cameras. In: ECCV (2004)
56. Yu, J., McMillan, L.: Modelling reflections via multiperspective imaging. In: Computer Vision and Pattern Recognition, pp. 117–124 (2005)
57. Yu, J., McMillan, L.: Multiperspective projection and collineation. In: Proceedings of the 10th IEEE International Conference on Computer Vision (2005)
58. Yu, J., Yin, X., Gu, X., McMillan, L., Gortler, S.: Focal surfaces of discrete geometry. In: Proceedings of the Fifth Eurographics Symposium on Geometry Processing, pp. 23–32 (2007)
59. Yu, J., McMillan, L., Sturm, P.: Multi-perspective modelling, rendering and imaging. Comput. Graph. Forum 29(1), 227–246 (2010)
60. Zomet, A., Feldman, D., Peleg, S., Weinshall, D.: Mosaicing new views: the crossed-slits projection. IEEE Trans. Pattern Anal. Mach. Intell. 25(6), 741–754 (2003)
Jinwei Ye received her B.E. degree from the Department of Electrical Engineering, Huazhong University of Science and Technology in 2009. She is now a Ph.D. student at the Department of Computer and Information Sciences, University of Delaware. Her research interests include computational photography and ray geometry.

Jingyi Yu is an associate professor in the Computer and Information Sciences Department at the University of Delaware. He received his B.S. from Caltech in 2000 and his M.S. and Ph.D. degrees in EECS from MIT in 2005. His research interests span a range of topics in computer graphics, computer vision, and image processing, including computational photography, medical imaging, nonconventional optics and camera design, tracking and surveillance, and graphics hardware.