key: cord-0464451-u4gcj5xh
authors: Sinclair, Sean R.; Banerjee, Siddhartha; Yu, Christina Lee
title: Sequential Fair Allocation: Achieving the Optimal Envy-Efficiency Tradeoff Curve
date: 2021-05-11
journal: nan
DOI: nan
sha: af44cf3229f309f15ee4a86058c8df1c3e1598af
doc_id: 464451
cord_uid: u4gcj5xh

We consider the problem of dividing limited resources to individuals arriving over $T$ rounds. Each round has a random number of individuals arrive, and individuals can be characterized by their type (i.e. preferences over the different resources). A standard notion of `fairness' in this setting is that an allocation simultaneously satisfy envy-freeness and efficiency. The former is an individual guarantee, requiring that each agent prefers their own allocation over the allocation of any other; in contrast, efficiency is a global property, requiring that the allocations clear the available resources. For divisible resources, when the number of individuals of each type are known upfront, the above desiderata are simultaneously achievable for a large class of utility functions. However, in an online setting when the number of individuals of each type are only revealed round by round, no policy can guarantee these desiderata simultaneously, and hence the best one can do is to try and allocate so as to approximately satisfy the two properties. We show that in the online setting, the two desired properties (envy-freeness and efficiency) are in direct contention, in that any algorithm achieving additive counterfactual envy-freeness up to a factor of $L_T$ necessarily suffers a efficiency loss of at least $1 / L_T$. We complement this uncertainty principle with a simple algorithm, HopeGuardrail, which allocates resources based on an adaptive threshold policy and is able to achieve any fairness-efficiency point on this frontier. In simulation results, our algorithm provides allocations close to the optimal fair solution in hindsight, motivating its use in practical applications as the algorithm is able to adapt to any desired fairness efficiency trade-off.

Our work here is motivated by a problem faced by a collaborating food-bank (Food Bank for the Southern Tier of New York (FBST) [26] ) in operating their mobile food pantry program. Recent demands for food assistance have climbed at an enormous rate, and an estimated fourteen million children are not getting enough food due to the COVID-19 epidemic in the United States [36, 12] . With sanctions on operating in-person stores, many foodbanks have increased their mobile food pantry services. In these systems, the mobile food-bank must decide on how much food to allocate to a distribution center on arrival without knowledge of demands in future locations. This model also extends as a representation of broader stockpile allocation problems (such as vaccine and medical supply allocation) and reservation mechanisms.

As a simplified example (see Section 3 for the full model, including multi resources and individual types), every day, the mobile food pantry uses a truck to deliver B units of food supplies to individuals over T rounds (where each round can be thought of as a distribution location: soup kitchens, pantries, nursing homes, etc). When the truck arrives at a site t (or round t), the operator observes N t individuals and chooses how much to allocate to each individual (X t ∈ R Nt ) before moving to the next round. The number of people assembling at each site changes from day to day, and the operator typically does not know the number of individuals at later sites (but has a sense of the distribution based on previous visits).

In offline problems, where the number of individuals at each round (N t ) t∈ [T ] are known to the principal in advance, there are many well-studied notions of fair allocations of resources. One individual guarantee, envy-freeness, requires that each individual prefers their own allocation over the allocation of any other. In contrast, efficiency is a global property, requiring that the allocations clear the available resources. For divisible resources, the above desiderata are simultaneously achievable for a large class of utility functions, with multiple resources, and is easily computed (via a convex program) by maximizing the Nash Social Welfare (NSW) objective subject to allocation constraints [51, 23] . As an example, in this (simplified) setting, the fair allocation is easily computed by allocating X opt = B N to each individual, where N = t∈[T ] N t is the total number of people across all rounds. This allocation is clearly envy-free (as each individual receives an equal allocation), and is efficient (as all of the resources are exhausted); its also easy to see that this is the only allocation that satisfies these two properties simultaneously.

Many practical settings, however, operate more akin to the FBST mobile food pantry, where the principal makes allocation decisions online with incomplete knowledge of the demand for future locations. However, these principals do have access to historical data allowing them to generate histograms over the number of individuals for each round (or potentially just first moment information). Designing good allocation algorithms in such settings necessitates harnessing the (Bayesian) information of future demands to ensure equitable access to the resource, while also adapting to the online realization of demands as they unfold to ensure efficiency.

Satisfying any one of these properties is trivially achievable in online settings. The solution that allocates X t = 0 to each individual at location t satisfies hindsight envy-freeness as each individual is given an equal allocation. The solution that allocates X 1 = B/N 1 to individuals at the first location and X t = 0 for t ≥ 2 satisfies efficiency as the entire budget is exhausted at the first location. A more difficult challenge in this setting is achieving low counterfactual envy, ensuring that the allocations made by the algorithm (X t ) are close to what each individual should have received with the fair solution in hindsight (B/N ). Here we tackle these important challenges by defining meaningful notions of approximately-fair online allocations and develop algorithms which are able to utilize distributional knowledge to achieve allocations that strike a balance between the competing objectives of envy-freeness and efficiency. 

Achievable via Guarded-Hope (3) Envy (b) Envy -∆ efficiency Figure 1 : Graphical representation of the major contribution (Informal Theorem 1). Here, the xaxis denotes ∆ EF (the maximum difference between utility individuals receive from the algorithm and the fair allocation in hindsight) or Envy (the maximum difference between utility individuals receive from the algorithm), and the y-axis denotes ∆ efficiency , the remaining resources. The dotted line represents the impossibility due to statistical uncertainty in the optimal allocation, and the region below the solid line represents the impossibility due to the envy-efficiency uncertainty principle. In the trade-off between Envy and ∆ efficiency we have an additional discrepancy between the lower and upper bounds (represented in yellow) which reduces via O( 1 T ).

In sequential settings, one way to measure the (un) fairness of any online allocation (X alg ) is in terms of its counterfactual distance (for both envy and efficiency) when compared to the optimal fair allocation in hindsight (i.e., offline allocation X opt ). Another measure is hindsight envy (when compared only to allocations made by the algorithm). In particular, we define the counterfactual envy as ∆ EF = u(X opt θ , θ) − u(X alg θ , θ) ∞ to be the maximum difference in utility between the algorithm's allocation and the offline allocation where agents are characterized by their type θ, define the hindsight envy as Envy = max t,t ,θ,θ u(X alg t ,θ , θ) − u(X alg t,θ , θ) to be the maximum difference between the utility individuals would have received if given someone else's allocations, and let ∆ efficiency = B − t N t X alg t be the algorithm's total leftover resources. These are all very stringent metrics, akin to the notion of regret in online decision-making settings, which subsume many other objectives.

In these settings with competing objectives, practitioners often resort to ad-hoc rules of thumb, heuristics, and trial-and-error adjustments of the system to attempt to manage the balance between objectives. How these criteria interact and trade-off amongst one another is often not well understood or characterized, and furthermore there typically does not exist a single best "ranking", or a clear single objective function that determines which tradeoffs are better than others. In fact, minimizing some combination of (∆ EF , Envy, ∆ efficiency ) can be formulated as a Markov decision process (MDP); however, as these metrics depend on the entire allocation, the complexity of finding the optimal policy is exponential in the number of rounds, and also, may be difficult to interpret [39] . Moreover, it is much harder to use MDP formulations to explore the tradeoff between the objectives.

Our main technical contribution is to provide a complete characterization of the achievable pairs of (∆ EF , Envy, ∆ efficiency ). Our results hold in expectation and with high probability, under multiple divisible resources, and with a finite set of individual types with linear utilities. In particular, we show the following informal theorem (see Fig. 1 for a graphical representation). 2b. (Hindsight Envy-Efficiency Uncertainty Principle): Any online allocation algorithm necessarily suffers ∆ efficiency min{ √ T , 1/Envy}.

For any choice of L T , with probability at least 1 − δ, Guarded-Hope with parameter L T achieves:

In short, our results show that both definitions of envy and waste must be inversely proportional to one another such that decreasing the envy requires increasing the waste and vice versa. The lower bounds (1 and 2) are established using anti-concentration arguments alongside understanding the fundamental gap in ensuring enough resources to allocate close to the estimated optimal solution while simultaneously trying to eliminate waste.

Furthermore, we provide a simple algorithm, Guarded-Hope, which achieves the correct tradeoff between envy and waste, matching the lower bound in terms of T up to logarithmic factors. Given an input of L T , our algorithm satisfies a hindsight envy bound of Envy L T , counterfactual envy bound of ∆ EF max{1/ √ T , L T }, with waste bounded by ∆ efficiency max{ √ T , 1/L T } with probability 1 − δ. Our algorithm achieves this using novel concentration arguments on the optimal Nash Social Welfare solution, utilizing a sensitivity argument on the solution to the optimization problem instead of the objective (as commonly used for competitive ratio guarantees) to learn a lower guardrail on the optimal solution in hindsight. Given this, we construct an upper guardrail to satisfy the desired ∆ EF and Envy bound. We then achieve the proper trade-off by carefully balancing allocating to the established lower guardrail with the upper guardrail while simultaneously ensuring the algorithm never runs out of budget.

To get some intuition into the envy-efficiency uncertainty principle, consider the simple food bank example described above for a single resource (with arrivals N t in each location, and X opt = B/N where N = t∈[T ] N t ). For convenience we temporarily assume that each agents utility is directly proportional to their allocation (i.e. u(X, θ) = X)). Consider allocation X 1 at the first location: via standard concentration arguments, one can find a high probability lower confidence bound for B/N with a half-width on the order of 1/ √ T . Now it's not hard to argue that allocating according to the lower confidence bound at all locations achieves counterfactual envy of ∆ EF ≈ 1/ √ T , Envy = 0, and ∆ efficiency ≈ √ T . This corresponds to the cusp of the efficiency-envy trade-off curves in Fig. 1 . Now if we relax the ∆ EF or Envy constraint to ≈ 1/T 1/3 and use the naive static policy of always allocating via the lower confidence bound on that order, we get a waste of T · T −1/3 = T 2/3 . Our algorithm instead takes a different approach, using the lower confidence bound of order 1/ √ T as the lower guardrail allocation, and sets the upper guardrail allocation to be the lower one plus the desired bound on ∆ EF or Envy. If we were to establish that the algorithm always allocates within the guardrails, we automatically have the desired bound on ∆ EF and Envy. The main additional factor in achieving the tradeoff for ∆ efficiency is ensuring we properly allocate according to the upper threshold while ensuring we do not run out of budget to ensure the lower threshold allocation. With this Guarded-Hope achieves ∆ efficiency ≈ T 1/3 , which furthermore is the best possible. Moreover, we complement our theoretical results with experiments highlighting the empirical performance of different algorithms (both on synthetic settings, as well as a dataset based on mobile food pantry operations), which shows that Guarded-Hope has much lower waste and envy compared to static under-allocation, as well as other natural heuristics.

While fairness in resource allocation is well-studied in offline and adversarial settings, fairness metrics for the sequential stochastic setting are poorly understood (especially when individuals are arriving online). Our proposed metrics and results give a novel way of extending Varian's definitions of fairness to the sequential setting. Moreover, ours is the first result to provide guarantees for fair online resource allocation with high probability for multiple resource and multiple type settings. Most existing work aims to show competitive ratio or additive guarantees on the Nash social welfare objective [9] or focus on the max-min objective [37, 39] . Such guarantees are dangerously misleading in that the resultant allocations may exhibit clear unfairness in hindsight. Similarly, an ex ante or probabilistic guarantee may also be perceived as unfair -both allocating 1 unit with certainty, and allocating 10 units with probability 1/10, give the same ex ante guarantee. In contrast, our chosen metrics and theoretical results provides a firm basis for counterfactual and ex-post individual fairness guarantees. While we do not believe our work gives a final answer in the theoretical and practical understanding of fairness in online allocation, we hope it will add to the conversation of incorporating ethics into sequential AI algorithms. More discussion on the advantages and disadvantages of our proposed fairness metrics is in Appendix C.

In addition to the mobile food pantry allocation problem which forms the focus of our work, we believe our ideas can prove useful in several other settings: Stockpile Allocation. In many healthcare systems or resource allocation problems, government mechanisms decide how to allocate critical resources to states, individuals, or hospitals. For example, the US federal government was tasked with distributing Remdesivir, an antiviral drug used early in the panedemic for COVID-19 treatment [38] . More recently a propros, states and government organizations have been deciding how to allocate COVID-19 (or influenza) vaccines to various population demographics across several rounds [55, 32] . In these scenarios on a monthly basis, each state is given a fixed amount of the resource (say COVID-19 vaccinations) and is tasked with distributing these to individuals across various distribution locations. While the primary goal is to develop efficient allocations (remarking the popularized term of getting "shots in arms"), an alternative objective may be to ensure equitable access to the resource [48, 22, 39] . Reservation Mechanisms. These are key for operating shared high-performance computing (HPC) systems [30] . Cluster centers for HPC receive numerous requests online with varying demands for CPUs and GPUs. Algorithms must allocate resources to incoming jobs with only distributional knowledge of future resource demands. Important to these settings is the large number of resources (number of GPUs, RAM, etc available at the center), requiring algorithms that scale to higher-dimensional problems.

The model and assumptions are introduced in Section 3. The statistical uncertainty principle and the envy-efficiency uncertainty principle are presented in Section 4. Sensitivity and concentration results on the counterfactual fair allocation is in Section 5. Our proposed allocation algorithm Guarded-Hope and its theoretical guarantees are in Section 6 and Section 7. The experiments are in Section 8.

Fairness in resource allocation and the use of Nash Social Welfare was pioneered by Varian in his seminal work [51, 52] . Since then researchers have investigated fairness properties for both offline and online allocation, in settings with divisible or indivisible resources, and when either the individuals or resources arrive online. We now briefly discuss some related works; see [3] for a comprehensive survey. What distinguishes our setting from many of the previous works is that we consider the online Bayesian setting with a known distribution. Many previous works are either limited to offline or non-adaptive algorithms, or consider adversarial online arrivals. Trade-offs between various fairness metrics has been considered previously in the literature, but for classification-based fairness metrics on protected attributes instead of allocation based ones [35] . Food Bank and Health Care Operations: There is a growing body of work in the operations research literature addressing logistics and supply chain issues in the area of humanitarian relief, health-care, and food distribution [47, 44, 5, 55, 32] . The research focuses on designing systems which balance efficiency, effectiveness, and equity. In [24] they study the logistical challenges of managing vehicles with limited capacity to distribute food and provide routing and scheduling protocols. In [37, 39] they consider sequential allocation with an alternative objective of maximizing the minimum utility (also called the leximin in the literature [42] ). We instead consider sequential allocation of resources under the objectives of achieving approximate fairness notions with regards to envy and efficiency. Cake Cutting: Cake cutting serves as a model for dividing a continuous object (whether that be a cake, advertisement space, land, etc) [17, 46] . Under this model, prior work considers situations where individuals arrive and depart during the process of dividing a resource, where the utility of an agent is a set-function on the interval of the resource received. Researchers analyze the offline setting to develop algorithms to allocate the resource with a minimal number of cuts [18] , or online under adversarial arrivals [54] . Our model instead imposes stochastic assumptions on the number of arriving individuals and characterizes probabilistic instead of sample-path fairness criteria. Online Resources: One line of work considers the resource (here to be thought of as the units of food, processing power, etc) are online and the agents are fixed [13, 4, 41, 40, 2, 9, 10, 16] . In [56] they study the tradeoffs between fairness and efficiency when items arrive under several adversarial models. Another common criteria is designing algorithms which are envy-free up to one item, where researchers design algorithms that can reallocate previously allocated items, but try to minimize these adjustments [31, 8] . These problems are in contrast to our model where instead the resources are fixed and depleting over time, and individuals arrive online. Online Individuals: The other setting more similar to our work considers agents as arriving online and the resources as fixed. In [33] they consider this setting where the resources are indivisible with the goal of maximizing utilitarian welfare (or the sum of utilities) which provides no guarantees on individual fairness. Another approach in [29] considers a scheduling setting where agents arrive and depart online. Each agent has a fixed and known arrival time, departure time, and demand. The goal then is to determine a schedule and allocation which is Pareto-efficient and envy-free. Another line of work [28, 20, 27] considers fair division with minimal disruptions on previous allocations. Their fairness ratio can be viewed as a competitive ratio form of our counterfactual envy definition (Definition 3.4). Non-adaptive Allocations: A separate line of research considers fairness questions for resource allocations in a similar setting where the utilities across groups are drawn from known probability distributions [22, 25] . They investigate probabilistic versions of fairness, where the goal is to quantify the discrepancy between the objectives of ensuring the expected utilization of the resources is large (ex-ante Pareto-optimal), while the probability of receiving the resource is proportional across groups (ex-ante proportional). However, they consider algorithms which decide on the entire allocation for each agent upfront before observing the demand rather than adaptive policies. Adaptive Allocations: In contrast, we consider a model where the principal makes decisions on how much of the resources after witnessing the number of individuals in a round. Most similar to our work is recent work analyzing a setting where individuals arrive over time and do not depart, so that the algorithm can allocate additional resources to individuals who arrived in the past [34] . We instead consider a stochastic setting where individuals arrive and depart in the same step with the goal of characterizing allocations that cannot reallocate to previous agents. Other papers either seek competitive ratios in terms of the Nash Social Welfare objective [7, 11, 9] , or derive allocation algorithms which perform well in terms of max-min [37, 39] . Our work differs from these in that we impose additional distribution assumptions (notably that the variance of the demand is smaller order than its mean, more common in real-world scenarios). The results in [39] can be viewed as highlighting a trade-off between efficiency and the max-min objective, although achieving efficiency of zero is more trivial in that setting as the algorithm designer is not penalized for giving all leftover resources at the last location. In contrast, under our setting eliminating the resources at the final round penalizes the algorithm in terms of both ∆ EF and Envy, requiring a more nuanced discussion on the trade-off between efficiency and envy.

We use R + to denote the set of non-negative reals, and X ∞ = max i,j |X i,j | to denote the matrix maximum norm, and cX to denote entry-wise multiplication for a constant c. When comparing vectors, we use X ≤ Y to denote that each component X i ≤ Y i .

A principal is tasked with dividing K divisible resources among a population of individuals who are divided between T distinct rounds -these could represent T locations visited sequentially by the principal (for example, food distribution sites visited by a mobile pantry), or T consecutive time periods (for example, days over which a hospital must stretch some limited medical supply before it is restocked).

Each resource k ∈ [K] has a fixed initial budget B k that the principal can allocate across these rounds. Each round has a (possibly random) set of distinct individuals arriving to request a share of the resources. Individuals are characterized by their type θ ∈ Θ, corresponding to their preferences over the K resources, where individuals of type θ receive utility u(x, θ) : R K × Θ → R for an allocation x. We henceforth assume that the set of possible types has finite cardinality |Θ|, and denote (N t,θ ) θ∈Θ to be the vector containing the number of arrivals of each type in round t, where N t,θ denotes the number of type-θ arrivals. Each (N t,θ ) θ∈Θ,t≤T is drawn from some known distribution F; note that these distributions across rounds need not be identical.

In the ex-post or offline setting, the number of individuals per round (N t,θ ) t∈[T ],θ∈Θ are known in advance and can be used by the principal to choose allocations X ∈ R T ×|Θ|×K for individuals in each round t of type θ. In the online setting the principal considers each round sequentially in a fixed order t = 1, . . . , T , is informed of the number of individuals (N t,θ ) θ∈Θ in that round, chooses allocation X alg t ∈ R |Θ|×K before continuing on to the next round, where X alg t,θ,k denotes the allocation of resource k earmarked for each of the N t,θ individuals of type θ in that round. This assumption includes not only i.i.d demands, but can also be extended to distributions that arise from Markov chains or latent variable models (see Section 7 for more details). We impose the additional assumption that the algorithm allocates the same allocation to each of the N t,θ individuals of type θ. This is without loss of generality, as one of the primary goals of the paper is to investigate envy, whereby one out of any two individuals of type θ in round t will envy the other unless their allocations are the same. Allocation decisions are irreversible, and must obey the overall budget constraints.

Assumptions: For convenience we assume that for every t ∈ [T ] and θ ∈ Θ we have that N t,θ ≥ 1 almost surely (although our techniques work more broadly so long as P(N t,θ = 0) does not scale with T ). We also assume that N t,θ are independent, with variance Var[N t,θ ] = σ t,θ > 0, and mean absolute deviation

, and assume that σ 2 min , σ 2 max , µ max are given constants. These assumptions are strictly for ease of notation; in particular, our results only depend on mild conditions on the expectation and tails of the sums of future arrivals t >t N t ,θ of each type. Extensions will be discussed in Section 7. We define

as the average resource per individual; for ease of understanding, β avg can be viewed as being a constant, but our results hold for any β avg .

We also primarily focus on utility functions that are linear, i.e., where u(x, θ) = w θ , x , where the latent individual type θ is characterized by w θ ∈ R K >0 as a vector of preferences over each of the different resources. Here again, we believe that our techniques extend to more general homothetic preferences, but for ease of notation (and given the richness of linear utilities), choose to focus on these. More details on modeling individual utilities are in Section 8. While the algorithm development and upper bounds hold more broadly under homothetic preferences, the lower bounds require more analysis.

Finally, we assume that our resources are divisible, in that allocations can take values in R K + . In our particular regimes of interest where we scale the number of rounds and budgets, this is easy to relax to integer allocations with vanishing loss in performance.

Additional Notation: We use B = (B 1 , . . . , B K ) the budget vector. For any location t and type θ, we use N ≥t,θ to denote t ≥t N t ,θ . If the subscript t is omitted we use N θ = T t=1 N t,θ to denote the total number of individuals of type θ. We additionally letρ ≥t,θ = 1 T −t t ≥t ρ t,θ and similarly forσ 2 ≥t,θ andμ ≥t,θ . A table with all our notation is provided in the Appendix.

The assumption that latent types Θ are finite is common in decision-making settings, as in practice, the set of possible types is approximated from historical data. One limiting assumption is that in the online setting, the principal only knows the number of individuals from one location at a time. In reality the principal could have some additional information about future locations, e.g. via calling ahead, that could be incorporated in deciding an allocation. Our algorithmic approach naturally incorporates such additional information. Additionally we assume a distinct set of individuals across each round, and consider the rounds t as fixed and distinct locations.

To define an ex-post fair allocation, i.e., with known number of individuals (N t,θ ) t∈[T ],θ∈Θ across rounds in [T ], we adopt an approach proposed by [51] (commonly referred to as 'Varian Fairness'), which is widely used in the operations research and economics literature. We will refer to this as fairness for brevity; for a more detailed discussion on the advantages and limitations of this model, see [50] or Appendix C.

Definition 3.1 (Fair Allocation). Given types Θ, number of individuals of each type (N t,θ ) t∈[T ],θ∈Θ and utility functions (u(·, θ)) θ∈Θ , an allocation X = {X t,θ ∈ R K + | T t=1 θ∈Θ N t,θ X t,θ ≤ B} is said to be fair if it simultaneously satisfies the following:

1. Envy-Freeness (EF): For every pair of rounds t, t and types θ, θ , we have u(X t,θ , θ) ≥ u(X t ,θ , θ).

for some round t and type θ, there exists some other round t and type θ such that u(Y t ,θ , θ ) < u(X t ,θ , θ ).

While the three properties form natural desiderata for a fair allocation, the power of this definition lies in that asking for them to hold simultaneously rules out many natural (but unfair) allocation policies. In particular, allocation rules based on maximizing a global function such as utilitarian welfare (sum of individual utilities) or egalitarian welfare (the maximin allocation, or more generally, the leximin allocation [15, 37, 39] where one maximizes the minimum utility, and subject to that the second minimum, and so on) are Pareto-efficient, but tend to violate individual envy-freeness, as they focus on global optimality rather than per-individual guarantees. A remarkable exception to this, however, is the Nash Social Welfare, whose maximization leads to an allocation that is Pareto-efficient, envy-free, and proportional, and hence fair.

Proposition 3.2 (Theorem 2.3 in [51] ). For allocation X, its Nash Social Welfare is defined as:

An allocation X that maximizes N SW (X) is Pareto-efficient, envy-free, and proportional (hence fair).

In addition to simultaneously ensuring PE, EF and Prop properties, the NSW maximizing solution can also be efficiently computed via the following convex program called the Eisenberg-Gale program [23] , obtained by taking the logarithm of the Nash Social Welfare:

Important to note in our setting is that the optimal fair allocation in hindsight, which solves Eq. (2) with a given number of individuals of each type across all rounds (N t,θ ) t∈[T ],θ∈Θ , does not depend on the round t. Indeed, any envy-free allocation can be formulated so X t,θ = X t ,θ (by setting X θ = 1 t t X t,θ ) and so we can instead consider the solution to:

where we use N θ = t∈[T ] N t,θ to denote the total number of individuals across all rounds of type θ. The fact that the optimal solution in hindsight does not depend on the round t forms the basis for our algorithm Guarded-Hope.

Recall that in our online setting the principal allocates resources across each round in a fixed order t = 1, . . . , T , whereupon at round t the principal sees (N t,θ ) θ∈Θ and decides on an allocation before continuing to the next round. A natural (albeit naive) approach in this setting could be to try and obtain allocations which satisfy Pareto-efficiency and envy-freeness on all sample paths. However, such an approach is not feasible even in the simplest online setting, as the optimal solution in hindsight is often a unique function of the realized number of individuals across each rounds. Proof. Let F 2 ∼ 1 + Bernoulli(p) with p ∈ (0, 1). For any value of N 1 with probability p the optimal solution is X opt = B/(N 1 + 1), else X opt = B/(N 1 + 2). As any algorithm must decide how much to allocate at round t = 1 without knowledge of N 2 , no algorithm can match the ex-post fair solution almost surely. Proposition 3.3 shows that trying to simultaneously achieve ex-post envy-freeness and Paretoefficiency is futile, and hence we need to consider approximate fairness notions. To this end, we define counterfactual envy, hindsight envy, and efficiency. Definition 3.4 (Counterfactual Envy, Hindsight Envy, and Efficiency). Given individuals with types Θ, sizes (N t,θ ) t∈[T ],θ∈Θ , and resource budgets (B k ) k∈[K] , for any online allocation (X alg t,θ ) t∈[T ],θ∈Θ ∈ R k , we define:

• Counterfactual Envy: The counterfactual distance of X alg to envy-freeness as

where X opt is the optimal fair allocation in hindsight, i.e. the solution to Eq.

• Hindsight Envy: The hindsight distance of X alg to envy-freeness as

• Efficiency: The distance to efficiency as

Our algorithm also provides ex-post guarantees on hindsight proportionality [49] , defined via ∆ prop max t,θ u B t,θ N t,θ , θ − u(X alg t,θ , θ). These approximate fairness definitions are motivated by the problems faced by the FBST. Hindsight envy measures the algorithm's ability to ensure individuals are not envious of the allocations given to any other. While this might serve as a natural first step at a definition, an algorithm achieving low hindsight envy does not necessarily imply that individuals are eager to participate. In particular, the algorithm which allocates X t,θ = 0 for all t and θ trivially achieves hindsight envy of zero. The food bank is instead concerned with ensuring allocations are close to what they should have been given based on the observed information along the trajectory. Our measure of counterfactual envy addresses that, penalizing allocation algorithms based on how close they were at addressing individual's utility versus the optimal solution in hindsight. In fact, this metric has been considered in the literature in a competitive ratio instead of additive sense [28, 27] . Lastly, efficiency is a natural yardstick for measuring an algorithm in order to ensure all of the resources which can be utilized are used.

Motivated by this discussion, we primarily consider the trade-off between (∆ EF , ∆ efficiency ). However, we also include the discussion on (Envy, ∆ efficiency ) for completeness. We also note that all of these metrics are much stronger than the existing metrics in the literature because we provide hindsight guarantees that hold with high probability with respect to the distribution as opposed to weaker ex-ante guarantees that only hold in expectation. Moreover, most approaches in the literature focus on defining a single optimization problem with a specified objective embodying 'fairness' that attempt to capture desired goals. This is fundamentally flawed as the definition of the objective or choice of metric itself biases the outcomes towards a particular point along the tradeoff curve between different criteria. The important challenge in this setting then is considering meaningful trade-offs between these metrics in the online setting (see Fig. 1 ) and designing algorithms which achieve any point along the tradeoff curve.

As highlighted earlier, the definition of counterfactual envy and efficiency are related. By using the fact that the optimal solution in hindsight X opt is efficient, we can naively bound ∆ efficiency using ∆ EF ,

This naive bound is loose, with unnecessary dependence on the number of locations T (see the Counterfactual Envy-Efficiency Uncertainty Principle in Section 4).

In this section we show (1), (2a), and (2b) from Informal Theorem 1 concerning a lower bound on the achievable ∆ EF , and the relationship between ∆ EF , Envy, and ∆ efficiency due to the envyefficiency uncertainty principle. In all of these proofs we consider the case of a single resource, single type, and assume that u(X, θ) = X for brevity and clarity in the presentation. However, the proofs extend directly to multiple resources where one considers the setting with |Θ| = K and each type θ desires a unique resource. We begin with (1), the statistical uncertainty principle on the optimal fair allocation in hindsight, showing that no online algorithm is able to achieve counterfactual envy smaller than order 1/ √ T . This arises due to the uncertainty in the number of individuals arriving in the future, forcing the algorithm to make a non-trivial decision on the allocation made to to individuals in the first round. where C is an absolute constant. Then with probability at least α any online algorithm must incur

Proof. We use the generalized Berry-Esseen theorem [14] . Recall that for all t, Var

and let Φ be the CDF of a standard normal. Using Berry-Esseen it holds that for an absolute constant C, for all z ∈ R,

Taking z = −y and using the lower bound we have that with probability at least Φ(−y) − Cρ

.

Taking z = y and using the upper bound we have that with probability at least Φ(−y) − Cρ

.

Note that these intervals are non-overlapping for y > 0. As the algorithm must decide on a value X alg 1 to allocate for the first round then with probability at least Φ(−y) − Cρ

.

which is positive then we get with probability at least α that

We next show the first part of the Envy-Efficiency Uncertainty principle (2a), highlighting that any online algorithm which achieves a factor of L T on counterfactual envy necessarily suffers efficiency of at least 1/L T . This result follows from the statistical uncertainty in the number of individuals arriving in the final L −2 T rounds, and the fact that ensuring a bounded envy requires any online algorithm to save enough budget to allocate a minimum allocation to all future arriving individuals. 

Proof. In order for the algorithm to guarantee that ∆ EF = X alg − X opt ∞ ≤ L T with probability at least 1 − α it must limit all allocations made to the interval [ B N − L T , B N + L t ] as X opt = B N . Moreover, a straightforward application of Hoeffding's inequality shows that |N − E[N ]| ≤c √ T with probability 1 − α wherec = 2ρ 2 max log(T /α). Using this, algebraic manipulations, and simplifying, one can show that the following event

occurs with probability at least 1−2α. We interpret these lower and upper thresholds on allocations made by the algorithm as guardrails.

Recall that we use the notation B alg t to denote the budget remaining for the algorithm at the start of round t. We begin by defining three events for a fixed round t ≤ T and constant

By Berry-Esseen Theorem [14] we know that P

First consider the event ¬C ∩ B. We show that ¬C ∩ B implies ¬D (or equivalently by taking the contrapositive that D implies that B implies C) which gives us that P(¬C ∩ B) ≤ P(¬D). These two conditions (C and B) dictates that the algorithm must have a lot of budget by allocating within the guardrails based on the number of individuals arriving in the future being small. Indeed, under events D and B we have that

Moreover, due to the fact that the algorithm is non-anticipatory, we know that the events C and B are independent. Thus we have that P(¬C ∪ B) = P(¬C)P(B). Using the bound on P(B) and the fact that P(¬C ∪ B) ≤ P(¬D) ≤ 2α we get that P(¬C) ≤ 2 3 . Now we consider the event C ∩ A ∩ D. Using that the allocations must be bounded by event D we have that the waste will be at least:

The inequality follows from lower bounding B alg t with the amount required to be reserved up to location t (i.e. event C), and upper bounding the maximum amount of budget that can be expended for locations i ≥ t when N ≥t ≤ E[N ≥t ] (i.e. event A).

Recall that E[N ≥t ] = (T − t + 1)μ ≥t , so that while the first term increases with (T − t + 1), the second term decreases with (T − t + 1). Solving for the maximum value in terms of t yields

The probability of this event is lower bounded by P(C∩A∩D) ≥ P(C∩A)−P(¬D)

Plugging in the value of z and simplifying terms yields the final result.

Lastly we show the second part of the Envy-Efficiency Uncertainty principle (2b), highlighting that any online algorithm which achieves a factor of L T on hindsight envy necessarily suffers efficiency of at least min{ √ T , 1 L T }. This result follows from the previous lower bound (2a), combined with an almost sure relationship between ∆ EF and Envy. We start with this brief lemma relating the two notions of envy. 

Proof. The upper bound follows immediately from applying the triangle inequality around X opt = Proof. Suppose the online algorithm achieves Envy ≤ L T with probability at least 1 − α. However, using Lemma 4.3 we get that ∆ EF − 1 N ∆ efficiency ≤ Envy ≤ L T . Hence we have that ∆ EF ≤ L T + 1 N ∆ efficiency with probability at least 1 − α. Denote byC as the terms on the right hand side of Theorem 4.2 and applying the result there for the case the 1/L T term attains the minimum we get that ∆ efficiency ≥C

Rearranging the inequality gives us that ∆ efficiency ≥ 

The lower bounds presented in Section 4 highlight a key facet of algorithm design in this setting: generating lower and upper guardrail allocations. Suppose we were able to construct envy-free allocations (X θ ) θ∈Θ and (X θ ) θ∈Θ such that X θ − X θ ∞ L T for a given parameter L T . If the algorithm was able to ensure that all of the allocations made to individuals of type θ are within [X θ , X θ ], it is not difficult to show that Envy L T . However, if we additionally desire a bound of ∆ EF L T , the same philosophy requires that we are able to establish that with high probability:

Motivated by these two use-cases, we turn our attention to sensitivity and concentration properties on solutions to the Eisenberg-Gale program. Unfortunately, the true Eisenberg-Gale program for the counterfactual optimal fair allocation depends on the unknown vector of number of individuals of each type (N t,θ ) t∈[T ],θ∈Θ . As such, our algorithms are motivated by solving information relaxed versions of the Eisenberg-Gale program, appealing to sensitivity and concentration on the optimizers of the program, instead of the objective value as is typically done in competitive ratio analysis. For the time being, we assume that we are given concentration inequalities of the form: with probability at least 1 − δ we have that for every t and θ, |E[N >t,θ ] − N >t,θ | ≤ Conf t,θ . As this concentration only depends on the assumptions on the variables N t,θ , we include a simple form of Conf t,θ scaling as √ T − t using Hoeffding's inequality in Lemma 5.3, but see Section 7 for extensions. Note that while the concentration on X opt only needs Conf t,θ for t = 0, by allocating according to X θ in all rounds, one can ensure that the algorithm does not run out of budget with probability at least 1 − δ.

Consider the Eisenberg-Gale program from Section 3 with multiple types θ and K resources as specified in Eq. (3) . Recall that the dual variables corresponding to the budget feasibility constraint p k , can be thought of as prices for the corresponding resources [43] . We start with a lemma showing properties of the optimal solution to the Eisenberg-Gale program with various number of individuals of each type vectors (N θ ) θ∈Θ . . Let x((N θ ) θ∈Θ ) and p((N θ ) θ∈Θ ) denote the optimal primal and dual solution to the Eisenberg-Gale program (Eq. (3)) for a given vector of individuals of each type (N θ ) θ∈Θ . Then we have that:

1. Scaling: IfÑ θ = (1 + ζ)N θ for every θ ∈ Θ and ζ ≥ 0 then we have that:

2. Monotonicity: If N θ ≤Ñ θ for every θ ∈ Θ then we have

These Lipschitz properties follow from the Fisher market interpretation of the Eisenberg-Gale optimum, which corresponds to market-clearing allocations in a setting with |Θ| agents, each with an endowment or budget of N θ . Note that these are all sensitivity properties in terms of the optimal solution instead of the objective value, as is typically done for competitive ratio guarantees. See Appendix E for the full proof.

Recall that our goal is to construct lower and upper threshold allocations about X opt = x((N θ ) θ∈Θ ) where N θ = t N t,θ is the true (random) number of individuals of type θ arriving over all rounds. First suppose we were able to construct n θ ≥ N θ for all θ and set n θ = (1+γ)n θ for some constant γ chosen based on L T . Setting the guardrails as X θ = x((n θ ) θ∈Θ ) and X θ = x((n θ ) θ∈Θ ) will be envy free (by Proposition 3.2), and for the correct value of γ satisfy the bounds needed to ensure Envy L T (by (1) of Lemma 5.1). However, for a large enough γ (or equivalently a large enough L T ) we can additionally ensure that n θ ≤ N θ and appeal to the monotonicity property (2) of Lemma 5.1 to ensure Eq. (4). These assumption includes not only i.i.d. demands, but also demand distributions that arise from Markov chains, latent variable models, or known cumulative sums, from different Chernoff style arguments (see Section 7 for more details). The following lemma shows the final construction of our guardrails by appropriately choosing n θ and n θ and appealing to Lemma 5.1.

. Let X opt = x((N θ ) θ∈Θ ) denote the optimal solution to the Eisenberg-Gale program for a given vector of individuals of each type (N θ ) θ∈Θ . Further suppose that with probability at least 1 − δ we have for all θ ∈ Θ: |N θ − E[N θ ]| ≤ Conf θ . Given any L T ≥ 0 and setting:

then almost surely we have that:

then with probability at least 1 − δ we have:

, and so n θ = 1+γ 1−c n θ . Using Lemma 5.1 and Lemma E.2 (from the Appendix) it follows that the difference in utilities for the guardrails and any θ is bounded by:

Moreover, again using Lemma 5.1 we can bound the difference in allocations by

The upper bound is similar where we use x((n θ ) θ∈θ ) ∞ ≤ B ∞ .

Lastly we further suppose that L T ≥ 2

First notice that via our concentration guarantees:

By construction then we have that n θ ≥ N θ . Simple algebraic manipulations and the assumption on L T shows that c ≥ max θ

and so n θ ≤ N θ as well. Moreover from the monotonicity property in Lemma 5.1 we have for all θ ∈ Θ, u(x((n θ ) θ∈Θ ) θ , θ) ≤ u(X opt θ , θ) ≤ u(x((n θ ) θ∈θ ) θ , θ). Next we give an example of the desired form of the concentration inequalities to obtain γ. For almost surely bounded demands, under the assumptions outlined in Section 3, a simple application of Hoeffding's inequality gives: Lemma 5.3. With probability at least 1 − δ, for every θ ∈ Θ and t ∈ [T ] we have that |N >t,θ − E[N >t,θ ]| ≤ Conf t,θ where Conf t,θ = 2(T − t + 1)ρ 2 max log(T |Θ|/δ). Proof. Let t and θ be fixed. By assumption we know that N t,θ ∈ [E[N t,θ ] − ρ t,θ , E[N t,θ ] + ρ t,θ ]. From a simple application of Hoeffding's inequality:

Setting the right hand side equal to δ, and relabeling δ to δ/T |Θ| via a union bound shows the result.

Using this form of Conf θ and Theorem 5.2, we notice that this construction ensures that we are able to guarantee a bound of L T on the difference in utilities for any 

Here we define our algorithm Guarded-Hope. The algorithm takes as input a budget B, expected number of each type (E[N θ ]) θ∈Θ , confidence terms (Conf t,θ ) θ∈Θ , and a desired bound L T on the ∆ EF and Envy. Assuming the lower and upper threshold allocations are constructed such that we can guarantee the results from Theorem 5.2 our algorithm is able to achieve any envy-efficiency tradeoff as developed in Theorem 4.2. The algorithm relies on two main components, both of which we believe to be necessary in developing an algorithm to achieve the envy-efficiency uncertainty principle (as removing any one of them leads to breakdowns -as will be discussed in the Section 7). We start by describing the high level ideas needed in the algorithm before describing the pseudocode (with full algorithm description in Algorithm 1). The proof that Guarded-Hope achieves the desired bounds will be deferred to Section 7.

As a result of Theorem 4.1 we saw that no online algorithm can guarantee ∆ EF 1 √ T . Moreover, the proof highlighted that any algorithm which satisfies a bound on ∆ EF or Envy L T must limit allocations based on guardrails with high probability. As such, our algorithm uses the construction from Section 5 to obtain estimates X = x((n θ ) θ∈Θ ) and X = x((n θ ) θ∈Θ ) which are both envy-free and satisfy that max t,θ |u(X t,θ , θ) − u(X t,θ , θ)| ≤ L T . If in addition L T 1/ √ T we have that u(X θ , θ) ≤ u(X opt θ , θ) ≤ u(X θ , θ). The allocations X θ and X θ are used by the algorithm as guardrails, where all allocations made by the algorithm for a type θ are forced to fall within {X θ , X θ }. With this requirement, on sample paths where we do not run out of budget, then we trivially have an upper bound on Envy L T and ∆ EF ≤ max{1/ √ T , L T }. Thus 'accepting' the first round loss in envy-freeness allows us to limit all future allocations to the guardrails generated by that uncertainty.

Once the guardrails X θ and X θ are found to verify a bound on the approximate envy up to a factor of L T , we change the focus to instead try and minimize the loss of efficiency. Thanks to our guardrails, we develop the algorithm to match a clairvoyant benchmark policy which minimizes the resource waste with the knowledge of (N t,θ ) t∈[T ],θ∈Θ while simultaneously limiting the allocations to lie between {X θ , X θ }. This can be thought of as solving an online stochastic packing problem (whose objective is to maximize efficiency, or minimize waste) with the addition of guardrail constraints (i.e. our minimum and maximum allocation constraints). In this setting, after writing down the stochastic packing optimization program, a competitive algorithm arises naturally by ensuring that the budget remaining for the algorithm is enough to satisfy a high probability bound on the resources required to allocate X to every individual arriving in the future. This idea is formalized in Section 7, and takes motivation in recent developments on Bayesian prophet benchmarks for online bin packing problems [53] .

Let B alg t,k denote the budget remaining to the principal for resource k at iteration t, i.e. B k − t <t θ N t ,θ X t ,θ,k . Assume the algorithm is given the expected demands (E[N θ ]) θ∈Θ and confidence terms Conf t,θ such that |N >t,θ − E[N >t,θ ]| ≤ Conf t,θ with high probability. Let γ = max θ

. Given a desired bound on envy L T , the algorithm computes the guardrails by

.

where x(·) denotes the solution to Eq. (3) . Note that as long as L T log(|Θ|T /δ)/T , the utility of the optimal allocation is sandwiched by the utilities of X and X according to Section 5.

Our algorithm allocates to type θ according to these thresholds X θ and X θ in order to ensure the guarantee of ∆ EF of at most L T , while simultaneously trying to eliminate as much waste as possible. At each time t, for each resource k ∈ [K], 1. (Insufficient budget): If B alg t,k ≤ N t,θ X θ,k then divide the resources equally among all remaining individuals for this round.

3. (Insufficient budget to promise lower threshold): Otherwise set X alg t,θ,k = X θ,k for each θ ∈ Θ.

Our algorithm is easy to implement in practice -in particular, it requires solving the Eisenberg-Gale program Eq. (3) only twice to obtain X and X. In contrast, other versions of modelpredictive control and algorithms from [49] require frequent resolves of the Eisenberg-Gale. This allows Guarded-Hope to scale easily to multiple resources and larger number of types (as it only involves solving for the 'optimistic' and 'pessimistic' allocation rules which are done offline with historical data). Moreover, it allows practitioners to leverage work on poly-time algorithms for solving the Eisenberg-Gale program [21] . It also extends easily to more complex information structures (see Section 7 for a discussion).

We are now ready to show the bound on ∆ EF , Envy, and ∆ efficiency for Guarded-Hope, relying on the construction of X θ and X θ from Section 5. We note that these guarantees match the envyefficiency uncertainty principles from Section 4 up to problem-dependent constants and logarithmic terms in T . Afterwards we comment on extending the distribution assumptions on N t,θ to more robust settings.

Theorem 7.1. Given budget B, expected number of types (E[N θ ]) θ∈Θ , and confidence terms (Conf t,θ ) θ∈Θ such that with probability at least

Guarded-Hope with parameter L T is able to achieve with probability at least 1 − δ (where drops poly-logarithmic factors of T , o(1) terms, and absolute constants):

By assumption we know that P(E) ≥ 1 − δ. The following lemma shows the algorithm ensures it has enough budget to allocate according to the lower threshold for everyone arriving in the future. Recall that B alg t,k denotes the budget remaining at the start of round t for resource k. 

As a result, the algorithm is able to guarantee that at every iteration X alg t,θ,k is either X θ,k or X θ,k .

Proof. We show the first statement by induction on t. The second statement follows immediately. Base Case t = 1. Here we have that B alg 1 = B and by construction of X θ we have that:

Step Case t − 1 → t. We split into two cases based on the allocation. If X alg t−1,θ,k = X θ,k , then by the induction hypothesis

where (a) holds by the condition for allocating X θ,k , and (b) holds under event E.

The next lemma shows that the algorithm is adaptively cautious, i.e., after some point, Guarded-Hope switches to allocating according to the lower threshold. Lemma 7.3. For each resource k, let t 0,k be the last time that X alg t,θ,k = X θ,k (or else 0 if the algorithm always allocates according to X θ,k ). Then under the event E for some c =Θ(1) we have that for all k, t 0,k ≥ max{0, T − 2cL −2 T }.

Proof. We will show the result by contradiction (for the setting when T − 2cL −2 T > 0). For some resource k, assume that 0 < t 0,k < T − 2cL −2 T . By definition of t 0,k , it must be that the algorithm allocated X θ,k at time t 0,k and allocated X θ,k for all subsequent times. Given the assumption, it must be that for any t > t 0,k Using the fact that X θ,k − X θ,k ≤ L T B ∞ βavg min w min w ∞ , and plugging in an upper bound for the demand at time t 0,k , a lower bound on the expected demand, and the confidence terms from Lemma 5.3, the right hand side of the above inequality can be bounded above by

Moreover, the left hand side can be bounded below under event E via

Plugging in the value of t = T − cL −2 T and by assumption that t 0,k < T − 2cL −2 T , it follows that for

Relabeling x = √ c to show a contradiction we need to find a value of x such that

Noting that the cusp of the quadratic is at a 2 a 1 we see that taking the constant

(independent of L T ) suffices to show the contradiction (⇒⇐).

The main result follows from the above two lemmas. We start by giving the upper bound on ∆ efficiency . Following from Lemma 7.3, which states that if t 0,k denotes the last time that X alg t,θ,k = X θ,k , then for all k, t 0,k ≥ max{0, T − 2cL −2 T } with high probability. This implies that for all k,

where (a) follows from the fact that by the definition of t 0,k the algorithm allocated the lower allocation at time t 0,k and the upper allocation for all t > t 0,k , and (b) follows from the condition in the algorithm for allocating the lower allocation at time t 0,k , which upper bounds B alg t 0,k ,k .

However, under E we know that E N >t 0,k ,θ − N >t 0,k ,θ ≤ 2Conf t 0,k ,θ . Plugging in the definition of Conf t 0,k ,θ and the bound on (X θ,k − X θ,k ) from Theorem 5.2 we have:

Taking this and plugging in the value of t 0,k we get that:

Note that L 2 T = o(1) such that the second term is dominated by the first. Next we show the desired bound on Envy. Consider an arbitrary t, θ, t , θ . Then we have that:

Where in (a) we used that X is envy free we know the second pair is bounded above by zero, (b) we used the definition of the utilities, and (c) the fact that the algorithm allocates according to guardrails, and (d) the bound in Theorem 5.2. Taking max over t, t , θ, θ gives the result.

Next we show the bound on ∆ EF . First consider the setting when L T ≥ 2 w 2 ∞ w min βavg min 2ρ 2 max log(T |Θ|/δ) T such that we satisfy properties four and five of Theorem 5.2. Using that the algorithm always allocates according to X θ or X θ with probability at least 1 − δ, we get:

However, even for the case that L T < 2

then for any θ we have either u(X θ , θ) ≤ u(X opt θ , θ) ≤ u(X θ , θ) as in property five, or u(X θ , θ) ≤ u(X θ , θ) ≤ u(X opt θ , θ). If the first case holds then

Otherwise then we can considerX θ to be the upper guardrail solution via the construction from Section 5 with L T = 2 Lastly we show the bound on ∆ prop . Recall that Guarded-Hope satisfies that ∆ EF max{1/ √ T , L T }. However, by definition of ∆ EF this ensures that for any round t and type θ that |u(X alg t,θ , θ) − u(X opt θ , θ)| max{1/ √ T , L T }. Using this and the fact that X opt is proportional we see that

Taking the max over t and θ gives the desired bound on ∆ prop .

Randomization and the Convex Envelope: As stated in Theorems 4.2, 4.4 and 7.1, the given upper and lower bounds are precise up to o(1) factors. Recall that the performance guarantee on Guarded-Hope with parameter L T is ∆ efficiency min{ √ T , 1/L T }. We can improve the performance of the algorithm for values of L T ∈ [0, 1/ √ T ] by randomization. Indeed, let π L T denote the Guarded-Hope allocation policy with parameter L T . Consider π(α) to be the allocation policy which picks π 0 with probability (1 − α) and π 1/ √ T with probability α, playing that policy once chosen across all rounds. It is easy to see that this policy achieves:

This improves the performance on ∆ efficiency for values of L T smaller than 1/ √ T . Generalizing Distributional Assumptions: Theorem 7.1 considers the setting when N t,θ ∼ F t is independent across θ, and a time-dependent process with bounded mean absolute deviation and finite variance. However, it is simple to see that the proofs only require a bound on the following event:

with the scaling of Conf t,θ being on the order of √ T − t. The main modification to Guarded-Hope is to condition on the observed sequence of N ≤t,θ thus far, particular in step two of the algorithm:

2. (Sufficient budget to promise lower threshold): If B alg t,k ≥ X θ,k N t,θ +X θ,k (E[N >t,θ | N ≤t,θ ]+ Conf t,θ (N ≤t,θ )) then set X alg t,θ,k = X θ,k ∀θ ∈ Θ

The concentration arguments and scaling of Conf t,θ are used in three different sections:

1. Construction of the lower and upper guardrails (X θ and X θ ). This uses concentration of E[N θ ] which is given by Conf 0,θ .

2. Ensuring the algorithm doesn't run out of budget by saving resources to allocate X θ to every individual (as in Lemma 7.2). This utilizes the confidence intervals on N >t,θ which would be given by Conf t,θ (N ≤t,θ ).

3. Construction of the time point after which Guarded-Hope switches to allocating to the lower threshold (as in Lemma 7.3) uses the scaling of Conf t,θ as √ T − t.

Each of these steps would be given by t,θ E t,θ and so our approach works under distributional assumptions which yield Chernoff bounds on each event E t,θ :

• N t,θ ∼ F t is a time-dependent process where each N t,θ has bounded mean absolute deviation and finite variance (as in Lemma 5.3). • N t,θ ∼ F t is a time-dependent process where each N t,θ is sub-Gaussian.

• N t,θ are conditionally independent and sub-Gaussian given a latent variable Z. This model naturally encompasses dependence on the weather, other local events, etc.

• N θ = t N t,θ is known for each θ. In this setting the algorithm can simply take

• Each N t,θ evolves independently across θ according to different ergodic Markov Chains. Concentration bounds for these processes can be constructed using recent work on Chernoff-Hoeffding bounds for Markov Chains (Theorem 3.1 in [19] ).

Here we complement the theoretical developments on the performance of Guarded-Hope with an empirical study motivated by the challenges faced by the Food Bank of the Southern Tier when operating their mobile food pantry program. We first describe the synthetic and datadriven experiments conducted and which aspects of the problem faced by the FBST they model, before later comparing the effectiveness of Guarded-Hope (with various choices of L T ) to other algorithms on our established metrics (Definition 3.4) . These experiments serve as a proof of concept of the performance for the algorithms. All of the code for the experiments is available at https://github.com/seanrsinclair/Online-Resource-Allocation.

Our experiments are motivated by the problems faced by the Food Bank of the Southern Tier (FBST) on operating their mobile food pantry program. Due to recent increase in demands for food assistance from the COVID-19 epidemic in the United States [36] and sanctions on operating in-person stores, many food banks have increased their mobile food pantry services to meet the increased demand. In these systems, the mobile food pantry must decide on how much food to allocate to a distribution center on arrival, without knowledge of demands in future locations.

To be more specific, the FBST operates out of seventy different distribution locations (see Fig. 2 for a map) across the Southern Tier of New York State. Due to its relatively remote location, throughout the year they receive infrequent shipments of a large amount of resources (hence necessitating modeling a large T for the number of rounds before a new shipment arrives). While they receive more frequent donations from individuals, the time between larger consistent donations can span up to two or three weeks. A certain amount of these resources are dedicated to the mobile food pantry, and used to service the seventy different distribution locations until the arrival of the next shipment. Moreover, the schedule to which the mobile food pantry visits the different distribution locations is fixed and known in advance (and is generated primarily due to time constraints in the locations used for the distribution locations).

We interpret the total number of rounds T as the number of distribution locations the mobile food pantry will visit until the arrival of the next shipment of food resources. Important to note, is that in our simulation experiments we do not consider the additional challenges faced when trying to design a schedule (i.e. an order to visit the distribution locations) so as to achieve further notions of fairness and equity. Each round t then corresponds to a visit to a specific distribution location (for example the Tompkins County Community College in Dryden, NY) where the food pantry decides on an allocation of resources to allocate to the (random) number of individuals congregating at that location.

As such, we conduct four experiments, named Single-Synthetic, Single-FBST, Multi-Synthetic and Multi-FBST where the distributions F t dictating the number of people of each type at a distribution location, and preferences w θ ∈ R K are chosen as follows:

• Single-Synthetic: In this experiment we consider a single resource and single type, where the preference w = 1 (much like the simulation described in the lower bounds presented in Section 4) results in u(X, θ) = X. We pick the arrival distribution F t ∼ 1 + Poisson(1.5).

• Single-FBST: Similar to Single-Synthetic we consider a single resource and single type, with preference w = 1. In each experiment we first pick a random collection of T locations taken from the 2019 historical dataset collected by the FBST and set F t ∼ Normal(µ t , σ 2 t ) where µ t and σ 2 t are the mean and variance of the demand at the t'th distribution location.

• Multi-Synthetic: In this experiment we consider the setting of five types and three resources. The weights w θ for the different types θ are listed in Table 3 . The arrival distribution is chosen to be F t ∼ 1 + Poisson(1.5, 2.5, 3.5, 4.5, 5.5).

• Multi-FBST: Here we consider the setting of five resources (corresponding to cereal, pasta, prepared meals, rice, and meat) and three types (corresponding to vegetarians, carnivores, and "prepared-food only" individuals). Similar to Single-FBST we first pick a random collection of T locations from the 2019 FBST data, and set F t ∼ D θ Normal(µ t , σ 2 t ) where µ t and σ 2 t are the mean and variance of the demand of the t'th distribution location, and D θ = [. 25, .3, .45] corresponds to the fraction of the population with each of those preferences. The weights w θ for the products are chosen from the historical prices used in the market mechanism to distribute food resources to food pantries across the United States [45] (see Table 2 for the full table of weights).

In the FBST style simulations we use a provided dataset which only includes mean and variance instead of per-visit information. We hope that these experiments help highlight the applicability of the algorithm and to serve as a 'proof of concept' of the algorithm with realistic parameters.

For each of the experiments, we compare the performance of Guarded-Hope with L T = T −1/2 and T −1/3 to a Fixed-Threshold algorithm which always allocates according to X until running out of resources (also can be interpreted as Guarded-Hope with L T = 0). In Guarded-Hope we use the natural confidence bounds garnered via Hoeffding or Bernstein inequalities. We also compare against Hope-Full and Hope-Online from [49] , which solve the Eisenberg Gale program (Eq. (3)) at every round t using either the current remaining budget, and future sizes replaced with their expectation (for Hope-Online) or the initial budget with past sizes as their realizations and future sizes replaced with their expectations (for Hope-Full).

We believe that these experiments both help highlight the theoretical performance of Guarded-Hope, while simultaneously serving as an example of the practicality of our algorithm for broader stockpile allocation problems.

For the setups described above, we numerically evaluate the performance of various policies. In each simulation we set the total budget B to be t,θ E[N t,θ ] so that the scaling of β avg remains as a constant as we vary the number of rounds T . We compare the algorithms in terms of • ∆ EF = u(X opt t,θ , θ)−u(X alg t,θ , θ) ∞ , the maximum difference between utility individuals receive for the two allocations as we scale the number of rounds T (Definition 3.4)

• ∆ efficiency = k B k − θ,t N t,θ X alg t,θ,k , the leftover resources as we scale the number of rounds T (Definition 3.4)

• ∆ + EF = max t,θ E |u(X opt t,θ , θ) − u(X alg t,θ , θ)| , the ex-ante maximum difference between utility individuals receive for the two allocations as we scale the number of rounds T • Envy = max t,θ,t ,θ u(X alg t ,θ , θ) − u(X alg t,θ , θ), the maximum hindsight envy between any two agents (Definition 3.4)

• ∆ prop = max t,θ u(B/N, θ) − u(X alg t,θ , θ), the maximum hindsight envy between an agent and equal allocation (Definition 3.4)

, the Nash Social Welfare for the resulting allocation from the algorithm (Eq. (1)).

For each of the simulations conducted we include three plots (as seen in Figs. 3 and 4 ). The first of which highlights the measure of ∆ EF , ∆ efficiency and ∆ + EF as we vary the number of rounds T . The second and third fixes a value of T and compares the algorithms on each of the six metrics, where the values are normalized and larger scores corresponds to better performance. We make three key observations:

1. Fixed-Threshold vs Guarded-Hope: In Fig. 3 we see that our Guarded-Hope algorithms (for varying values of L T ) are able to outperform Fixed-Threshold in terms of efficiency, as our algorithms greedily allocate the upper threshold while ensuring budget compliance.

2. Guarded-Hope for varying L T : In Fig. 3 we see that Guarded-Hope for larger values of L T achieves better efficiency performance, but worse performance in terms of ∆ EF (and illustrating Theorem 7.1).

3. Guarded-Hope vs algorithms from [49] : While Hope-Full and Hope-Online improve upon Guarded-Hope in terms of ∆ efficiency (as both algorithms use all available resources at the last round), we see that Guarded-Hope is able to achieve a more nuanced tradeoff between the other metrics. In each of the simulations we observe that Guarded-Hope is able to outperform a Fixed Threshold algorithm in terms of ∆ efficiency , while Guarded-Hope simultaneously achieves various trade-offs between ∆ EF and ∆ efficiency as we vary L T . Fig. 4 we compare each of the algorithms on several different metrics. Here we see that Guarded-Hope performs competitively with respect to Fixed-Threshold on the metrics of interest (in particular, ∆ EF and ∆ efficiency ), with the ability to tune the performance by varying the parameter L T .

These simulations help illustrate the theoretical performance of Guarded-Hope (as outlined in Theorem 7.1) and the Envy-Efficiency Uncertainty Principle (as in Theorem 4.2). Moreover, they help illustrate the truly multi-objective landscape of determining a fair allocation algorithm, and the benefits of varying Guarded-Hope to tune L T and achieve a desired performance across all of the benchmarks.

In this paper we considered the problem of dividing limited resources to individuals arriving over T rounds, where each round can be thought of a distribution location. In the offline setting (where the number of individuals arriving to each location is known), achieving a fair allocation scheme is found by maximizing the Nash Social Welfare objective subject to budget constraints. However, in online settings, no online algorithm can achieve fairness properties ex-post. We instead consider the objective of minimizing ∆ EF (the maximum difference between the utility individuals receive from the allocation made by the algorithm and the counterfactual optimal fair allocation in hindsight), Envy (the maximum difference between the utility individuals receive from the allocation made by the algorithm and the allocation given to a different individual), and ∆ efficiency (the additive excess of resources).

However, we show that this objective leads to a the envy-efficiency uncertainty principle, an exact characterization between the achievable (∆ EF , Envy, ∆ efficiency ) pairs. In particular, our result shows that any algorithm must necessarily suffer ∆ efficiency min{ √ T , 1 ∆ EF }. With this analysis, we show that it leads to a simple algorithm, Guarded-Hope, which is obtained by solving the Eisenberg-Gale program with unknown quantities replaced with their expectation to generate guardrails used in the allocation, combined with an adaptive algorithm aimed at minimizing waste. Through experiments we showed that Guarded-Hope is able to obtain allocations which achieve any fairness -efficiency tradeoff, with desirable fairness properties compared to several benchmarks and prior work.

Several open questions remain, including extending the analysis to more general utility functions (including homothetic, another common model of preferences over resources). We also believe much of the theoretical results apply to settings where the budget B is instead a stochastic process, accounting for external donations and depletions of the resources independent of the allocations made by the algorithm. Moreover, we leave the question of matching the upper and lower bounds in terms of problem dependent constants, and the issue of determining the schedule to visit locations as future work.

A Specification of an individual's type in Θ u(x, θ) : R k → R + Utility function for individuals of type θ, here assumed to be x, w θ N t,θ

Number of individuals of type θ in round t F t Known distribution over

The respective maximum and minimum value of each quantity ρ max , µ min , µ max

The respective maximum and minimum value of each quantity

Optimal fair allocation in hindsight, i.e. solves Eq. (3) and allocation by algorithm 

In this section we discuss some of the limitations of our definitions as well as comparisons to competitive ratio guarantees. We note that these results have been presented at various workshops [49] .

Economists, computer scientists, and people in the operations research literature have become increasingly interested in questions of fairness [50, 51, 52] . One particular concept of fairness which has gained wide circulation due to its ease in computability is the so-called notion of 'Varian fairness' pioneered by Hal Varian in the 1970s taken in Definition 3.1. This theory of fairness provides three criteria for judging a given allocation of resources: envy-freeness, Pareto-efficiency, and proportionality, all defined with respect to utilities an individual has for different allocations. These criteria serve as more of a classification than an optimization perspective, as each of them merely provides a true/false criteria for an allocation to satisfy fairness rather than a way an allocation can approach fairness. Numerous other researchers have proposed other definitions of fairness, including α-fairness obtained by maximizing [42, 6] :

In this definition, taking α = 1 recovers utilitarian welfare, or maximizing the sum of utilities. Taking α → −∞ also recovers the leximin objective, and α → ∞ the leximax objective.

One primary critique of 'Varian fairness' is that a Varian fair allocation may not exist at all. Moreover, the implication is that if a Varian-fair allocation exists, then it has special merit. While we specifically consider settings where a 'Varian fair' allocation always exists (and is remarkably found as a result of optimizing the Nash Social Welfare objective), it is important to consider some of the several downsides of this model.

Paramount to Varian's definition of proportionality and envyfreeness is that each individual is treated symmetrically. This ignores systemic factors that inhibit particular individual's access to the resource.

Scale Invariance: The concept of fairness is strictly operational, in the sense that it requires no more information than what is contained in an individual's utility function. Settings like matching students to local schools via school choice require definitions which measure the 'utility of replacement' [1] . As an example, a student with preferences (School A, School B, School C) in descending order gets matched to School B. How can we measure the overall gain to society when the student is instead matched to School A, or School C in comparison to another students list of preferences? In Varian's definition of fairness, utility functions are only used to exhibit an ordering on preferences, rather than a relative value on different outcomes.

We believe the settings considered in Section 1.2 are well suited to Varian's model on fairness.

In the example motivated with the Food Bank of the Southern Tier of New York, rounds correspond to distribution sites, whether that be a soup kitchen, a drop-off location for the mobile food bank, etc. In these settings, individuals have use for all resources with strictly increasing utility with respect to the resource allocated. This motivates using scale invariant measures, as every individual will be able to use all available food allocated to them. Considering processor assignment in cloud computing platforms, each individual request coming in should be treated independently and symmetrically.

One approach on obtaining fairness guarantees for an online algorithm could be in the form of a competitive ratio. These results find allocation algorithms X alg to which you can construct a bound on the competitive ratio for the Nash Social Welfare (or its logarithm): θ u(X alg θ , θ) N θ θ u(X opt θ , θ) N θ or θ N θ log(u(X alg θ , θ)) θ N θ log(u(X opt θ , θ))

.

While theoretically interesting, these results provide no immediate individual fairness guarantees. The motivation for the Eisenberg-Gale program arises from the fact that fairness is a byproduct.

To some extent, the actual objective value of an allocation is meaningless, and the objective is only taken as it serves as a proxy to obtain fair allocations. In many applications of resource allocation, stakeholders are more interested in obtaining individualized guarantees than global guarantees on social welfare. This motivated our alternative approach of designing algorithms with individualized guarantees in mind. In fact, our guarantees are related to the fairness ratios developed in [28, 20, 27] , presented additively instead of multiplicatively.

Metrics Included: In the line plots we include three plots of the following:

• E[∆ EF ] = E u(X opt t,θ , θ) − u(X alg t,θ , θ) ∞ , the expected maximum difference between utility individuals receive for the two allocations as we scale the number of rounds T (Definition 3.4)

• E[∆ efficiency ] = E k B k − θ,t N t,θ X alg t,θ,k , the expected leftover resources as we scale the number of rounds T (Definition 3.4) Table 2 : Weights w k for the different products considered in the Multi-FBST experiments. Here we use the weights taken from the historical prices used in the market mechanism to distribute food resources to food pantries across the United States [45] . • ∆ + EF = max t,θ E |u(X opt t,θ , θ) − u(X alg t,θ , θ)| , the ex-ante maximum difference between utility individuals receive for the two allocations as we scale the number of rounds T In the radar plots we include:

• E[∆ EF ] = E u(X opt t,θ , θ) − u(X alg t,θ , θ) ∞ , the expected maximum difference between utility individuals receive for the two allocations (Definition 3.4)

• E[∆ efficiency ] = E k B k − θ,t N t,θ X alg t,θ,k , the expected leftover resources (Definition 3.4)

• E[Envy], the expected maximum envy between any two agents (Definition 3.4)

• E[∆ prop ], the expected maximum envy between an agent and equal allocation (Definition 3.4)

• E N SW (X alg ) , the expected Nash Social Welfare of the allocations made by the algorithm (Eq. (1)) Experiment Setup: Each experiment was run with 200 iterations where the relevant plots are taking the mean of the related quantities. In all experiments the budget B = t,θ E[N t,θ ] so that β avg scales as a constant as we vary the number of rounds T . All randomness is dictated by a seed set at the start of each simulation for verifying results.

Empirical Trade-off Curves: In Fig. 5 we show empirical trade-off curves for the performance of Guarded-Hope for various values of L T . These plots serve to highlight an empirical version of Fig. 1 highlighting the empirical uncertainty principles exhibited by the algorithm.

Computing Infrastructure: The experiments were conducted on a personal computer with an AMD Ryzen 5 3600 6-Core 3.60 GHz processor and 16.0GB of RAM. No GPUs were harmed in these experiments. 

We start by giving a full proof for the Lipschitz properties of the optimal solution to the Eisenberg Gale program.

Lemma E.1 (Sensitivity of solutions to the Eisenberg-Gale Program (Lemma 5.1 of main paper)). Let x((N θ ) θ∈Θ ), p((N θ ) θ∈Θ ) denote the optimal primal and dual solution to the Eisenberg-Gale program (Eq. (3)) for a given vector of individuals of each type (N θ ) θ∈Θ . Then we have that:

1. Scaling: IfÑ θ = (1 + ζ)N θ for every θ ∈ Θ and ζ ≥ 0 then we have that:

x((Ñ θ ) θ∈Θ ) =

x((N θ ) θ∈Θ ) 1 + ζ p((Ñ θ ) θ∈Θ ) = (1 + ζ)p((N θ ) θ∈Θ ).

Moreover, we have that u(X((N θ ) θ∈Θ ) θ , θ) − u(X((Ñ θ ) θ∈Θ ) θ , θ) = 1 − 1 1 + ζ max k w θ,k p((N θ ) θ∈Θ ) k .

2. Monotonicity: If N θ ≤Ñ θ for every θ ∈ Θ then we have

Proof. For notational brevity we will use x = x((N θ ) θ∈Θ ), p = p((N θ ) θ∈Θ ) andx,p for the solutions corresponding to N θ andÑ θ respectively. We start off by taking the KKT conditions of the Eisenberg-Gale program. Using variables p k for the constraint that θ∈Θ N θ X θ,k ≤ B k we have the following conditions:

1. Primal Feasibility: θ∈Θ N θ X θ,k ≤ B k 2. Dual Feasibility: p k ≥ 0 3. Complementary Slackness: p k > 0 implies that θ∈Θ N θ X θ,k = B k

Roles for computing in social change

Monotone and online fair division

Online fair division: A survey

Online fair division: Analysing a food bank problem

A unified framework for efficient, effective, and fair resource allocation by food banks using an approximate dynamic programming approach

Social choice and individual values

How to allocate goods in an online market?

Control of fair division

Online nash social welfare maximization via promised utilities

Online vector balancing and geometric discrepancy

Fair resource allocation in a volatile marketplace

About 14 million children in the us are not getting enough to eat

How to make envy vanish over time

The accuracy of the gaussian approximation to the sum of independent variates

A new solution to the random assignment problem

On the fair division of a random object

An envy-free cake division protocol

Fair Division: From cake-cutting to dispute resolution

Chernoff-hoeffding bounds for markov chains: Generalized and simplified

Mechanism design for fair division: Allocating divisible items without payments

Market equilibrium via a primal-dual-type algorithm

Fairness and utilization in allocating resources with uncertain demand

Aggregation of utility functions

The humanitarian pickup and distribution problem

Fair algorithms for learning in allocation problems

Dynamic fair division with minimal disruptions

Controlled dynamic fair division

Fair online allocation of perishable goods and its application to electric vehicle charging

Scott Shenker, and Ion Stoica. Dominant resource fairness: Fair allocation of multiple resource types

Achieving a fairer future by changing the past

Optimal control of vaccination dynamics during an influenza epidemic

A social welfare optimal sequential allocation procedure

No agent left behind: Dynamic fair division of multiple resources

Inherent trade-offs in the fair determination of risk scores

never seen anything like it': Cars line up for miles at food banks

Sequential resource allocation for nonprofit operations

How feds decide on remdesivir shipments to states remains mysterious

Fair dynamic rationing. Available at SSRN 3775895

Mechanisms for online organ matching

Fairness in deceased organ matching

Fair division and collective welfare

Algorithmic Game Theory

Achieving equity, effectiveness, and efficiency in food bank operations: Strategies for feeding america with implications for global hunger relief

How food banks use markets to feed the poor

Cake cutting: not just child's play

Modeling for the equitable and effective distribution of food donations under stochastic receiving capacities

Health equity and covid-19: global perspectives

Sequential fair allocation of limited resources under stochastic demands

Is fairness good? a critique of varian's theory of fairness

Equity, envy, and efficiency

Two problems in the theory of fairness

The bayesian prophet: A low-regret framework for online decision making

Online cake cutting

Fairness versus efficiency of vaccine allocation strategies

Fairness-efficiency tradeoffs in dynamic fair division

Gradient Condition: For every θ and k

Part of this work was done while Sean Sinclair and Christina Yu were visiting the Simons Institute for the Theory of Computing for the semester on the Theory of Reinforcement Learning. We also gratefully acknowledge funding from the NSF under grants ECCS-1847393, DMS-1839346, CCF-1948256, and CNS-1955997, and the ARL under grant W911NF-17-1-0094. We would also like to thank the Food Bank of the Southern Tier and the Cornell Mathematical Contest in Modeling for their collaboration.

w θ,k p k ≤ w θ , X θ with equality whenever X θ,k > 0.Scaling: Suppose thatÑ θ = (1 + ζ)N θ for every θ ∈ Θ. Setp k = (1 + ζ)p k for each resource k ∈ [K] andx θ,k = x θ,k /(1 + ζ). Verifying the KKT conditions for the prices and solutionsp andx verifies that these are in fact the optimal primal and dual variables assuming that p and x are. To show the last property we note thatTo show monotonicity with respect to the dual prices we use the tâtonnement algorithm developed in [43] to solve the Eisenberg-Gale program. At a high level, the algorithm starts off with initial prices p 0 k for k ∈ [K] which satisfy a tight set invariant (found by representing the allocations made via a network flow graph between resources and types). The tight set invariant is defined as:The algorithm then raises the prices p 0 k until all of the types have no surplus money. As such, consider initializing the algorithm to solve for the dual pricesp with the prices p. To show monotonicity with respect for the dual prices it suffices to show that the prices p satisfy the tight set invariant when the number of individuals of each type isÑ θ . However, this is trivially true due to the optimality of the prices p (as they must satisfy the tight set invariant with respect to N θ ). Indeed, for any S ⊂ [K] we must have that: As a result we see that the prices p k satisfy the invariant and can serve as an initialization to the algorithm. As the algorithm only increases the prices until reaching optimality, it gives us that p k ≤p k for every k ∈ [K].With this result we are able to show the monotonicity guarantee with respect to the utilities as:Lemma E.2. Suppose that n θ ≤ E[N θ ] and X = x((n θ ) θ∈Θ . Then we have that:Proof. (1): For the first property we again use the tâtonnement algorithm developed in [43] . Initially in the algorithm the prices are set as p 0However, if there is a resource which no type is interested in at the price p 0 k then the prices is reduced to p 1 k = max θ∈ΘThus we have thatw min w max β avg min = w min w max β avg min .(2): This property follows as θ n θ X θ = B and Cauchy-Schwarz inequality.(3): This property follows simply as the maximum allocation for any resource k is bounded above by B k .