key: cord-0105068-lz69qhn0
authors: Sinclair, Sean R.; Jain, Gauri; Banerjee, Siddhartha; Yu, Christina Lee
title: Sequential Fair Allocation of Limited Resources under Stochastic Demands
date: 2020-11-29
journal: nan
DOI: nan
sha: a56c843a122386a1df9d6000065a35be26114299
doc_id: 105068
cord_uid: lz69qhn0

We consider the problem of dividing limited resources between a set of agents arriving sequentially with unknown (stochastic) utilities. Our goal is to find a fair allocation - one that is simultaneously Pareto-efficient and envy-free. When all utilities are known upfront, the above desiderata are simultaneously achievable (and efficiently computable) for a large class of utility functions. In a sequential setting, however, no policy can guarantee these desiderata simultaneously for all possible utility realizations. A natural online fair allocation objective is to minimize the deviation of each agent's final allocation from their fair allocation in hindsight. This translates into simultaneous guarantees for both Pareto-efficiency and envy-freeness. However, the resulting dynamic program has state-space which is exponential in the number of agents. We propose a simple policy, HopeOnline, that instead aims to `match' the ex-post fair allocation vector using the current available resources and `predicted' histogram of future utilities. We demonstrate the effectiveness of our policy compared to other heurstics on a dataset inspired by mobile food-bank allocations.

Our work here is motivated by a problem faced by a collaborating food-bank (Food Bank for the Southern Tier of New York (FBST) [21] ) in operating their mobile food pantry program. Every day, the FBST uses a truck to deliver food supplies directly to distribution sites (soup kitchens/pantries/individuals/etc.). When the truck arrives at a site, the operator observes the demand there and chooses how much to allocate before moving to the next site. The number of people assembling at each site changes from day to day, and the operator typically does not know the demand of later sites (but has a sense of the demand distribution based on previous visits). Finally, the amount of food in the truck is usually insufficient to meet the total demand, and so the operator must under-allocate at each site, while trying to be fair across all sites. The question we ask is:

What is a fair allocation here, and how can it be computed ?

In offline problems where demands (more generally, utility functions or agent types) for all agents are known to the principal, there are many well-studied notions of fair allocation of limited resources. A relevant notion in our context is that a fair allocation is one satisfying three desiderata: Pareto-efficiency (for any agent to benefit, another must be hurt), envy-freeness (no agent prefers an allocation received by another), and proportionality (each agent prefers the allocation received versus equal allocation). This definition draws its importance from the fact that in many allocation settings it is known to be achievable. In particular, when goods are divisible, then for a large class of utility functions, an allocation satisfying both is easily computed (via a convex program) by maximizing the Nash Social Welfare (NSW) objective subject to allocation constraints [43, 18] .

Many practical settings, however, operate more akin to the FBST mobile food pantry, in that the principal makes allocation decisions online, with incomplete knowledge of demands (more generally, utility functions) of future agents. However, these principals do have access to historical data allowing them to generate histograms over utility functions for each agent. Designing good allocation algorithms in such settings necessitates harnessing the (Bayesian) information of future demands to ensure equitable access to the resource, while also adapting to the online realization of demands as they unfold. Guaranteeing Pareto-efficiency, envy-freeness, and proportionality simultaneously turns out to be impossible in such settings (cf. Lemma 2.3); the challenge thus is in defining meaningful notions of approximately-fair online allocations, and developing algorithms which utilize distributional knowledge to achieve such allocations.

Mobile Food Pantry. Recent demands for food assistance have climbed at an enormous rate, and an estimated fourteen million children are not getting enough food due to the COVID epidemic in the United States [29, 10] . With limitations on operating in-person stores, many foodbanks have increased their mobile food bank services. In these systems, the mobile food-bank must decide on how much food to allocate to a distribution center on arrival, without knowledge on the demands for locations to come.

Stockpile Allocation. In many healthcare systems, states decide how to assign critical resources to patients [17, 25] ; for example, the US federal government has recently been tasked with distributing Remdesivir, an antiviral drug used for COVID-19 treatment [31] . Another example is assigning patients to psychiatric beds, which have become more and more scarce in recent times [36] . These decisions are made online, and aim to satisfy individual demands while efficiently using available resources.

Reservation Mechanisms. These are key for operating shared high-performance computing (HPC) systems [23] . Cluster centers for HPC receive numerous requests online with varying demands for CPUs and GPUs. Algorithms must allocate resources to incoming jobs, with only distributional knowledge of future resource demands. Important to these settings is the large number of resources (number of GPUs, RAM, etc available at the center), requiring algorithms that scale to higher-dimensional problems.

We first formalize the online stochastic fair allocation problem described above, and demonstrate that in the online setting, there are distributions for which no policy can achieve Pareto-efficiency, envy-freeness, and proportionality over all realizations. This motivates studying approximate notions.

For any allocation, a natural (un)fairness score is the maximum (alternately, weighted sum) of the deviation of the realized utilities in terms of envy-freeness, (normalized) Pareto-efficiency, and proportionality. In the online setting, any such score gives rise to a natural policy to minimize the expected value of this score, which can be formulated as a Markov decision process (MDP). However, since these metrics depend on the entire allocation vector, the complexity of finding the optimal policy is exponential in the number of agents, and also, is difficult to interpret in practice. 2 Our main conceptual contribution is an alternate objective for online fair allocation, wherein we aim to minimize E X opt − X alg max , the maximum difference between the allocation X alg made by any algorithm, and the offline (i.e., ex post) fair solution X opt . An approximation for this objective gives a c -approximation for the fairness scores defined above (for some problem-specific constant c; see Lemma 2.7). The usefulness of this reformulation, though, is not immediately apparent, as it is still a high-dimensional objective. However, we then show how this reformulation allows us to harness recent ideas in model-predictive control to come up with simple algorithms with strong empirical performance.

Our proposed online allocation policy, Hope-Online, is based on re-solving information-relaxed optimization problems, where all future randomness is replaced with expected histograms. Hope-Online is simple, scales to multiple resources, and in experiments, generates allocations close to the optimal fair allocation in hindsight. Moreover, it is balanced across agents, in that the per-agent difference in allocations between earlier and later arriving agents is uniformly small. Thus, we believe Hope-Online is a promising candidate for practical online fair allocation. We do not believe that our work gives the final answer in defining fairness in sequential settings, but hope it will start conversation on how to formally incorporate ethics in sequential AI algorithms.

Before proceeding, we discuss some closely related work -a more extensive survey is provided in Appendix A.

Fairness in resource allocation, and the use of Nash Social Welfare, was pioneered in seminal work by Varian [43, 44] . Since then, researchers have investigated fairness properties for both offline and online allocation, in settings with divisible and/or indivisible resources, and when either the agents or resources arrive online; for a comprehensive survey, see [3] . Our work focuses on online multi-resource allocation in Bayesian settings; in this context, previous work is mostly limited to non-adaptive algorithms, or consider adversarial arrivals. More importantly, we target additive approximations for individual metrics, instead of approximating global objectives (Eg. maxmin/Nash social welfare) which do not directly give any individual guarantees (see Appendix B.2).

The most common line of work in online fair allocation considers settings where agents are static and items arrive over time [16, 47, 24, 9, 6] ; under stochastic arrivals, these tend to be easier as, intuitively, future allocations can be used to correct past imbalances. Closer to our setting are work on online cake cutting; this though is primarily under adversarial arrivals [46, 26, 42] . Finally, recent work considers upfront allocation of indivisible resources for stochastic demands [17, 20] ; these study similar tradeoffs between global objectives and individual guarantees as us, but are essentially static problems.

In terms of modeling, the closest work to ours is that of [30] , who consider sequential allocation with stochastic demands (arising from similar practical problems with foodbank operations), and propose heuristics for maximizing the minimum utility. Their policies are defined only for single resource settings 3 . Using maxmin utility as an objective however leads to some instabilities in allocation policies (for example, a high demand upfront may lead to all future agents getting very small allocations); we demonstrate how our approach improves on this in our experiments.

A principal is tasked with dividing K divisible resources among n agents. Each resource k ∈ [K] has a fixed budget B k that the principal can allocate. Each agent i ∈ [n] has an endowment (or size) S i ∈ R and utility function u(X i , θ i ) where θ i ∈ Θ is a latent type or preference of agent i, and X i ∈ R K denotes the normalized allocation of resources received by agent i (i.e., overall agent i receives S i X i units 4 ). We assume the set of types Θ is finite, and the utility functions u(X, θ) are L-Lipschitz, concave, and strictly increasing with respect to the allocation X. Finally, we use S = n i=1 S i to denote the 'effective size' of the population. In the ex-post or offline setting, agents' types {θ i } i∈ [n] are known in advance and can be used by the principal to choose allocations X ∈ R n×K for each agent. In the online setting the principal visits each agent sequentially in a fixed order i = 1, . . . , n. Upon visiting agent i, the principal learns their latent type θ i drawn from a known distribution F i , and must choose allocation X i ∈ R K before continuing to the next agent. Allocation decisions are irreversible, and must obey the overall budget constraints.

Notation: We use R + to denote the set of non-negative reals, and X max = max i,k |X i,k | to denote the matrix maximum norm, and cX to denote entry-wise multiplication for a constant c. For allocation X ∈ R n×K + , we use X i = (X i,1 , . . . , X i,K ) to denote agent i's normalized allocation, and B = (B 1 , . . . , B K ) the budget vector. When comparing vectors, we use X ≤ Y to denote that each component X i ≤ Y i . Hence, budget constraints can be written as n i=1 S i X i ≤ B. Choices of Utility Functions: In the context of food-bank allocations, we will consider two utility functions of interest. With a single resource, a common utility function is the so-called filling-ratio u(X, θ) = min X θ , 1 [30] . This corresponds to agents having linear utility until their allocation reaches their demand level θ, after which the utility is capped. While these utility functions are not strictly increasing, we show that all of the results extend to these functions in Appendix D.

For multiple resources, a common choice of utility functions are linear utilities where u(x, θ) = θ, x . Now the latent agent type θ ∈ R K + denotes a vector of preferences over each of the different Figure 1 : The waterfilling solution for maximizing NSW with a single divisible resource and fillingratio utilities (agents on x-axis, demands/allocations on y-axis). The optimal NSW solution finds a threshold such that the sum of the areas below the demand and threshold equals the budget B.

resources. More details on modeling agent utilities and sizes are in Section 4 and Appendix E.

The assumption that latent types Θ are finite is common in decision-making settings, as in practice, this distribution over types is approximated from historical data. One limiting assumption is that in the online setting, the principal only knows the latent type of one agent at a time. In reality the principal could have some additional information about future types (via calling ahead, etc) that could be incorporated in deciding an allocation. Our algorithmic approach naturally incorporates such additional information.

To define a fair allocation in the offline setting (i.e., with known types {θ i } i∈[n] ), we adopt an approach proposed by Varian [43] , which is widely used in the OR and economics literature and commonly referred to as 'Varian Fairness'. We will refer to this as fairness for brevity, but for a more detailed discussion on the advantages and limitations of this model, see Appendix B. 

is said to be fair if it simultaneously satisfies the following:

1. Envy-Freeness (EF): For every pair of agents i, j, we have u(X i , θ i ) ≥ u(X j , θ i ).

Y = X such that u(Y i , θ i ) > u(X i , θ i ) for some agent i, there exists some other agent j such that u(Y j , θ j ) < u(X j , θ j ).

While the three properties form natural desiderata for a fair allocation, the power of this definition lies in that asking for them to hold simultaneously rules out many natural (but unfair) allocation policies. For example, in the food-bank setting with a single resource and filling-ratio utilities, any allocation that either exhausts the budget or meets the total demand ( i θ i ) is Pareto-efficient. One example of this is a Greedy algorithm that assigns X i = θ i , until running out of resources. However, the algorithm is not envy-free as any agent who receives no resources will be envious of an agent who does. On the other hand, the Equal-Allocation algorithm which assigns X i = B S units of resource to each agent trivially achieves EF and PO, but is not necessarily PE (e.g. when θ i < B/S for some i and θ j > B/S for some j). More generally, allocation rules based on maximizing a global function such as utilitarian welfare (sum of agent utilities) or egalitarian welfare (the maximin allocation, or more generally, the leximin allocation [12, 30] where one maximizes the minimum utility, and subject to that the second minimum, and so on) are Pareto-efficient, but tend to violate envy-freeness, as they focus on global optimality rather than per-agent guarantees. A remarkable exception to this, however, is the Nash Social Welfare n i=1 u(X i , θ i ) S i /S : Proposition 2.2 (Theorem 2.3 in [43] ). An allocation that maximizes the Nash Social Welfare is Pareto-efficient, envy-free, and proportional (hence fair).

In addition to simultaneously ensuring PE, EF and Prop properties, the NSW maximizing solution can also be efficiently computed via the following convex program called the Eisenberg-Gale program [18] , obtained by taking the logarithm of the Nash Social Welfare:

For a single divisible resource with filling-ratio utilities, the optimal solution to this program is found via a waterfilling algorithm, illustrated in Fig. 1 (see Section 4 for more details). Under more general utility functions and multiple resources, the optimal solution is more complex, but can be efficiently computed via standard convex programming techniques [13] . These two properties (that a maximizing allocation for NSW is easy to compute, and satisfies the fairness criteria simultaneously) is key to our proposed online allocation policies.

Recall that in the online setting the principal visits each agent sequentially in a fixed order i = 1, . . . , n, whereupon visiting agent i the principal sees their latent type θ i ∼ F i and decides on an allocation before continuing to the next agent. A natural approach to obtain fair allocations in this setting is to develop allocations which satisfy Pareto-efficiency, envy-freeness, and proportionality ex-post. However, we start with a negative result, showing that such an approach is infeasible even with two agents. Lemma 2.3. For n = 2 agents with filling-ratio utilities there exists type distributions F 1 and F 2 such that no online algorithm can guarantee ex-post envy-freeness and Pareto-efficiency almost surely.

Proof. Consider a setting with B = 2, and each agent having S i = 1 and demand type θ ∼ F i = Uniform(1 − δ, 1 + δ) for some arbitrary constant 0 < δ < 1. With probability 1 2 the algorithm will observe that the first agent has a demand of 1 + δ. When the second agent has demand 1 + δ the optimal fair allocation in hindsight will be X opt = (1, 1). When the second agent has demand 1 − δ the optimal solution will be X opt = (1 + δ, 1 − δ). Moreover, each allocation is the unique fair solution for the sequence of demand types (1 + δ, 1 + δ) and (1 + δ, 1 − δ) respectively. Hence, no algorithm can achieve the ex-post fair allocation on all sample paths. Lemma 2.3 shows that simultaneously achieving ex-post envy-freeness and Pareto-efficiency is futile, and hence we need to consider approximate fairness notions. While the example is trivial, it highlights the true difficulty in designing online allocations. Any 'fair' online allocation upon visiting agent i must adapt to realized types thus far ({θ j } j≤i ) and exploit the type distribution for future agents ({F j ) j>i }). It also highlights the difficulty in designing approximate fairness notions, as any such approach must be constructed with individual guarantees in mind.

On that note, a reasonable modified fairness criteria is to seek allocations X alg ∈ R n×K that minimize the distance from (ex-post) envy-freeness, Pareto-efficiency, and proportionality. From a practical perspective, moreover, while measuring ex-post envy and distance to proportionality is simple, measuring the distance of an allocation to Pareto-efficiency is not straightforward. An important proxy, however, is the resource waste B − i S i X i under any allocation; this follows from observing that any PE algorithm necessarily has no waste (see Appendix D for proof and discussion of an alternative approach):

We now define our proposed online fairness yardstick: 

to Pareto-efficiency as

the distance of {X alg i } i∈[n] to proportionality as

Note these are all random quantities, depending on both the realized types but also randomness in the allocation algorithm; moreover, any fair allocation necessarily has all of these quantities bounded above by zero. The distance to envy-freeness can be thought of as the worst-case envy of any individual, mimicking Definition 2.1; the distance to Pareto-efficiency is taken to be the average per-agent excess of resources wasted by the algorithm; the distance to proportionality is the worst-case loss agent experiences under their allocation compared to equal allocations. The normalization ensures that all are measured on the same scale.

A natural definition of an optimal online fair allocation now is one which minimizes

(or alternately, any weighted linear combination). Given type distributions {F i } i∈[n] , finding such an allocation gives rise to a high-dimensional MDP, as each of the properties depends on the entire matrix of allocations. Moreover, since the problem has no obvious structure, the optimal solution may not have a simple form, and can be difficult to interpret.

To get around this, we consider an alternate objective function. Let X opt be the NSW maximizing solution in hindsight (i.e., given realized types {θ i }, solving Eq. (1)). We instead seek allocations that try to uniformly minimize the expected difference between X alg and X opt , which we know to be a fair allocation. Definition 2.6 ( -Fair Allocation). Given type distributions {F i } i∈[n] and utility functions {u(·, θ)} θ∈Θ , we say an online allocation algorithm X ∈ R n×K + is -fair if n i=1 S i X i ≤ B almost surely, and moreover,

where X opt is the NSW maximizing solution in hindsight.

One advantage of this definition is that any allocation algorithm which is -fair also satisfies similar -additive guarantees in expectation for the earlier defined metrics (Definition 2.5) as established below.

Lemma 2.7. Suppose that an algorithm X alg satisfies E X opt − X alg max ≤ . Then we have

• Approximate Envy-Freeness:

• Approximate Pareto-Efficiency:

Proof. (see Appendix D) Follows directly from definitions and Lipschitzness of utility functions.

In this section we present Hope-Online and its counterpart Hope-Full, scalable algorithms that approximate the Nash Social Welfare solution. These solutions are motivated by approximation algorithms to dynamic programming solutions generated by resolving relaxed versions of the optimization problems [45] . Moreover, the allocation rule for the algorithm corresponds to an easily computable policy, is interpretable, and scales well under multiple resources. Our main algorithm, Histogram of Preference Estimates Online, or Hope-Online, arises from the observation that a natural approximation algorithm arises from replacing unknown quantities in Eq. (1) with their distribution. Moreover, in any envy-free allocation, two agents i and j with the same type θ will acquire the same allocation. Hence, we can rewrite Eq. (1) as follows:

where the allocation to any agent i with type θ is X θ . The Hope-Online algorithm now approximates the NSW allocation in Eq. (2) by re-solving the above program while replacing the unknown quantities

with their expectation (P(θ i = θ)). In more detail, for the i th agent, given the current budget vector B i , we re-solve the Eisenberg-Gale program in Eq. (2) with all future demand replaced by the expected future histogram over types. Formally, at iteration i, the algorithm observes the latent type θ i for agent i and allocates X alg i = X θ i according to the solution to:

is the current available resources, and the expected histogram over types is defined as

Note here that as the type θ i for agent i is observed, the probability for agent i is replaced by the Dirac-δ function on the observed value.

An alternative algorithm is Histogram of Preference Estimates Full, or Hope-Full, which follows the same idea but instead solves the Eisenberg-Gale program with all agents (including agents already visited) to decide an allocation. At iteration i, the algorithm observes the types {θ j } j≤i and allocates to agent i the allocation X alg i = min(X θ i , B i ) according to the solution to:

and the expected histogram over types now given by:

This forms a natural estimator for Eq. (2), in that for the last agent, it solves the exact same optimization problem as Eq. (2). However, experiments show that its performance is worse than Hope-Online, as the algorithm uses the original budget B instead of the remaining budget.

We now test our algorithms on both single and multiple resource allocation settings, with experiment parameters based on food-bank allocation data. For the full experimental details, details on the dataset, run-time analysis, and measures of variance of these results see Appendix E.

In each experiment, we compare the allocations X alg given by the Hope-Online algorithm to the NSW maximizing allocation in hindsight X opt (i.e. the solution to Eq. (2)). We report multiple performance measures, including the maximum allocation deviation E X alg − X opt max , the agent-by-agent difference in allocations E X alg i − X opt i , and expected approximate envy-freeness, Pareto-efficiency, and proportionality (see Definition 2.5). Finally, to benchmark the performance of Hope-Online, we simulate several alternate heuristics, including the Hope-Full heuristic, as well as other optimization-based approaches discussed in Appendix C. Table 1 : Comparison of fairness metrics (averaged over 1000 replications) on the single-resource online allocation problem with filling-ratio utilities. We compare the four unfairness metrics from Definitions 2.5 and 2.6 (larger values correspond to lower scores; best value highlighted), over the two datasets described in Section 4.1 (i.i.d discretized Gaussian demands with n = 100, demands generated from FBST dataset with n = 6). Hope-Online is best or second-best across all metrics and settings (note that Greedy naturally minimizes waste, while Adaptive-Threshold ensures Proportionality by design).

n = 6 n = 100 n = 6 n = 100 n = 6 n = 100 n = 6 n = 100 

We first consider a simple single-resource variant of the food-bank allocation problem motivated in Section 1.1. A mobile food pantry loads up the truck at the start of the day with a fixed number of 'meals' B, and travels sequentially from one drop-off location to the next. In each location i, they observe a demand θ i ∈ R + , make an allocation, and then proceed to the next food drop-off point. We consider filling-ratio utilities u(X, θ) = min( X θ , 1). These utility functions are of interest to food-banks as it serves as a common metric used in evaluating the effectiveness of a food bank. In particular, the filling ratio a food bank is able to provide to each of the distribution sites often serves as one component of a 'score' used in markets deciding how many resources a food bank receives [37] .

We start by including a structural result. While the filling-ratio utilities chosen are not monotonically increasing (due to the min), the Eisenberg-Gale program in Eq. (1) still guarantees a fair allocation. Moreover, the optimal solution can be characterized via a Waterfilling solution (see Fig. 3 ). Since Hope-Online also allocates resources according to the solution to an optimization problem of a similar form (albeit with different weights, based on current available information), the resulting allocation also takes a waterfilling form, with the waterfilling threshold adjusted over time.

Theorem 4.1. An optimal solution to Eq. (1) for a fixed set of agent demands {θ i } i∈[n] is given by a waterfilling policy

where the waterfilling threshold w f solves

Moreover, this allocation is Pareto-efficient, envy-free, proportional, and hence fair.

Choice of Demand Distribution. We perform several synthetic experiments, with parameters based on food bank demand data. We present results for two simulations here, and defer others to the appendix. The first simulation considered the performance of Hope-Online with respect to the number of agents n. Here we model the demand distribution F i as an i.i.d. discretized Gaussian distribution with mean 15 and variance of 3. The second simulation considers a more realistic scenario, where the number of agents n = 6 (corresponding to the six counties in the Southern Tier of New York) and a demand distribution as a discretized Gaussian with mean and variance generated from a demand histogram collected from a dataset on food demands by county.

In the experiments we set endowment variables S i = 1 as the demand θ i represents the 'size' of an agent. We also set the budget as the expected sum of demands. From a practical perspective, this is a typical choice made by food banks as to how many meals/boxes to prepare; theoretically, this is meaningful, as the optimal solution will give everyone an equal allocation if the budget is smaller than n times the smallest possible demand, and give everyone their demand if the budget is larger than n times the largest possible demand, both of which make the online problem trivial. The expected sum of demands is between these two bounds, forcing the online algorithms to make non-trivial decisions that can significantly impact their performances.

Heuristics. We compare Hope-Online to several natural online fair-allocation heuristics:

• Greedy: every agent is given its demand θ i until the budget is depleted.

• Adaptive-Threshold: Allocate X i = min(B i /(n − i), θ i ), the equal share of the remaining budget.

• MaxMin: The heuristic proposed in [30] for optimizing the maxmin allocation.

• Hope-Full: Re-solves Eq.

(2) over all agents, using predictive-histogram for future agents (cf. Section 3). • ET-Online, ET-Full: Alternate model-predictive heuristics based on re-solving Eq. (2) with expected utility functions (See Appendix C)

Distance to Fair Allocation in Hindsight. In Fig. 2 we compare E X opt − X alg ∞ , the difference in the allocation generated by the algorithm and the optimal fair solution in hindsight as we vary the number of agents n. We see that Hope-Online generates allocations close to the optimal solution in hindsight. Unsurprisingly, the Greedy and Adaptive-Threshold algorithms perform worse with an increasing number of agents. Thus Hope-Online does well at ensuring approximate individual fairness guarantees (Definition 2.6). Moreover, for the 1 norm between the allocations, our experiments show Hope-Online has sub-linear dependence with n (see Appendix E).

Group-Based Differences in Allocation. In Fig. 3 , we study if the algorithms tend to under/over allocate based on the order of arrival. For this, we set n = 100, and compare E X opt i − X alg i for each agent i = 1, . . . , 100 (averaged over 100 replications). Hope-Online gives a uniform approximation to the offline NSW solution, with small performance degradation for later agents. In contrast, MaxMin is too pessimistic and under-allocates to all agents.

Online Fairness Metrics. In Table 1 we compare each algorithm on the fairness metrics for a fixed number of agents (on the two problem set-ups described). With these metrics we see that Hope-Online performs competitively across a wide variety of distributions and problem set-ups. Moreover, in Appendix E we show further simulation results, showing that Hope-Online even outperforms MaxMin on the metric of the minimum fill rate (E[min i u(X i , θ i )]), a metric MaxMin was designed to maximize!

We next consider multiple resource allocation settings, with linear utility functions u(x, θ) = θ, x . We use n = 6 agents, with sizes S i corresponding to population of the six counties of the Southern Tier of New York. For the utility functions, we use data from [37] on the 'prices' on different resources in the non-monetary mechanism used to allocate resources to major food bank distribution centers; we consider a subsection of products presented: cereal, diapers, pasta, paper, prepared meals, rice, meat, fruit, and produce. To generate our preference distributions, we created eight different preference profiles, with each θ k = Bernoulli(1/2)w k where w k is the price of the product. The type distribution is chosen to be uniform over these types.

Heuristics We again test Hope-Online against the alternative heuristics introduced before (note that Greedy, Adaptive-Threshold and MaxMin do not extend to multiple resources).

An additional heuristic here is Proportional allocation, which assigns X i = B/S to each agent. By design, this allocation satisfies ∆ P E , ∆ EF , and ∆ P rop = 0. However, the allocation can be arbitrarily bad in terms of Pareto-efficiency, and so we omit it from the comparison here.

Online Fairness Metrics. In Table 2 we compare each algorithm on the fairness metrics for a fixed number of agents n = 6 averaged over 100 simulations. With these results we again see that Hope-Online performs competitively across each benchmark compared to the other algorithms. Moreover, it performs the best in terms of minimizing the distance to the optimal allocation in hindsight, E X opt − X alg max .

In this paper we considered online allocation of divisible resources. In the offline setting achieving a fair allocation scheme is found by maximizing the Nash Social Welfare objective. In the online setting, however, we showed that no algorithm can achieve ex-post fairness almost surely. In light of this, we defined a new notion of approximate fairness, where an approximately fair allocation algorithm is one which is close to the fair Nash Social Welfare maximizing solution in hindsight. This definition is natural, as any approximately fair algorithm satisfies approximate counterparts to Pareto-efficiency, envy-freeness, and proportionality. While the usefulness of this objective is not immediately apparent, we show that it leads to a simple algorithm Hope-Online which approximates the offline solution by solving informationrelaxed versions of the Eisenberg-Gale program, where unknown quantities are replaced with their histogram. Through experiments we show that Hope-Online leads to allocations with fairness properties competitive to several benchmarks and prior work.

Although fairness in resource allocation is well-studied in the offline setting, fairness metrics for the sequential setting are very poorly understood. Our proposed metrics (Definition 2.5) give a novel way to extend Varian's definitions to the sequential setting. While we do not believe our work gives the final answer in defining fairness in sequential settings, we hope it starts a conversation on how to formally incorporate ethics and fairness constraints in sequential allocation problems.

Fairness in resource allocation, and the use of Nash Social Welfare, was pioneered by Varian in his seminal work [43, 44] . Since then, researchers have investigated fairness properties for both offline and online allocation, in settings with divisible or indivisible resources, and when either the agents or resources arrive online. We now briefly discuss some related works; see [3] for a comprehensive survey. What distinguishes our setting from many of the previous works is that we consider the online Bayesian setting with a known distribution. Many previous works are either limited to offline or non-adaptive algorithms, or consider adversarial online arrivals.

Food Bank Operations: There is a growing body of work in the operations research literature addressing logistics and supply chain issues in the area of humanitarian relief and food distribution [40, 35] . The research focuses on designing systems which balance efficiency, effectiveness, and equity. In [19] they study the logistical challenges of managing vehicles with limited capacity to distribute food and provide routing and scheduling protocols. In [30] they consider sequential allocation with an alternative objective of maximizing the minimum utility (also called the leximen in the literature [34] ). We instead consider sequential allocation of resources under the Nash Social Welfare objective to obtain equitable allocations [43] .

Cake Cutting: Cake cutting serves as a model for dividing a continuous object (whether that be a cake, advertisement space, land, etc) [14, 38] . Under this model, prior work considers situations where individuals arrive and depart during the process of dividing a resource, where the utility of an agent is a set-function on the interval of the resource received. Researchers analyze the offline setting to develop algorithms to allocate the resource with a minimal number of cuts [15] , or online under adversarial arrivals [46] . Our model imposes stochastic assumptions on the utilities for arriving agents and characterizes probabilistic instead of sample-path fairness criteria.

Online Resources: One line of work considers the resource (here to be thought of as the units of food, processing power, etc) are online and the agents are fixed [11, 4, 33, 32, 2] . In [47] they study the tradeoffs between fairness and efficiency when items arrive adversarially. Another common criteria is designing algorithms which are envy-free up to one item, where researchers design algorithms that can reallocate previously allocated items, but try to minimize these adjustments [24, 7] .

Online Agents: The other setting considers agents as arriving online and the resources as fixed. In [27] they consider this setting where the resources are indivisible with the goal of maximizing utilitarian welfare (or the sum of utilities) which provides no guarantees on envy-freeness. Another approach in [22] considers a scheduling setting where agents arrive and depart online. Each agent has a fixed and known arrival time, departure time, and demand. The goal then is to determine a schedule and allocation which is Pareto-efficient and envy-free. We instead consider a stochastic setting where each agent has a distribution on utilities and seek algorithms which satisfy probabilistic versions of Pareto-efficiency and envy-freeness.

Non-adaptive Allocations: A separate line of work considers fairness questions for resource allocations in a similar setting where the utilities across groups are drawn from known probability distributions [17, 20] . This line of work investigates probabilistic versions of fairness, where the goal is to quantify the discrepancy between the objectives of ensuring the expected utilization of the resources is large (ex-ante Pareto-optimal), while the probability of receiving the resource is proportional across groups (ex-ante proportional). However, they consider algorithms which decide on the entire allocation for each agent upfront before observing the utilities for any individuals rather than adaptive policies.

Adaptive Allocations: In contrast, we consider a model where the principal makes decisions on how much of the resource to allocate after witnessing an agent's type, where all future types are unknown. Most similar to our work is recent work analyzing a setting where agents arrive over time and do not depart, so that the algorithm can allocate additional resources to agents who arrived in the past [28] . We instead consider a stochastic setting where agents arrive and depart in the same step with the goal of characterizing allocations that cannot reallocate to previous agents. Other papers either seek competitive ratios in terms of the Nash Social Welfare objective [6, 9, 8] , or derive allocation algorithms which perform well in terms of max-min [30] .

Economists, computer scientists, and people in the operations research literature have become increasingly interested in questions of fairness [41, 43, 44] . One particular concept of fairness which has gained wide circulation due to its ease in compatibility is the so-called notion of 'Varian fairness' pioneered by Hal Varian in the 1970s taken in Definition 2.1. This theory of fairness provides three criteria for judging a given allocation of resources: envy-freeness, Pareto-efficiency, and proportionality, all defined with respect to utilities an agent has for different allocations. These criteria serve as more of a classification than an optimization perspective, as each of them merely provides a true/false criteria for an allocation to satisfy fairness rather than a way an allocation can approach fairness. Numerous other researchers have proposed other definitions of fairness, including α-fairness obtained by instead maximizing [34, 5] :

In this definition, taking α = 1 recovers utilitarian welfare, or maximizing the sum of utilities. Taking α → −∞ also recovers the leximin objective, and α → ∞ the leximax objective.

One primary critique of 'Varian fairness' is that a Varian fair allocation may not exist at all. Moreover, the implication is that if a Varian-fair allocation exists, then it has special merit. While we specifically consider settings where a 'Varian fair' allocation always exists (and is remarkably found as a result of optimizing the Nash Social Welfare objective), it is important to consider some of the several downsides of this model.

Scale Invariance: The concept of fairness is strictly operational, in the sense that it requires no more information than what is contained in an agent's utility function. Settings like matching students to local schools via school choice require definitions which measure the 'utility of replacement' [1] . As an example, a student with preferences (School A, School B, School C) in descending order gets matched to School B. How can we measure the overall gain to society when the student is instead matched to School A, or School C in comparison to another students list of preferences? In Varian's definition of fairness, utility functions are only used to exhibit an ordering on preferences, rather than a relative value on different outcomes.

We believe the settings considered in Section 1.1 are well suited to Varian's model on fairness.

In the example motivated with the Food Bank of the Southern Tier of New York, agents correspond to individual distribution sites, whether that be a soup kitchen, a drop-off location for the mobile food bank, etc. In these settings, locations have use for all resources with strictly increasing utility with respect to the resource allocated. This motivates using scale invariant measures, as every agent will be able to use all available food allocated to them. Considering processor assignment in cloud computing platforms, each individual request coming in should be treated independently and symmetrically.

One approach on obtaining fairness guarantees for an online algorithm could be in the form of a competitive ratio. These results find allocation algorithms X alg to which you can construct a bound on the competitive ratio for the Nash Social Welfare (or its logarithm):

.

While theoretically interesting as the typical Nash Social Welfare (or logarithm in the Eisenberg Gale program) are of a different form than typical competitive ratio guarantees done in computer science, these results provide no immediate individual fairness guarantees. The motivation for the Eisenberg-Gale program arises from the fact that fairness is a byproduct. To some extent, the actual objective value of an allocation is meaningless, and the objective is only taken as it serves as a proxy to obtain fair allocations. Ensuring a good competitive ratio has no direct guarantees on individual fairness. In many applications of resource allocation, stakeholders are more interested in obtaining individualized guarantees than global guarantees on social welfare. This motivated our alternative approach of designing algorithms with individualized guarantees in mind.

Here we describe the other heuristic algorithms considered in Section 4, and compare our approaches to similar approaches developed in [30] for the max-min objective.

A second derivation of heurstic algorithms is to notice that in many practical applications, the agent types θ ∈ Θ are in R K , corresponding to 'preferences' for each item. One simple heuristic approach in designing online allocation schemes is to replace θ i in the offline Nash Social Welfare optimization problem (Eq. (1)) with its expectation E[θ i ]. This leads to two simple heuristic algorithms which use the Expected Types and either the Online or Full information over previous agents.

In particular, the ET-Online algorithm at every iteration i, observes the type θ i and resolves the Nash Social Welfare objective with the current available resources and future agents types are replaced with their expectation. In particular, at every iteration i, the algorithm allocates X alg i = X i according to the solution to:

A similar approach would be the ET-Full algorithm that mimics Hope-Full by utilizing all of the prior observed types in designing an allocation. In particular, this heuristic allocates X alg i = min(X i , B i ) where X i is the solution to:

Similar to Hope-Full, this solves the exact same optimization problem as the offline optimal fair solution Eq. (2) for the last agent.

One downside to both ET-Online and ET-Full is that the expected type will not necessarily be in the support of the distribution. This negatively impacts the allocation returned by the algorithm since the offline Nash Social Welfare solution will only take into account observed types which fall into the support of the distributions. As observed in Section 4 these algorithms have worse performance compared to Hope-Full and Hope-Online.

Here we provide a brief explanation for the heuristic allocation algorithm for a single-resource and filling ratio utilities from [30] . This algorithm was set-up to approximate the optimal solution to the max-min objective:

u(X i , θ i ) which aims to uniformly (across all agents) maximize the filling ratio. Now the measure of performance for these algorithms is defined as ∆ M M = min i u(X alg i , θ i ). The MaxMin heuristic (called Two Node Decomposition Heuristic Algorithm in [30] ) arises from a specific form of the dynamic programming solution to the two-agent problem. In particular, the heuristic algorithm proceeds in three phases.

• Decomposition: The n agent resource allocation problem is decomposed into a sequence of two agent allocation problems.

• Supply Allotment: For each two agent problem, the available budget is divided so that each portion is utilized to solve the different two-agent problems, with the rest saved for the agents who are yet to be visited. The allotmentB i for a two agent problem for agents i and i + 1 is calculated byB

n j=i µ j where B i is the current budget remaining and µ i = E[θ i ].

• Resource Allocation: For each two node problem consisting of agent i and agent i + 1 the allocation being made is a threshold policy as follows:

(m i +m i+1 )/2 and β i min = min j<i u(X alg j , θ j ) is the minimum fill rate thus far.

We use this algorithm as a benchmark in Section 4 since it considers a similar problem set-up and utility function but with a different objective. We compare our algorithms to MaxMin, including the metric E[∆ M M ] to which this algorithm was constructed for in Appendix E.

In this section we include the omitted proofs from the main paper. We restate each of them here for ease of presentation.

Lemma D.1 (Lemma 2.7) . Suppose that an algorithm X alg satisfies E X opt − X alg max ≤ . Then we have

• Approximate Envy-Freeness:

• Approximate Pareto-Efficiency:

Proof. Notice that for any agent i, by assumption on the utility functions being L-Lipschitz continuous we have that

Each of the three properties follows then by the fact that X opt is the optimal fair solution in hindsight.

Approximate Pareto-Efficiency: Notice that X opt is Pareto-efficient, and so by Proposition 2.4 we have that for any resource k, B k = n i=1 S i X opt i,k . Thus we have that

Thus we get by taking the maximum over k that

Approximate Proportionality: By adding and subtracting u(X opt i , θ i ) and using the fact that X opt is proportional so u(X opt i , θ i ) ≥ u(B/S, θ i ) we get:

Since this inequality is true almost surely for any i we can take the maximum of the left hand side over i and the expectation of both to show that E[∆ P rop ] ≤ L .

Approximate Envy-Freeness: Again by adding and subtracting and noting that for any j as the optimal solution is envy-free u(X opt j , θ i ) ≤ u(X opt i , θ i ) for any j and i we have that

As this is true for any i and j, we take the maximum over all i and j on the left hand side and expectations to show that E[∆ EF ] ≤ 2L .

From a practical perspective, measuring the distance to Pareto-efficiency is not straightforward. One important proxy in the context of food bank resource allocation is waste, defined as B − n i=1 S i X i under an allocation X. This is the vector of leftover unallocated resources by an allocation.

Proposition D.2 (Proposition 2.4). If an allocation X ∈ R n×K is Pareto-efficient, then we have

Proof. Suppose for sake of contradiction we have an allocation X ∈ R n×k which is Pareto-efficient, but there exists a resource k such that i S i X i,k < B k . Consider a new allocation Y such that Y i,k = X i,k + B k − i S i X i,k for any agent i with strictly increasing utility for resource k. Then Y is still a valid allocation, and agent i has strictly higher utility under this allocation (⇒⇐) An alternative expression would be to define the distance to Pareto-efficiency in terms of the maximum difference between the allocation considered and a Pareto-efficient one. In particular we have:

Definition D.3. Given a set of agent types {θ i } i∈[n] and utility functions {u(X, θ)} θ∈Θ and budget B we define the set of Pareto-efficient allocations as:

With this, the alternative distance to Pareto-efficiency for an allocation X alg is defined as

Note that under this definition, both the allocation X alg given by the algorithm is a random variable (depending on randomness in the algorithm), but also the set P E({θ i } i∈[n] , B) (as its definition depends on the realized types for each agent). With this, we notice that our definition of an -optimal allocation algorithm also satisfies guarantees in terms of this alternative definition for distance to Pareto-efficiency. In particular we have Lemma D.4. Suppose that the algorithm X alg satisfies E X alg − X opt max ≤ . Then we also have that E ∆ P E ≤ .

Proof. This follows from noting that the optimal solution X opt is Pareto efficient based on the realized types and so belongs in P E({θ i } i∈[n] , B). Thus we have that ∆ P E ≤ X alg − X opt . The result follows from taking the expectation of both sides.

In this section we specialize the previous results to the single-resource case with filling ratio utilities u(X, θ) = min( X θ , 1). Unfortunately, as these utility functions are not strictly increasing many of the previous proofs do not directly follow under these utility functions. However, we show respective definitions and results which follow from efficient solutions to the Eisenberg-Gale program with homogeneous concave utility functions [39] . 

Moreover, this allocation is Pareto-efficient, envy-free, proportional, and hence fair.

Proof. First we show that the optimal solution to Eq. (1) is the waterfilling solution by showing it satisfies the KKT conditions, before proceeding to show that the solution is also fair.

Let X opt i be the proposed optimal solution. In taking the Lagrangian of the optimization problem we introduce dual variables λ for the constraint that X ≥ 0 and µ for the constraint that n i=1 S i X i ≤ B. This yields the following Lagrangian optimization problem:

Taking the sub-gradient and using the complementary slackness conditions we get the following conditions:

Since X opt i = 0 we can safely set the dual variables λ i = 0. Moreover, we also have that the subgradient of the log utility term is:

To summarize we have the following conditions after eliminating the dual variables λ:

First consider the case when the dual variable µ = 0. Then by the subgradient condition we must have that X i ≥ θ i for every i, and by feasability that n i=1 X i ≤ B. The waterfilling solution will then have a waterfilling level w f = max i=1,...,n θ i and the allocation will be X opt i = θ i . This solution satisfies all of the conditions listed, and so will be optimal.

For the case when µ = 0 then we must have that n i=1 S i X i = B by complementary slackness. Moreover, the gradient condition asserts that X i ≤ θ i for every i. Without loss of generality we will assume that θ 1 ≤ θ 2 ≤ . . . ≤ θ n and break into cases:

S . In this case the optimal solution X opt i = B S = w f is the waterfilling solution. This is as we can set µ = S B and check that all of the KKT conditions hold. Case II: θ n < B S . This is impossible as we must have that n i=1 S i X i = B ≤ n i=1 S i θ i < B. Case III: Otherwise let µ = 1 w f where w f is the waterfilling level. Then for any agent i such that

Moreover, for any agent such that X opt i < θ i then the allocation is

Noting that this mimics the subgradient condition we see that the waterfilling solution satisfies the complementary slackness conditions and so is optimal.

Next we show that the waterfilling solution is envy-free, proportional, and Pareto-efficient. We start by showing that w f ≥ B S . Indeed,

Summing up from i = 1, . . . , n and multiplying the terms by S i we find that

Now suppose for the sake of contradiction that there exists some other allocation Y ∈ R n such that u(X opt i , θ i ) < u(Y i , θ i ) and u(X opt j , θ j ) ≤ u(Y j , θ j ) for every j = i. Then we must have that X opt i < θ i as otherwise the utilities would both be one, and so X opt i = w f . Moreover, by definition of the waterfilling level this implies that n i=1 S i θ i > B and so n i=1 S i X opt i = B. Consider any agent j = i, as u(Y j , θ j ) ≥ u(X opt j , θ j ) we must have that Y j ≥ X opt j as the utilities are increasing. Hence we see that

which contradicts Y being a valid allocation (⇒⇐).

Proportional : Note that for any group i such that X opt i = θ i then

Similarly if X opt

Envy-Free: Consider a group i and let j be any other group. If X opt i = θ i then u(X opt i , θ i ) = 1 ≥ u(X opt j , θ i ) so the agent is trivially envy free. Otherwise, if X opt i = w f then for any group j either X opt j = θ j or X opt j = w f . If X opt j = w f then clearly u(X opt i , θ i ) = u(X opt j , θ i ). However, if X opt j = θ j then it must be true that θ j ≤ w f and so 

Here we provide a discussion on all of the simulations conducted. Some of this will be a repeat of Section 4 while adding measures of variance of the results. For ease of presentation we include the same benchmark algorithms discussed previously and include further discussion and other heuristic algorithms in the attached code base. Moreover, all figures are deferred until after the discussion to save on space. Each simulation uses data arising from our motivation in resource allocation for Food Banks, and we compare the algorithms on the approximate fairness definitions from Definition 2.5 and Definition 2.6.

We start by discussing the single-resource variant of the food bank allocation problem motivated in Section 1.1. In this setting, a mobile food pantry loads up the truck at the start of the day with a fixed number of 'meals' B, and travels sequentially from one drop-off location to the next. At each location i, they observe a demand θ i ∈ R + drawn from a known distribution F i , make an allocation X alg i , before proceeding to the next drop off point. We consider the filling ratio utilities u(X, θ) = min( X θ , 1). These utilities are of particular importance to food-banks for several reasons: • Feeding America collects millions of pounds of food donations across the United States and uses a centralized allocation mechanism to redistribute these resources to food banks across the country. As a proxy for money, each food bank is given a daily budget to use in the auction based on their 'Goal Score'. Major components of this score include population, local supply, but also efficiency, i.e. the relative demand the food bank is able to satisfy for its distribution sites [37] . Filling ratio utilities serve as a proxy for ensuring efficiency across different drop-off locations and distribution sites.

• The utility functions are normalized by the relative demand of the different locations, placing distribution sites of varying sizes at equal levels.

We start by discussing the structural result. While the filling ratio utilities chosen are not monotonically increasing (due to the minimum), the Eisenberg-Gale program in Eq. (1) still guarantees a fair allocation. Moreover, the optimal solution can be characterized via a Waterfilling solution (seen in Fig. 3) .

Theorem E.1. An optimal solution to Eq. (1) for a fixed set of agent demands {θ i } i∈[n] is given by a waterfilling policy

where the waterfilling threshold w f solves

Moreover, this allocation is Pareto-efficient, envy-free, proportional, and hence fair.

Important to note, is that the allocation algorithms discussed, (Hope-Online, Hope-Full, ET-Online, ET-Full) all make allocation decisions based on a formulation of the Eisenberg-Gale where unknown quantities are replaced with their histogram or expectation. A slight caveat is that Hope-Online and Hope-Full solve optimization problems where the summation is over agent types instead of over agents. For each of the simulation results, we include an additional plot of the estimated waterfilling level for a specific agent, i.e. w i f , which is the waterfilling threshold the optimization problem for the algorithm returns when visiting agent i. As an example, the Hope-Online algorithm allocates according to the solution to:

i−1 is the current available resources, and the expected histogram over types is defined as

Thus, the waterfilling threshold at agent i, w i f will be the solution to:

Heuristics: We compare the following algorithms:

• Hope-Online (see Section 3)

• Hope-Full (see Section 3)

• ET-Online (see Appendix C.1)

• ET-Full (see Appendix C.1)

• MaxMin (see Appendix C.2)

• Greedy: where every agent is given its demand θ i until the budget is depleted

• Adaptive-Threshold: where the allocation X alg i = min(B i /(n − i), θ i ) provides either an agents demand or an equal share of the remaining budget.

Choice of Demand Distributions: We perform several synthetic experiments, with demand parameters based on food bank demand data. In each simulation we set the size S i = 1 for each agent, and sample the demands θ i as follows:

• FBST Dataset: Here we consider the setting with n = 6 agents corresponding to the six counties serviced by the Food Bank of the Southern Tier of New York [21] . For the type distributions, we set θ i ∼ F i where F i is a discretized Gaussian with mean and variance based on historical food demands. We normalized the means of the distributions for them to total to one hundred (shown in Table 3 ).

• Gaussian Demands: Here we set the type distribution θ i ∼ F i as an i.i.d. discretized Gaussian distribution with mean fifteen and variance three, where we discretized the distribution into twenty buckets.

• Poisson Demands: Here we set the type distribution θ i ∼ F i as an i.i.d. discretized Poisson distribution with λ = 10. To discretize the distribution to have finite support we set the mass of all points larger than twenty to be zero, and redistributed that mass to P(θ i = 1).

• Simple Distribution: Here we set the type distribution θ i ∼ F i where F i = Uniform{1, 2}.

Metrics Included: We include plots the simulation results in Figures 4, 5, and 6 , and table of the fairness metrics in Tables 7, 8, 9, and 10 . In the figures we include four plots of the following:

• E X opt − X alg max , the expected maximum difference between allocations as we scale the number of agents n from 1 to 100

• E X opt − X alg max , the expected additive difference between allocations as we scale the number of agents n from 1 to 100

• E w i f , the expected waterfilling threshold of the algorithms on agent i with n = 100 agents total • E |X alg i − X opt i | , the expected difference in allocations for agent i with n = 100 agents total

In the tables we include:

• E X opt − X alg max , the expected maximum difference between the allocation generated by the algorithm and the fair one in hindsight (Definition 2.6)

• E X opt − X alg 1 , the expected additive difference between the allocation generated by the algorithm and the fair one in hindsight • E[∆ EF ], the expected maximum envy between any two agents (Definition 2.5)

• E[∆ P E ], the expected waste (Definition 2.5)

• E[∆ P rop ], the expected maximum envy between an agent and equal allocation (Definition 2.5)

• E[∆ M M ] = E min i u(X alg i , θ i ) , the expected minimum fill rate [30] Summary of Results: Here we provide a brief discussion on the major metrics and plots.

Scaling with n: From the plots (figures 4, 5, and 6) we see that Hope-Online performs competitively across all metrics. In particular, from the plots in the top-left showing E X alg − X opt max we see that Hope-Online has constant scaling with respect to the number of agents n in comparison to the other heuristic algorithms. Similar results are shown in the plots in the top right, where we see E X alg − X opt 1 . In these plots, Hope-Online has sublinear dependence with respect to the number of agents n. This supports Hope-Online as being a promising candidate for achieving approximate fairness (Definition 2.6).

Moreover, the MaxMin algorithm often has linear dependence for the 1 norm. This arises as the algorithm under allocates at every time period (shown in the plots in the bottom right). This is due to the formulation of the algorithm to provide guarantees in terms of the max-min objective, where once a mistake has been made the algorithm has no incentive to correct it. This is typical for allocation algorithms formulated under the max-min objective, and motivates other fairness objectives (such as Definition 2.5).

Lastly, in the plot on the bottom left we see that Hope-Online uses a waterfilling threshold that closely matches the true one (shown in black in the figures). The figures on the bottom right shows that Hope-Online provides a uniform approximation to the optimal fair allocation in hindsight X opt , with a slight deviation for later arriving agents. This is in contrast to other heuristics, which often have extreme suboptimal performance for later agents (as in the case of Greedy), or uniformly underallocates (as in the case of MaxMin). Lastly we see that Hope-Online has a waterfilling level that closely matches the true one, and that the algorithm is uniformly close to X opt across each agent.

Performance on fairness distance (Definition 2.5 and Definition 2.6): In the tables we see that Hope-Online either has the best, or second best performance across all of the fairness metrics considered. Important to note, is that Hope-Online is also competitive in terms of ∆ M M , an objective that MaxMin was formulated to perform with respect to.

Here we consider the multiple resource allocation problem with linear utility functions u(x, θ) = x, θ . Now, the agent type θ ∈ R k represents a vector of preferences, where θ k is the agent's preference for resource k. We consider the resource allocation problems with n = 6 agents, corresponding to the six counties serviced by the Food bank of the Southern Tier of New York. The sizes S i are taken as representative of their total 2018 food demand normalized to sum up to one hundred, displayed in Table 3 [21] . We also include their respective populations, highlighting the choice of using their normalized food demands as a representative of their size instead of their population, as different counties have different food assistance needs irrespective of population.

Heuristics: We compare the following algorithms Table 4 [37] .

We sampled eight such type vectors, and then considered the uniform distribution over those eight types for each distribution F i , i.e. F i = Uniform(Θ).

Metrics Included: As the number of agents n = 6 is fixed we only include a table summarizing the metrics. In the table we include:

• E X opt − X alg max , the expected maximum difference between the allocation generated by the algorithm and the fair one in hindsight (Definition 2.6)

• E X opt − X alg 1 , the expected additive difference between the allocation generated by the algorithm and the fair one in hindsight Table 5 we compare the fairness metrics averaged over one thousand simulations with the addition of a standard normal confidence interval. Here we see that Hope-Online is competitive with respect to all metrics in comparison to the other heuristic algorithms. In all metrics, Hope-Online either performs the best or second-best. More important, is that Hope-Online performs the best in terms of minimizing the distance to the optimal allocation in hindsight, E X opt − X alg max . Table 5 : Comparison of fairness metrics averaged over 1000 simulations on the multiple-resource online allocation problem with linear utilities. We plot the four metrics from Definition 2.6 and Definition 2.5. Larger values corresponds to a lower score on that metric. We include both the mean and a standard normal approximation confidence interval for the metrics. Due to space constraints in the table, we include a separate row with the order of magnitude of the confidence intervals for these results.

Algorithm 

Experiment Setup: Each experiment was run with 1000 iterations where the relevant plots are taking the mean and a standard-normal confidence interval of the related quantities. In the case of a single resource the budget B is set to be the total expected demand. For the experiments with multiple resources we use a budget of the total preferences for each product as the allocations are the same up to scaling. All randomness is dictated by a seed set at the start of each simulation for verifying results.

Computing Infrastructure: The experiments were conducted on a personal computer with an AMD Ryzen 5 3600 6-Core 3.60 GHz processor and 16.0GB of RAM. No GPUs were harmed in these experiments.

Run-time Analysis: The average computation time for running a single iteration of the heuristic algorithms in comparison to the offline solution is listed in Table 6 . These statistics were computed by averaging over 1000 simulations. The case with n = 6 is on the multiple resource allocation dataset described in Appendix E.2. The case with n = 100 is on the single resource allocation dataset with an i.i.d. discretized Gaussian from Appendix E.1. These results mostly serve to indicate how the algorithms scale well and are easily implementable. Table 10 : Comparison of fairness metrics (averaged over 1000 replications) on the single-resource online allocation problem with filling-ratio utilities on the FBST Dataset problem. We compare the four unfairness metrics from Definitions 2.5 and 2.6 (larger values correspond to lower scores; best value highlighted) with the addition of E[∆MM ] = E mini u(X alg i , θi) , the minimum fill rate, and E X alg − X opt 1 , the 1 difference in allocations. Due to space constraints in the table, we include a separate row with the order of magnitude for the confidence intervals for these results. Figure 6 : Comparison of Hope-Online, Hope-Full, ET-Online, ET-Full, MaxMin, and the Greedy and Adapt-Threshold algorithms on a synthetic dataset where each type θ i ∼ Uniform{1, 2}. Top left: comparison of E X opt − X alg ∞ for the different algorithms as we scale n from 1 to 100. Top right: comparison of E X opt − X alg 1 for the different algorithms as we scale n from 1 to 100. Bottom left: comparison of the threshold used in the allocation for different groups, averaged over many simulations with n = 100 agents. Bottom right: comparison of the agent by agent allocation difference, E |X opt i − X alg i | for the different agents with a fixed n = 100 agents.

Roles for computing in social change

Monotone and online fair division

Online fair division: A survey

Online fair division: Analysing a food bank problem

Social choice and individual values

How to allocate goods in an online market?

Control of fair division

Online nash social welfare maximization via promised utilities

Fair resource allocation in a volatile marketplace

About 14 million children in the us are not getting enough to eat

How to make envy vanish over time

A new solution to the random assignment problem

Convex optimization

An envy-free cake division protocol

Fair Division: From cake-cutting to dispute resolution

The unreasonable fairness of maximum nash welfare

Fairness and utilization in allocating resources with uncertain demand

Aggregation of utility functions

The humanitarian pickup and distribution problem

Fair algorithms for learning in allocation problems

Food Bank of the Southern Tier of New York FBST

Fair online allocation of perishable goods and its application to electric vehicle charging

Dominant resource fairness: Fair allocation of multiple resource types

Achieving a fairer future by changing the past

One and done? equality of opportunity and repeated access to scarce, indivisible medical resources

Dynamic weighted fairness with minimal disruptions

A social welfare optimal sequential allocation procedure

No agent left behind: Dynamic fair division of multiple resources

never seen anything like it': Cars line up for miles at food banks

Sequential resource allocation for nonprofit operations

How feds decide on remdesivir shipments to states remains mysterious

Mechanisms for online organ matching

Fairness in deceased organ matching

Fair division and collective welfare

Achieving equity, effectiveness, and efficiency in food bank operations: Strategies for feeding america with implications for global hunger relief

Beyond beds:the vital role of a full continuum of psychiatric care

How food banks use markets to feed the poor

Cake cutting: not just child's play

Algorithmic game theory

Modeling for the equitable and effective distribution of food donations under stochastic receiving capacities

Is fairness good? a critique of varian's theory of fairness

Dynamic fair resource division

Equity, envy, and efficiency

Two problems in the theory of fairness

The bayesian prophet: A low-regret framework for online decision making

Online cake cutting

Fairness-efficiency tradeoffs in dynamic fair division

Part of this work was done while Sean Sinclair and Christina Yu were visiting the Simons Institute for the Theory of Computing for the semester on the Theory of Reinforcement Learning. We also gratefully acknowledge funding from the NSF under grants ECCS-1847393, DMS-1839346, CCF-1948256, and CNS-1955997, the ARL under grant W911NF-17-1-0094, and the Cornell Engaged Grant: Applied Mathematics in Action.