id sid tid token lemma pos
xg94hm54j30 1 1 optimal optimal ADJ
xg94hm54j30 1 2 decision decision NOUN
xg94hm54j30 1 3 - - PUNCT
xg94hm54j30 1 4 making making NOUN
xg94hm54j30 1 5 or or CCONJ
xg94hm54j30 1 6 policy policy NOUN
xg94hm54j30 1 7 satisfies satisfie NOUN
xg94hm54j30 1 8 both both CCONJ
xg94hm54j30 1 9 short short ADJ
xg94hm54j30 1 10 - - PUNCT
xg94hm54j30 1 11 term term NOUN
xg94hm54j30 1 12 and and CCONJ
xg94hm54j30 1 13 long long ADJ
xg94hm54j30 1 14 - - PUNCT
xg94hm54j30 1 15 term term NOUN
xg94hm54j30 1 16 objectives objective NOUN
xg94hm54j30 1 17 . . PUNCT
xg94hm54j30 2 1 policy policy NOUN
xg94hm54j30 2 2 optimization optimization NOUN
xg94hm54j30 2 3 in in ADP
xg94hm54j30 2 4 complex complex ADJ
xg94hm54j30 2 5 dynamic dynamic ADJ
xg94hm54j30 2 6 environments environment NOUN
xg94hm54j30 2 7 is be AUX
xg94hm54j30 2 8 notoriously notoriously ADV
xg94hm54j30 2 9 difficult difficult ADJ
xg94hm54j30 2 10 due due ADP
xg94hm54j30 2 11 to to ADP
xg94hm54j30 2 12 the the DET
xg94hm54j30 2 13 absence absence NOUN
xg94hm54j30 2 14 of of ADP
xg94hm54j30 2 15 succinct succinct ADJ
xg94hm54j30 2 16 mathematical mathematical ADJ
xg94hm54j30 2 17 models model NOUN
xg94hm54j30 2 18 . . PUNCT
xg94hm54j30 3 1 reinforcement reinforcement NOUN
xg94hm54j30 3 2 learning learning PROPN
xg94hm54j30 3 3 ( ( PUNCT
xg94hm54j30 3 4 rl rl PROPN
xg94hm54j30 3 5 ) ) PUNCT
xg94hm54j30 3 6 leverages leverage VERB
xg94hm54j30 3 7 data datum NOUN
xg94hm54j30 3 8 from from ADP
xg94hm54j30 3 9 experiments experiment NOUN
xg94hm54j30 3 10 or or CCONJ
xg94hm54j30 3 11 simulations simulation NOUN
xg94hm54j30 3 12 to to PART
xg94hm54j30 3 13 learn learn VERB
xg94hm54j30 3 14 optimal optimal ADJ
xg94hm54j30 3 15 policies policy NOUN
xg94hm54j30 3 16 that that PRON
xg94hm54j30 3 17 , , PUNCT
xg94hm54j30 3 18 in in ADP
xg94hm54j30 3 19 many many ADJ
xg94hm54j30 3 20 domains domain NOUN
xg94hm54j30 3 21 , , PUNCT
xg94hm54j30 3 22 have have AUX
xg94hm54j30 3 23 already already ADV
xg94hm54j30 3 24 outperformed outperform VERB
xg94hm54j30 3 25 policies policy NOUN
xg94hm54j30 3 26 designed design VERB
xg94hm54j30 3 27 by by ADP
xg94hm54j30 3 28 humans human NOUN
xg94hm54j30 3 29 . . PUNCT
xg94hm54j30 4 1 rl rl PROPN
xg94hm54j30 4 2 still still ADV
xg94hm54j30 4 3 faces face VERB
xg94hm54j30 4 4 numerous numerous ADJ
xg94hm54j30 4 5 challenges challenge NOUN
xg94hm54j30 4 6 in in ADP
xg94hm54j30 4 7 the the DET
xg94hm54j30 4 8 multi multi ADJ
xg94hm54j30 4 9 - - ADJ
xg94hm54j30 4 10 agent agent ADJ
xg94hm54j30 4 11 setting setting NOUN
xg94hm54j30 4 12 , , PUNCT
xg94hm54j30 4 13 where where SCONJ
xg94hm54j30 4 14 the the DET
xg94hm54j30 4 15 participating participate VERB
xg94hm54j30 4 16 agents agent NOUN
xg94hm54j30 4 17 interact interact VERB
xg94hm54j30 4 18 in in ADP
xg94hm54j30 4 19 shared shared ADJ
xg94hm54j30 4 20 environments environment NOUN
xg94hm54j30 4 21 . . PUNCT
xg94hm54j30 5 1 this this DET
xg94hm54j30 5 2 dissertation dissertation NOUN
xg94hm54j30 5 3 aims aim VERB
xg94hm54j30 5 4 to to PART
xg94hm54j30 5 5 address address VERB
xg94hm54j30 5 6 two two NUM
xg94hm54j30 5 7 challenges challenge NOUN
xg94hm54j30 5 8 in in ADP
xg94hm54j30 5 9 decentralized decentralized ADJ
xg94hm54j30 5 10 cooperative cooperative ADJ
xg94hm54j30 5 11 multi multi ADJ
xg94hm54j30 5 12 - - ADJ
xg94hm54j30 5 13 agent agent ADJ
xg94hm54j30 5 14 reinforcement reinforcement NOUN
xg94hm54j30 5 15 learning learning NOUN
xg94hm54j30 5 16 , , PUNCT
xg94hm54j30 5 17 a a DET
xg94hm54j30 5 18 new new ADJ
xg94hm54j30 5 19 training training NOUN
xg94hm54j30 5 20 paradigm paradigm NOUN
xg94hm54j30 5 21 that that PRON
xg94hm54j30 5 22 features feature VERB
xg94hm54j30 5 23 scalability scalability NOUN
xg94hm54j30 5 24 and and CCONJ
xg94hm54j30 5 25 privacy privacy NOUN
xg94hm54j30 5 26 guarantees guarantee NOUN
xg94hm54j30 5 27 for for ADP
xg94hm54j30 5 28 cooperative cooperative ADJ
xg94hm54j30 5 29 agents.in agents.in X
xg94hm54j30 5 30 the the DET
xg94hm54j30 5 31 first first ADJ
xg94hm54j30 5 32 part part NOUN
xg94hm54j30 5 33 , , PUNCT
xg94hm54j30 5 34 we we PRON
xg94hm54j30 5 35 study study VERB
xg94hm54j30 5 36 the the DET
xg94hm54j30 5 37 behavior behavior NOUN
xg94hm54j30 5 38 of of ADP
xg94hm54j30 5 39 the the DET
xg94hm54j30 5 40 cooperative cooperative ADJ
xg94hm54j30 5 41 agents agent NOUN
xg94hm54j30 5 42 in in ADP
xg94hm54j30 5 43 a a DET
xg94hm54j30 5 44 network network NOUN
xg94hm54j30 5 45 that that PRON
xg94hm54j30 5 46 includes include VERB
xg94hm54j30 5 47 adversarial adversarial ADJ
xg94hm54j30 5 48 agents agent NOUN
xg94hm54j30 5 49 . . PUNCT
xg94hm54j30 6 1 adversarial adversarial ADJ
xg94hm54j30 6 2 attacks attack NOUN
xg94hm54j30 6 3 in in ADP
xg94hm54j30 6 4 training training NOUN
xg94hm54j30 6 5 can can AUX
xg94hm54j30 6 6 strongly strongly ADV
xg94hm54j30 6 7 influence influence VERB
xg94hm54j30 6 8 the the DET
xg94hm54j30 6 9 performance performance NOUN
xg94hm54j30 6 10 of of ADP
xg94hm54j30 6 11 multi multi ADJ
xg94hm54j30 6 12 - - NOUN
xg94hm54j30 6 13 agent agent ADJ
xg94hm54j30 6 14 rl rl NOUN
xg94hm54j30 6 15 algorithms algorithm NOUN
xg94hm54j30 6 16 . . PUNCT
xg94hm54j30 7 1 it it PRON
xg94hm54j30 7 2 is be AUX
xg94hm54j30 7 3 , , PUNCT
xg94hm54j30 7 4 thus thus ADV
xg94hm54j30 7 5 , , PUNCT
xg94hm54j30 7 6 highly highly ADV
xg94hm54j30 7 7 desirable desirable ADJ
xg94hm54j30 7 8 to to PART
xg94hm54j30 7 9 augment augment VERB
xg94hm54j30 7 10 existing exist VERB
xg94hm54j30 7 11 algorithms algorithm NOUN
xg94hm54j30 7 12 such such ADJ
xg94hm54j30 7 13 that that SCONJ
xg94hm54j30 7 14 the the DET
xg94hm54j30 7 15 impact impact NOUN
xg94hm54j30 7 16 of of ADP
xg94hm54j30 7 17 adversarial adversarial ADJ
xg94hm54j30 7 18 attacks attack NOUN
xg94hm54j30 7 19 on on ADP
xg94hm54j30 7 20 cooperative cooperative ADJ
xg94hm54j30 7 21 networks network NOUN
xg94hm54j30 7 22 is be AUX
xg94hm54j30 7 23 eliminated eliminate VERB
xg94hm54j30 7 24 , , PUNCT
xg94hm54j30 7 25 or or CCONJ
xg94hm54j30 7 26 at at ADP
xg94hm54j30 7 27 least least ADJ
xg94hm54j30 7 28 bounded bound VERB
xg94hm54j30 7 29 . . PUNCT
xg94hm54j30 8 1 we we PRON
xg94hm54j30 8 2 introduce introduce VERB
xg94hm54j30 8 3 a a DET
xg94hm54j30 8 4 resilient resilient ADJ
xg94hm54j30 8 5 projection projection NOUN
xg94hm54j30 8 6 - - PUNCT
xg94hm54j30 8 7 based base VERB
xg94hm54j30 8 8 consensus consensus NOUN
xg94hm54j30 8 9 multi multi NOUN
xg94hm54j30 8 10 - - ADJ
xg94hm54j30 8 11 agent agent ADJ
xg94hm54j30 8 12 actor actor NOUN
xg94hm54j30 8 13 - - PUNCT
xg94hm54j30 8 14 critic critic NOUN
xg94hm54j30 8 15 algorithm algorithm NOUN
xg94hm54j30 8 16 , , PUNCT
xg94hm54j30 8 17 whereby whereby SCONJ
xg94hm54j30 8 18 each each DET
xg94hm54j30 8 19 agent agent NOUN
xg94hm54j30 8 20 receives receive VERB
xg94hm54j30 8 21 a a DET
xg94hm54j30 8 22 private private ADJ
xg94hm54j30 8 23 reward reward NOUN
xg94hm54j30 8 24 and and CCONJ
xg94hm54j30 8 25 communicates communicate NOUN
xg94hm54j30 8 26 with with ADP
xg94hm54j30 8 27 its its PRON
xg94hm54j30 8 28 neighbors neighbor NOUN
xg94hm54j30 8 29 to to PART
xg94hm54j30 8 30 estimate estimate VERB
xg94hm54j30 8 31 the the DET
xg94hm54j30 8 32 team team NOUN
xg94hm54j30 8 33 - - PUNCT
xg94hm54j30 8 34 average average ADJ
xg94hm54j30 8 35 reward reward NOUN
xg94hm54j30 8 36 and and CCONJ
xg94hm54j30 8 37 value value NOUN
xg94hm54j30 8 38 function function NOUN
xg94hm54j30 8 39 . . PUNCT
xg94hm54j30 9 1 we we PRON
xg94hm54j30 9 2 show show VERB
xg94hm54j30 9 3 that that SCONJ
xg94hm54j30 9 4 in in ADP
xg94hm54j30 9 5 the the DET
xg94hm54j30 9 6 presence presence NOUN
xg94hm54j30 9 7 of of ADP
xg94hm54j30 9 8 byzantine byzantine ADJ
xg94hm54j30 9 9 agents agent NOUN
xg94hm54j30 9 10 , , PUNCT
xg94hm54j30 9 11 whose whose DET
xg94hm54j30 9 12 estimation estimation NOUN
xg94hm54j30 9 13 and and CCONJ
xg94hm54j30 9 14 communication communication NOUN
xg94hm54j30 9 15 strategies strategy NOUN
xg94hm54j30 9 16 are be AUX
xg94hm54j30 9 17 arbitrary arbitrary ADJ
xg94hm54j30 9 18 , , PUNCT
xg94hm54j30 9 19 the the DET
xg94hm54j30 9 20 estimates estimate NOUN
xg94hm54j30 9 21 of of ADP
xg94hm54j30 9 22 the the DET
xg94hm54j30 9 23 cooperative cooperative ADJ
xg94hm54j30 9 24 agents agent NOUN
xg94hm54j30 9 25 converge converge VERB
xg94hm54j30 9 26 to to ADP
xg94hm54j30 9 27 a a DET
xg94hm54j30 9 28 bounded bound VERB
xg94hm54j30 9 29 consensus consensus NOUN
xg94hm54j30 9 30 value value NOUN
xg94hm54j30 9 31 , , PUNCT
xg94hm54j30 9 32 provided provide VERB
xg94hm54j30 9 33 that that SCONJ
xg94hm54j30 9 34 there there PRON
xg94hm54j30 9 35 are be VERB
xg94hm54j30 9 36 at at ADV
xg94hm54j30 9 37 most most ADJ
xg94hm54j30 9 38 h h NOUN
xg94hm54j30 9 39 byzantine byzantine ADJ
xg94hm54j30 9 40 agents agent NOUN
xg94hm54j30 9 41 in in ADP
xg94hm54j30 9 42 the the DET
xg94hm54j30 9 43 network network NOUN
xg94hm54j30 9 44 that that PRON
xg94hm54j30 9 45 is be AUX
xg94hm54j30 9 46 ( ( PUNCT
xg94hm54j30 9 47 2h+1)-robust 2h+1)-robust PROPN
xg94hm54j30 9 48 . . PUNCT
xg94hm54j30 10 1 furthermore furthermore ADV
xg94hm54j30 10 2 , , PUNCT
xg94hm54j30 10 3 we we PRON
xg94hm54j30 10 4 prove prove VERB
xg94hm54j30 10 5 that that SCONJ
xg94hm54j30 10 6 the the DET
xg94hm54j30 10 7 joint joint ADJ
xg94hm54j30 10 8 cooperative cooperative ADJ
xg94hm54j30 10 9 policy policy NOUN
xg94hm54j30 10 10 converges converge NOUN
xg94hm54j30 10 11 to to ADP
xg94hm54j30 10 12 a a DET
xg94hm54j30 10 13 bounded bound VERB
xg94hm54j30 10 14 neighborhood neighborhood NOUN
xg94hm54j30 10 15 around around ADP
xg94hm54j30 10 16 a a DET
xg94hm54j30 10 17 locally locally ADV
xg94hm54j30 10 18 optimal optimal ADJ
xg94hm54j30 10 19 cooperative cooperative NOUN
xg94hm54j30 10 20 policy.in policy.in X
xg94hm54j30 10 21 the the DET
xg94hm54j30 10 22 second second ADJ
xg94hm54j30 10 23 part part NOUN
xg94hm54j30 10 24 , , PUNCT
xg94hm54j30 10 25 we we PRON
xg94hm54j30 10 26 consider consider VERB
xg94hm54j30 10 27 a a DET
xg94hm54j30 10 28 fully fully ADV
xg94hm54j30 10 29 cooperative cooperative ADJ
xg94hm54j30 10 30 network network NOUN
xg94hm54j30 10 31 subject subject ADJ
xg94hm54j30 10 32 to to ADP
xg94hm54j30 10 33 communication communication NOUN
xg94hm54j30 10 34 delays delay NOUN
xg94hm54j30 10 35 and and CCONJ
xg94hm54j30 10 36 packet packet NOUN
xg94hm54j30 10 37 dropouts dropout NOUN
xg94hm54j30 10 38 . . PUNCT
xg94hm54j30 11 1 the the DET
xg94hm54j30 11 2 assumption assumption NOUN
xg94hm54j30 11 3 about about ADP
xg94hm54j30 11 4 disrupted disrupt VERB
xg94hm54j30 11 5 communication communication NOUN
xg94hm54j30 11 6 is be AUX
xg94hm54j30 11 7 reasonable reasonable ADJ
xg94hm54j30 11 8 in in ADP
xg94hm54j30 11 9 online online ADJ
xg94hm54j30 11 10 decentralized decentralized ADJ
xg94hm54j30 11 11 training training NOUN
xg94hm54j30 11 12 where where SCONJ
xg94hm54j30 11 13 agents agent NOUN
xg94hm54j30 11 14 continuously continuously ADV
xg94hm54j30 11 15 accumulate accumulate VERB
xg94hm54j30 11 16 new new ADJ
xg94hm54j30 11 17 experiences experience NOUN
xg94hm54j30 11 18 from from ADP
xg94hm54j30 11 19 the the DET
xg94hm54j30 11 20 environment environment NOUN
xg94hm54j30 11 21 and and CCONJ
xg94hm54j30 11 22 communicate communicate VERB
xg94hm54j30 11 23 periodically periodically ADV
xg94hm54j30 11 24 . . PUNCT
xg94hm54j30 12 1 we we PRON
xg94hm54j30 12 2 present present VERB
xg94hm54j30 12 3 a a DET
xg94hm54j30 12 4 multi multi ADJ
xg94hm54j30 12 5 - - ADJ
xg94hm54j30 12 6 agent agent ADJ
xg94hm54j30 12 7 actor actor NOUN
xg94hm54j30 12 8 - - PUNCT
xg94hm54j30 12 9 critic critic NOUN
xg94hm54j30 12 10 algorithm algorithm NOUN
xg94hm54j30 12 11 with with ADP
xg94hm54j30 12 12 td td PROPN
xg94hm54j30 12 13 error error NOUN
xg94hm54j30 12 14 aggregation aggregation NOUN
xg94hm54j30 12 15 , , PUNCT
xg94hm54j30 12 16 where where SCONJ
xg94hm54j30 12 17 the the DET
xg94hm54j30 12 18 aggregation aggregation NOUN
xg94hm54j30 12 19 of of ADP
xg94hm54j30 12 20 td td NOUN
xg94hm54j30 12 21 errors error NOUN
xg94hm54j30 12 22 ensures ensure VERB
xg94hm54j30 12 23 cooperation cooperation NOUN
xg94hm54j30 12 24 between between ADP
xg94hm54j30 12 25 the the DET
xg94hm54j30 12 26 agents agent NOUN
xg94hm54j30 12 27 . . PUNCT
xg94hm54j30 13 1 the the DET
xg94hm54j30 13 2 assumptions assumption NOUN
xg94hm54j30 13 3 about about ADP
xg94hm54j30 13 4 the the DET
xg94hm54j30 13 5 communication communication NOUN
xg94hm54j30 13 6 lead lead NOUN
xg94hm54j30 13 7 to to ADP
xg94hm54j30 13 8 an an DET
xg94hm54j30 13 9 increased increase VERB
xg94hm54j30 13 10 communication communication NOUN
xg94hm54j30 13 11 burden burden NOUN
xg94hm54j30 13 12 for for ADP
xg94hm54j30 13 13 every every DET
xg94hm54j30 13 14 agent agent NOUN
xg94hm54j30 13 15 as as SCONJ
xg94hm54j30 13 16 measured measure VERB
xg94hm54j30 13 17 by by ADP
xg94hm54j30 13 18 the the DET
xg94hm54j30 13 19 dimension dimension NOUN
xg94hm54j30 13 20 of of ADP
xg94hm54j30 13 21 the the DET
xg94hm54j30 13 22 transmitted transmit VERB
xg94hm54j30 13 23 data datum NOUN
xg94hm54j30 13 24 ; ; PUNCT
xg94hm54j30 13 25 nonetheless nonetheless ADV
xg94hm54j30 13 26 , , PUNCT
xg94hm54j30 13 27 the the DET
xg94hm54j30 13 28 communication communication NOUN
xg94hm54j30 13 29 burden burden NOUN
xg94hm54j30 13 30 is be AUX
xg94hm54j30 13 31 only only ADV
xg94hm54j30 13 32 quadratic quadratic ADJ
xg94hm54j30 13 33 in in ADP
xg94hm54j30 13 34 the the DET
xg94hm54j30 13 35 graph graph NOUN
xg94hm54j30 13 36 size size NOUN
xg94hm54j30 13 37 , , PUNCT
xg94hm54j30 13 38 and and CCONJ
xg94hm54j30 13 39 thus thus ADV
xg94hm54j30 13 40 the the DET
xg94hm54j30 13 41 algorithm algorithm PROPN
xg94hm54j30 13 42 is be AUX
xg94hm54j30 13 43 applicable applicable ADJ
xg94hm54j30 13 44 in in ADP
xg94hm54j30 13 45 large large ADJ
xg94hm54j30 13 46 networks network NOUN
xg94hm54j30 13 47 . . PUNCT
xg94hm54j30 14 1 we we PRON
xg94hm54j30 14 2 prove prove VERB
xg94hm54j30 14 3 analytically analytically ADV
xg94hm54j30 14 4 that that SCONJ
xg94hm54j30 14 5 the the DET
xg94hm54j30 14 6 agents agent NOUN
xg94hm54j30 14 7 approximately approximately ADV
xg94hm54j30 14 8 maximize maximize VERB
xg94hm54j30 14 9 the the DET
xg94hm54j30 14 10 team team NOUN
xg94hm54j30 14 11 - - PUNCT
xg94hm54j30 14 12 average average ADJ
xg94hm54j30 14 13 objective objective ADJ
xg94hm54j30 14 14 function function NOUN
xg94hm54j30 14 15 . . PUNCT