key: cord-0576868-qwow1x1c authors: Collaboration, The LIGO Scientific; Collaboration, the Virgo; Abbott, the KAGRA Collaboration R.; Abe, H.; Acernese, F.; Ackley, K.; Adhikari, N.; Adhikari, R. X.; Adkins, V. K.; Adya, V. B.; Affeldt, C.; Agarwal, D.; Agathos, M.; Agatsuma, K.; Aggarwal, N.; Aguiar, O. D.; Aiello, L.; Ain, A.; Ajith, P.; Akutsu, T.; Alarc'on, P. F. de; Albanesi, S.; Alfaidi, R. A.; Allocca, A.; Altin, P. A.; Amato, A.; Anand, C.; Anand, S.; Ananyeva, A.; Anderson, S. B.; Anderson, W. G.; Ando, M.; Andrade, T.; Andres, N.; Andr'es-Carcasona, M.; Andri'c, T.; Angelova, S. V.; Ansoldi, S.; Antelis, J. M.; Antier, S.; Apostolatos, T.; Appavuravther, E. Z.; Appert, S.; Apple, S. K.; Arai, K.; Araya, A.; Araya, M. C.; Areeda, J. S.; Arene, M.; Aritomi, N.; Arnaud, N.; Arogeti, M.; Aronson, S. M.; Arun, K. G.; Asada, H.; Asali, Y.; Ashton, G.; Aso, Y.; Assiduo, M.; Melo, S. Assis de Souza; Aston, S. M.; Astone, P.; Aubin, F.; AultONeal, K.; Austin, C.; Babak, S.; Badaracco, F.; Bader, M. K. M.; Badger, C.; Bae, S.; Bae, Y.; Baer, A. M.; Bagnasco, S.; Bai, Y.; Baird, J.; Bajpai, R.; Baka, T.; Ball, M.; Ballardin, G.; Ballmer, S. W.; Balsamo, A.; Baltus, G.; Banagiri, S.; Banerjee, B.; Bankar, D.; Barayoga, J. C.; Barbieri, C.; Barish, B. C.; Barker, D.; Barneo, P.; Barone, F.; Barr, B.; Barsotti, L.; Barsuglia, M.; Barta, D.; Bartlett, J.; Barton, M. A.; Bartos, I.; Basak, S.; Bassiri, R.; Basti, A.; Bawaj, M.; Bayley, J. C.; Bazzan, M.; Becher, B. R.; B'ecsy, B.; Bedakihale, V. M.; Beirnaert, F.; Bejger, M.; Belahcene, I.; Benedetto, V.; Beniwal, D.; Benjamin, M. G.; Bennett, T. F.; Bentley, J. D.; BenYaala, M.; Bera, S.; Berbel, M.; Bergamin, F.; Berger, B. K.; Bernuzzi, S.; Berry, C. P. L.; Bersanetti, D.; Bertolini, A.; Betzwieser, J.; Beveridge, D.; Bhandare, R.; Bhandari, A. V.; Bhardwaj, U.; Bhatt, R.; Bhattacharjee, D.; Bhaumik, S.; Bianchi, A.; Bilenko, I. A.; Billingsley, G.; Bini, S.; Birney, R.; Birnholtz, O.; Biscans, S.; Bischi, M.; Biscoveanu, S.; Bisht, A.; Biswas, B.; Bitossi, M.; Bizouard, M.-A.; Blackburn, J. K.; Blair, C. D.; Blair, D. G.; Blair, R. M.; Bobba, F.; Bode, N.; Boer, M.; Bogaert, G.; Boldrini, M.; Bolingbroke, G. N.; Bonavena, L. D.; Bondu, F.; Bonilla, E.; Bonnand, R.; Booker, P.; Boom, B. A.; Bork, R.; Boschi, V.; Bose, N.; Bose, S.; Bossilkov, V.; Boudart, V.; Bouffanais, Y.; Bozzi, A.; Bradaschia, C.; Brady, P. R.; Bramley, A.; Branch, A.; Branchesi, M.; Brau, J. E.; Breschi, M.; Briant, T.; Briggs, J. H.; Brillet, A.; Brinkmann, M.; Brockill, P.; Brooks, A. F.; Brooks, J.; Brown, D. D.; Brunett, S.; Bruno, G.; Bruntz, R.; Bryant, J.; Bucci, F.; Bulik, T.; Bulten, H. J.; Buonanno, A.; Burtnyk, K.; Buscicchio, R.; Buskulic, D.; Buy, C.; Byer, R. L.; Davies, G. S. Cabourn; Cabras, G.; Cabrita, R.; Cadonati, L.; Caesar, M.; Cagnoli, G.; Cahillane, C.; Bustillo, J. Calder'on; Callaghan, J. D.; Callister, T. A.; Calloni, E.; Cameron, J.; Camp, J. B.; Canepa, M.; Canevarolo, S.; Cannavacciuolo, M.; Cannon, K. C.; Cao, H.; Cao, Z.; Capocasa, E.; Capote, E.; Carapella, G.; Carbognani, F.; Carlassara, M.; Carlin, J. B.; Carney, M. F.; Carpinelli, M.; Carrillo, G.; Carullo, G.; Carver, T. L.; Diaz, J. Casanueva; Casentini, C.; Castaldi, G.; Caudill, S.; Cavaglia, M.; Cavalier, F.; Cavalieri, R.; Cella, G.; Cerd'a-Dur'an, P.; Cesarini, E.; Chaibi, W.; Subrahmanya, S. Chalathadka; Champion, E.; Chan, C.-H.; Chan, C.; Chan, C. L.; Chan, K.; Chan, M.; Chandra, K.; Chang, I. P.; Chanial, P.; Chao, S.; Chapman-Bird, C.; Charlton, P.; Chase, E. A.; Chassande-Mottin, E.; Chatterjee, C.; Chatterjee, Debarati; Chatterjee, Deep; Chaturvedi, M.; Chaty, S.; Chatziioannou, K.; Chen, C.; Chen, D.; Chen, H. Y.; Chen, J.; Chen, K.; Chen, X.; Chen, Y.-B.; Chen, Y.-R.; Chen, Z.; Cheng, H.; Cheong, C. K.; Cheung, H. Y.; Chia, H. Y.; Chiadini, F.; Chiang, C-Y.; Chiarini, G.; Chierici, R.; Chincarini, A.; Chiofalo, M. L.; Chiummo, A.; Choudhary, R. K.; Choudhary, S.; Christensen, N.; Chu, Q.; Chu, Y-K.; Chua, S. S. Y.; Chung, K. W.; Ciani, G.; Ciecielag, P.; Cie'slar, M.; Cifaldi, M.; Ciobanu, A. A.; Ciolfi, R.; Cipriano, F.; Clara, F.; Clark, J. A.; Clearwater, P.; Clesse, S.; Cleva, F.; Coccia, E.; Codazzo, E.; Cohadon, P.-F.; Cohen, D. E.; Colleoni, M.; Collette, C. G.; Colombo, A.; Colpi, M.; Compton, C. M.; Constancio, M.; Conti, L.; Cooper, S. J.; Corban, P.; Corbitt, T. R.; Cordero-Carri'on, I.; Corezzi, S.; Corley, K. R.; Cornish, N. J.; Corre, D.; Corsi, A.; Cortese, S.; Costa, C. A.; Cotesta, R.; Cottingham, R.; Coughlin, M. W.; Coulon, J.-P.; Countryman, S. T.; Cousins, B.; Couvares, P.; Coward, D. M.; Cowart, M. J.; Coyne, D. C.; Coyne, R.; Creighton, J. D. E.; Creighton, T. D.; Criswell, A. W.; Croquette, M.; Crowder, S. G.; Cudell, J. R.; Cullen, T. J.; Cumming, A.; Cummings, R.; Cunningham, L.; Cuoco, E.; Curylo, M.; Dabadie, P.; Canton, T. Dal; Dall'Osso, S.; D'alya, G.; Dana, A.; D'Angelo, B.; Danilishin, S.; D'Antonio, S.; Danzmann, K.; Darsow-Fromm, C.; Dasgupta, A.; Datrier, L. E. H.; Datta, Sayak; Datta, Sayantani; Dattilo, V.; Dave, I.; Davier, M.; Davis, D.; Davis, M. C.; Daw, E. J.; Dean, R.; DeBra, D.; Deenadayalan, M.; Degallaix, J.; Laurentis, M. De; Del'eglise, S.; Favero, V. Del; Lillo, F. De; Lillo, N. De; Dell'Aquila, D.; Pozzo, W. Del; DeMarchi, L. M.; Matteis, F. De; D'Emilio, V.; Demos, N.; Dent, T.; Depasse, A.; Pietri, R. De; Rosa, R. De; Rossi, C. De; DeSalvo, R.; Simone, R. De; Dhurandhar, S.; D'iaz, M. C.; Didio, N. A.; Dietrich, T.; Fiore, L. Di; Fronzo, C. Di; Giorgio, C. Di; Giovanni, F. Di; Giovanni, M. Di; Girolamo, T. Di; Lieto, A. Di; Michele, A. Di; Ding, B.; Pace, S. Di; Palma, I. Di; Renzo, F. Di; Divakarla, A. K.; Divyajyoti,; Dmitriev, A.; Doctor, Z.; Donahue, L.; D'Onofrio, L.; Donovan, F.; Dooley, K. L.; Doravari, S.; Drago, M.; Driggers, J. C.; Drori, Y.; Ducoin, J.-G.; Dupej, P.; Dupletsa, U.; Durante, O.; D'Urso, D.; Duverne, P.-A.; Dwyer, S. E.; Eassa, C.; Easter, P. J.; Ebersold, M.; Eckhardt, T.; Eddolls, G.; Edelman, B.; Edo, T. B.; Edy, O.; Effler, A.; Eguchi, S.; Eichholz, J.; Eikenberry, S. S.; Eisenmann, M.; Eisenstein, R. A.; Ejlli, A.; Engelby, E.; Enomoto, Y.; Errico, L.; Essick, R. C.; Estell'es, H.; Estevez, D.; Etienne, Z.; Etzel, T.; Evans, M.; Evans, T. M.; Evstafyeva, T.; Ewing, B. E.; Fabrizi, F.; Faedi, F.; Fafone, V.; Fair, H.; Fairhurst, S.; Fan, P. C.; Farah, A. M.; Farinon, S.; Farr, B.; Farr, W. M.; Fauchon-Jones, E. J.; Favaro, G.; Favata, M.; Fays, M.; Fazio, M.; Feicht, J.; Fejer, M. M.; Fenyvesi, E.; Ferguson, D. L.; Fernandez-Galiana, A.; Ferrante, I.; Ferreira, T. A.; Fidecaro, F.; Figura, P.; Fiori, A.; Fiori, I.; Fishbach, M.; Fisher, R. P.; Fittipaldi, R.; Fiumara, V.; Flaminio, R.; Floden, E.; Fong, H. K.; Font, J. A.; Fornal, B.; Forsyth, P. W. F.; Franke, A.; Frasca, S.; Frasconi, F.; Freed, J. P.; Frei, Z.; Freise, A.; Freitas, O.; Frey, R.; Fritschel, P.; Frolov, V. V.; Fronz'e, G. G.; Fujii, Y.; Fujikawa, Y.; Fujimoto, Y.; Fulda, P.; Fyffe, M.; Gabbard, H. A.; Gabella, W. E.; Gadre, B. U.; Gair, J. R.; Gais, J.; Galaudage, S.; Gamba, R.; Ganapathy, D.; Ganguly, A.; Gao, D.; Gaonkar, S. G.; Garaventa, B.; N'unez, C. Garc'ia; Garc'ia-Quir'os, C.; Garufi, F.; Gateley, B.; Gayathri, V.; Ge, G.-G.; Gemme, G.; Gennai, A.; George, J.; Gerberding, O.; Gergely, L.; Gewecke, P.; Ghonge, S.; Ghosh, Abhirup; Ghosh, Archisman; Ghosh, Shaon; Ghosh, Shrobana; Ghosh, Tathagata; Giacomazzo, B.; Giacoppo, L.; Giaime, J. A.; Giardina, K. D.; Gibson, D. R.; Gier, C.; Giesler, M.; Giri, P.; Gissi, F.; Gkaitatzis, S.; Glanzer, J.; Gleckl, A. E.; Godwin, P.; Goetz, E.; Goetz, R.; Gohlke, N.; Golomb, J.; Goncharov, B.; Gonz'alez, G.; Gosselin, M.; Gouaty, R.; Gould, D. W.; Goyal, S.; Grace, B.; Grado, A.; Graham, V.; Granata, M.; Granata, V.; Grant, A.; Gras, S.; Grassia, P.; Gray, C.; Gray, R.; Greco, G.; Green, A. C.; Green, R.; Gretarsson, A. M.; Gretarsson, E. M.; Griffith, D.; Griffiths, W. L.; Griggs, H. L.; Grignani, G.; Grimaldi, A.; Grimes, E.; Grimm, S. J.; Grote, H.; Grunewald, S.; Gruning, P.; Gruson, A. S.; Guerra, D.; Guidi, G. M.; Guimaraes, A. R.; Guix'e, G.; Gulati, H. K.; Gunny, A. M.; Guo, H.-K.; Guo, Y.; Gupta, Anchal; Gupta, Anuradha; Gupta, I. M.; Gupta, P.; Gupta, S. K.; Gustafson, R.; Guzman, F.; Ha, S.; Hadiputrawan, I. P. W.; Haegel, L.; Haino, S.; Halim, O.; Hall, E. D.; Hamilton, E. Z.; Hammond, G.; Han, W.-B.; Haney, M.; Hanks, J.; Hanna, C.; Hannam, M. D.; Hannuksela, O.; Hansen, H.; Hansen, T. J.; Hanson, J.; Harder, T.; Haris, K.; Harms, J.; Harry, G. M.; Harry, I. W.; Hartwig, D.; Hasegawa, K.; Haskell, B.; Haster, C.-J.; Hathaway, J. S.; Hattori, K.; Haughian, K.; Hayakawa, H.; Hayama, K.; Hayes, F. J.; Healy, J.; Heidmann, A.; Heidt, A.; Heintze, M. C.; Heinze, J.; Heinzel, J.; Heitmann, H.; Hellman, F.; Hello, P.; Helmling-Cornell, A. F.; Hemming, G.; Hendry, M.; Heng, I. S.; Hennes, E.; Hennig, J.; Hennig, M. H.; Henshaw, C.; Hernandez, A. G.; Vivanco, F. Hernandez; Heurs, M.; Hewitt, A. L.; Higginbotham, S.; Hild, S.; Hill, P.; Himemoto, Y.; Hines, A. S.; Hirata, N.; Hirose, C.; Ho, T-C.; Hochheim, S.; Hofman, D.; Hohmann, J. N.; Holcomb, D. G.; Holland, N. A.; Hollows, I. J.; Holmes, Z. J.; Holt, K.; Holz, D. E.; Hong, Q.; Hough, J.; Hourihane, S.; Howell, E. J.; Hoy, C. G.; Hoyland, D.; Hreibi, A.; Hsieh, B-H.; Hsieh, H-F.; Hsiung, C.; Hsu, Y.; Huang, H-Y.; Huang, P.; Huang, Y-C.; Huang, Y.-J.; Huang, Yiting; Huang, Yiwen; Hubner, M. T.; Huddart, A. D.; Hughey, B.; Hui, D. C. Y.; Hui, V.; Husa, S.; Huttner, S. H.; Huxford, R.; Huynh-Dinh, T.; Ide, S.; Idzkowski, B.; Iess, A.; Inayoshi, K.; Inoue, Y.; Iosif, P.; Isi, M.; Isleif, K.; Ito, K.; Itoh, Y.; Iyer, B. R.; JaberianHamedan, V.; Jacqmin, T.; Jacquet, P.-E.; Jadhav, S. J.; Jadhav, S. P.; Jain, T.; James, A. L.; Jan, A. Z.; Jani, K.; Janquart, J.; Janssens, K.; Janthalur, N. N.; Jaranowski, P.; Jariwala, D.; Jaume, R.; Jenkins, A. C.; Jenner, K.; Jeon, C.; Jia, W.; Jiang, J.; Jin, H.-B.; Johns, G. R.; Johnson-McDaniel, N. K.; Johnston, R.; Jones, A. W.; Jones, D. I.; Jones, P.; Jones, R.; Joshi, P.; Ju, L.; Jue, A.; Jung, P.; Jung, K.; Junker, J.; Juste, V.; Kaihotsu, K.; Kajita, T.; Kakizaki, M.; Kalaghatgi, C. V.; Kalogera, V.; Kamai, B.; Kamiizumi, M.; Kanda, N.; Kandhasamy, S.; Kang, G.; Kanner, J. B.; Kao, Y.; Kapadia, S. J.; Kapasi, D. P.; Karathanasis, C.; Karki, S.; Kashyap, R.; Kasprzack, M.; Kastaun, W.; Kato, T.; Katsanevas, S.; Katsavounidis, E.; Katzman, W.; Kaur, T.; Kawabe, K.; Kawaguchi, K.; K'ef'elian, F.; Keitel, D.; Key, J. S.; Khadka, S.; Khalili, F. Y.; Khan, S.; Khanam, T.; Khazanov, E. A.; Khetan, N.; Khursheed, M.; Kijbunchoo, N.; Kim, A.; Kim, C.; Kim, J. C.; Kim, J.; Kim, K.; Kim, W. S.; Kim, Y.-M.; Kimball, C.; Kimura, N.; Kinley-Hanlon, M.; Kirchhoff, R.; Kissel, J. S.; Klimenko, S.; Klinger, T.; Knee, A. M.; Knowles, T. D.; Knust, N.; Knyazev, E.; Kobayashi, Y.; Koch, P.; Koekoek, G.; Kohri, K.; Kokeyama, K.; Koley, S.; Kolitsidou, P.; Kolstein, M.; Komori, K.; Kondrashov, V.; Kong, A. K. H.; Kontos, A.; Koper, N.; Korobko, M.; Kovalam, M.; Koyama, N.; Kozak, D. B.; Kozakai, C.; Kringel, V.; Krishnendu, N. V.; Kr'olak, A.; Kuehn, G.; Kuei, F.; Kuijer, P.; Kulkarni, S.; Kumar, A.; Kumar, Prayush; Kumar, Rahul; Kumar, Rakesh; Kume, J.; Kuns, K.; Kuromiya, Y.; Kuroyanagi, S.; Kwak, K.; Lacaille, G.; Lagabbe, P.; Laghi, D.; Lalande, E.; Lalleman, M.; Lam, T. L.; Lamberts, A.; Landry, M.; Lane, B. B.; Lang, R. N.; Lange, J.; Lantz, B.; Rosa, I. La; Lartaux-Vollard, A.; Lasky, P. D.; Laxen, M.; Lazzarini, A.; Lazzaro, C.; Leaci, P.; Leavey, S.; LeBohec, S.; Lecoeuche, Y. K.; Lee, E.; Lee, H. M.; Lee, H. W.; Lee, K.; Lee, R.; Legred, I. N.; Lehmann, J.; Lemaitre, A.; Lenti, M.; Leonardi, M.; Leonova, E.; Leroy, N.; Letendre, N.; Levesque, C.; Levin, Y.; Leviton, J. N.; Leyde, K.; Li, A. K. Y.; Li, B.; Li, J.; Li, K. L.; Li, P.; Li, T. G. F.; Li, X.; Lin, C-Y.; Lin, E. T.; Lin, F-K.; Lin, F-L.; Lin, H. L.; Lin, L. C.-C.; Linde, F.; Linker, S. D.; Linley, J. N.; Littenberg, T. B.; Liu, G. C.; Liu, J.; Liu, K.; Liu, X.; Llamas, F.; Lo, R. K. L.; Lo, T.; London, L. T.; Longo, A.; Lopez, D.; Portilla, M. Lopez; Lorenzini, M.; Loriette, V.; Lormand, M.; Losurdo, G.; Lott, T. P.; Lough, J. D.; Lousto, C. O.; Lovelace, G.; Lucaccioni, J. F.; Luck, H.; Lumaca, D.; Lundgren, A. P.; Luo, L.-W.; Lynam, J. E.; Ma'arif, M.; Macas, R.; Machtinger, J. B.; MacInnis, M.; Macleod, D. M.; MacMillan, I. A. O.; Macquet, A.; Hernandez, I. Magana; Magazzu, C.; Magee, R. M.; Maggiore, R.; Magnozzi, M.; Mahesh, S.; Majorana, E.; Maksimovic, I.; Maliakal, S.; Malik, A.; Man, N.; Mandic, V.; Mangano, V.; Mansell, G. L.; Manske, M.; Mantovani, M.; Mapelli, M.; Marchesoni, F.; Pina, D. Mar'in; Marion, F.; Mark, Z.; M'arka, S.; M'arka, Z.; Markakis, C.; Markosyan, A. S.; Markowitz, A.; Maros, E.; Marquina, A.; Marsat, S.; Martelli, F.; Martin, I. W.; Martin, R. M.; Martinez, M.; Martinez, V. A.; Martinez, V.; Martinovic, K.; Martynov, D. V.; Marx, E. J.; Masalehdan, H.; Mason, K.; Massera, E.; Masserot, A.; Masso-Reid, M.; Mastrogiovanni, S.; Matas, A.; Mateu-Lucena, M.; Matichard, F.; Matiushechkina, M.; Mavalvala, N.; McCann, J. J.; McCarthy, R.; McClelland, D. E.; McClincy, P. K.; McCormick, S.; McCuller, L.; McGhee, G. I.; McGuire, S. C.; McIsaac, C.; McIver, J.; McRae, T.; McWilliams, S. T.; Meacher, D.; Mehmet, M.; Mehta, A. K.; Meijer, Q.; Melatos, A.; Melchor, D. A.; Mendell, G.; Menendez-Vazquez, A.; Menoni, C. S.; Mercer, R. A.; Mereni, L.; Merfeld, K.; Merilh, E. L.; Merritt, J. D.; Merzougui, M.; Meshkov, S.; Messenger, C.; Messick, C.; Meyers, P. M.; Meylahn, F.; Mhaske, A.; Miani, A.; Miao, H.; Michaloliakos, I.; Michel, C.; Michimura, Y.; Middleton, H.; Mihaylov, D. P.; Milano, L.; Miller, A. L.; Miller, A.; Miller, B.; Millhouse, M.; Mills, J. C.; Milotti, E.; Minenkov, Y.; Mio, N.; Mir, Ll. M.; Miravet-Ten'es, M.; Mishkin, A.; Mishra, C.; Mishra, T.; Mistry, T.; Mitra, S.; Mitrofanov, V. P.; Mitselmakher, G.; Mittleman, R.; Miyakawa, O.; Miyo, K.; Miyoki, S.; Mo, Geoffrey; Modafferi, L. M.; Moguel, E.; Mogushi, K.; Mohapatra, S. R. P.; Mohite, S. R.; Molina, I.; Molina-Ruiz, M.; Mondin, M.; Montani, M.; Moore, C. J.; Moragues, J.; Moraru, D.; Morawski, F.; More, A.; Moreno, C.; Moreno, G.; Mori, Y.; Morisaki, S.; Morisue, N.; Moriwaki, Y.; Mours, B.; Mow-Lowry, C. M.; Mozzon, S.; Muciaccia, F.; Mukherjee, Arunava; Mukherjee, D.; Mukherjee, Soma; Mukherjee, Subroto; Mukherjee, Suvodip; Mukund, N.; Mullavey, A.; Munch, J.; Muniz, E. A.; Murray, P. G.; Musenich, R.; Muusse, S.; Nadji, S. L.; Nagano, K.; Nagar, A.; Nakamura, K.; Nakano, H.; Nakano, M.; Nakayama, Y.; Napolano, V.; Nardecchia, I.; Narikawa, T.; Narola, H.; Naticchioni, L.; Nayak, B.; Nayak, R. K.; Neil, B. F.; Neilson, J.; Nelson, A.; Nelson, T. J. N.; Nery, M.; Neubauer, P.; Neunzert, A.; Ng, K. Y.; Ng, S. W. S.; Nguyen, C.; Nguyen, P.; Nguyen, T.; Quynh, L. Nguyen; Ni, J.; Ni, W.-T.; Nichols, S. A.; Nishimoto, T.; Nishizawa, A.; Nissanke, S.; Nitoglia, E.; Nocera, F.; Norman, M.; North, C.; Nozaki, S.; Nurbek, G.; Nuttall, L. K.; Obayashi, Y.; Oberling, J.; O'Brien, B. D.; O'Dell, J.; Oelker, E.; Ogaki, W.; Oganesyan, G.; Oh, J. J.; Oh, K.; Oh, S. H.; Ohashi, M.; Ohashi, T.; Ohkawa, M.; Ohme, F.; Ohta, H.; Okada, M. A.; Okutani, Y.; Olivetto, C.; Oohara, K.; Oram, R.; O'Reilly, B.; Ormiston, R. G.; Ormsby, N. D.; O'Shaughnessy, R.; O'Shea, E.; Oshino, S.; Ossokine, S.; Osthelder, C.; Otabe, S.; Ottaway, D. J.; Overmier, H.; Pace, A. E.; Pagano, G.; Pagano, R.; Page, M. A.; Pagliaroli, G.; Pai, A.; Pai, S. A.; Pal, S.; Palamos, J. R.; Palashov, O.; Palomba, C.; Pan, H.; Pan, K.-C.; Panda, P. K.; Pang, P. T. H.; Pankow, C.; Pannarale, F.; Pant, B. C.; Panther, F. H.; Paoletti, F.; Paoli, A.; Paolone, A.; Pappas, G.; Parisi, A.; Park, H.; Park, J.; Parker, W.; Pascucci, D.; Pasqualetti, A.; Passaquieti, R.; Passuello, D.; Patel, M.; Pathak, M.; Patricelli, B.; Patron, A. S.; Paul, S.; Payne, E.; Pedraza, M.; Pedurand, R.; Pegoraro, M.; Pele, A.; Arellano, F. E. Pena; Penano, S.; Penn, S.; Perego, A.; Pereira, A.; Pereira, T.; Perez, C. J.; P'erigois, C.; Perkins, C. C.; Perreca, A.; Perries, S.; Pesios, D.; Petermann, J.; Petterson, D.; Pfeiffer, H. P.; Pham, H.; Pham, K. A.; Phukon, K. S.; Phurailatpam, H.; Piccinni, O. J.; Pichot, M.; Piendibene, M.; Piergiovanni, F.; Pierini, L.; Pierro, V.; Pillant, G.; Pillas, M.; Pilo, F.; Pinard, L.; Pineda-Bosque, C.; Pinto, I. M.; Pinto, M.; Piotrzkowski, B. J.; Piotrzkowski, K.; Pirello, M.; Pitkin, M. D.; Placidi, A.; Placidi, E.; Planas, M. L.; Plastino, W.; Pluchar, C.; Poggiani, R.; Polini, E.; Pong, D. Y. T.; Ponrathnam, S.; Porter, E. K.; Poulton, R.; Poverman, A.; Powell, J.; Pracchia, M.; Pradier, T.; Prajapati, A. K.; Prasai, K.; Prasanna, R.; Pratten, G.; Principe, M.; Prodi, G. A.; Prokhorov, L.; Prosposito, P.; Prudenzi, L.; Puecher, A.; Punturo, M.; Puosi, F.; Puppo, P.; Purrer, M.; Qi, H.; Quartey, N.; Quetschke, V.; Quinonez, P. J.; Quitzow-James, R.; Qutob, N.; Raab, F. J.; Raaijmakers, G.; Radkins, H.; Radulesco, N.; Raffai, P.; Rail, S. X.; Raja, S.; Rajan, C.; Ramirez, K. E.; Ramirez, T. D.; Ramos-Buades, A.; Rana, J.; Rapagnani, P.; Ray, A.; Raymond, V.; Raza, N.; Razzano, M.; Read, J.; Rees, L. A.; Regimbau, T.; Rei, L.; Reid, S.; Reid, S. W.; Reitze, D. H.; Relton, P.; Renzini, A.; Rettegno, P.; Revenu, B.; Reza, A.; Rezac, M.; Ricci, F.; Richards, D.; Richardson, J. W.; Richardson, L.; Riemenschneider, G.; Riles, K.; Rinaldi, S.; Rink, K.; Robertson, N. A.; Robie, R.; Robinet, F.; Rocchi, A.; Rodriguez, S.; Rolland, L.; Rollins, J. G.; Romanelli, M.; Romano, R.; Romel, C. L.; Romero, A.; Romero-Shaw, I. M.; Romie, J. H.; Ronchini, S.; Rosa, L.; Rose, C. A.; Rosi'nska, D.; Ross, M. P.; Rowan, S.; Rowlinson, S. J.; Roy, S.; Roy, Santosh; Roy, Soumen; Rozza, D.; Ruggi, P.; Ruiz-Rocha, K.; Ryan, K.; Sachdev, S.; Sadecki, T.; Sadiq, J.; Saha, S.; Saito, Y.; Sakai, K.; Sakellariadou, M.; Sakon, S.; Salafia, O. S.; Salces-Carcoba, F.; Salconi, L.; Saleem, M.; Salemi, F.; Samajdar, A.; Sanchez, E. J.; Sanchez, J. H.; Sanchez, L. E.; Sanchis-Gual, N.; Sanders, J. R.; Sanuy, A.; Saravanan, T. R.; Sarin, N.; Sassolas, B.; Satari, H.; Sathyaprakash, B. S.; Sauter, O.; Savage, R. L.; Savant, V.; Sawada, T.; Sawant, H. L.; Sayah, S.; Schaetzl, D.; Scheel, M.; Scheuer, J.; Schiworski, M. G.; Schmidt, P.; Schmidt, S.; Schnabel, R.; Schneewind, M.; Schofield, R. M. S.; Schonbeck, A.; Schulte, B. W.; Schutz, B. F.; Schwartz, E.; Scott, J.; Scott, S. M.; Seglar-Arroyo, M.; Sekiguchi, Y.; Sellers, D.; Sengupta, A. S.; Sentenac, D.; Seo, E. G.; Sequino, V.; Sergeev, A.; Setyawati, Y.; Shaffer, T.; Shahriar, M. S.; Shaikh, M. A.; Shams, B.; Shao, L.; Sharma, A.; Sharma, P.; Shawhan, P.; Shcheblanov, N. S.; Sheela, A.; Shikano, Y.; Shikauchi, M.; Shimizu, H.; Shimode, K.; Shinkai, H.; Shishido, T.; Shoda, A.; Shoemaker, D. H.; Shoemaker, D. M.; ShyamSundar, S.; Sieniawska, M.; Sigg, D.; Silenzi, L.; Singer, L. P.; Singh, D.; Singh, M. K.; Singh, N.; Singha, A.; Sintes, A. M.; Sipala, V.; Skliris, V.; Slagmolen, B. J. J.; Slaven-Blair, T. J.; Smetana, J.; Smith, J. R.; Smith, L.; Smith, R. J. E.; Soldateschi, J.; Somala, S. N.; Somiya, K.; Song, I.; Soni, K.; Soni, S.; Sordini, V.; Sorrentino, F.; Sorrentino, N.; Soulard, R.; Souradeep, T.; Sowell, E.; Spagnuolo, V.; Spencer, A. P.; Spera, M.; Spinicelli, P.; Srivastava, A. K.; Srivastava, V.; Staats, K.; Stachie, C.; Stachurski, F.; Steer, D. A.; Steinhoff, J.; Steinlechner, J.; Steinlechner, S.; Stergioulas, N.; Stops, D. J.; Stover, M.; Strain, K. A.; Strang, L. C.; Stratta, G.; Strong, M. D.; Strunk, A.; Sturani, R.; Stuver, A. L.; Suchenek, M.; Sudhagar, S.; Sudhir, V.; Sugimoto, R.; Suh, H. G.; Sullivan, A. G.; Sullivan, J. M.; Summerscales, T. Z.; Sun, L.; Sunil, S.; Sur, A.; Suresh, J.; Sutton, P. J.; Suzuki, Takamasa; Suzuki, Takanori; Suzuki, Toshikazu; Swinkels, B. L.; Szczepa'nczyk, M. J.; Szewczyk, P.; Tacca, M.; Tagoshi, H.; Tait, S. C.; Takahashi, H.; Takahashi, R.; Takano, S.; Takeda, H.; Takeda, M.; Talbot, C. J.; Talbot, C.; Tanaka, K.; Tanaka, Taiki; Tanaka, Takahiro; Tanasijczuk, A. J.; Tanioka, S.; Tanner, D. B.; Tao, D.; Tao, L.; Tapia, R. D.; Mart'in, E. N. Tapia San; Taranto, C.; Taruya, A.; Tasson, J. D.; Tenorio, R.; Terhune, J. E. S.; Terkowski, L.; Thirugnanasambandam, M. P.; Thomas, M.; Thomas, P.; Thompson, E. E.; Thompson, J. E.; Thondapu, S. R.; Thorne, K. A.; Thrane, E.; Tiwari, Shubhanshu; Tiwari, Srishti; Tiwari, V.; Toivonen, A. M.; Tolley, A. E.; Tomaru, T.; Tomura, T.; Tonelli, M.; Tornasi, Z.; Torres-Forn'e, A.; Torrie, C. I.; Melo, I. Tosta e; Toyra, D.; Trapananti, A.; Travasso, F.; Traylor, G.; Trevor, M.; Tringali, M. C.; Tripathee, A.; Troiano, L.; Trovato, A.; Trozzo, L.; Trudeau, R. J.; Tsai, D.; Tsang, K. W.; Tsang, T.; Tsao, J-S.; Tse, M.; Tso, R.; Tsuchida, S.; Tsukada, L.; Tsuna, D.; Tsutsui, T.; Turbang, K.; Turconi, M.; Tuyenbayev, D.; Ubhi, A. S.; Uchikata, N.; Uchiyama, T.; Udall, R. P.; Ueda, A.; Uehara, T.; Ueno, K.; Ueshima, G.; Unnikrishnan, C. S.; Urban, A. L.; Ushiba, T.; Utina, A.; Vajente, G.; Vajpeyi, A.; Valdes, G.; Valentini, M.; Valsan, V.; Bakel, N. van; Beuzekom, M. van; Dael, M. van; Brand, J. F. J. van den; Broeck, C. Van Den; Vander-Hyde, D. C.; Haevermaet, H. van; Heijningen, J. V. van; Putten, M. H. P. M. van; Remortel, N. van; Vardaro, M.; Vargas, A. F.; Varma, V.; Vas'uth, M.; Vecchio, A.; Vedovato, G.; Veitch, J.; Veitch, P. J.; Venneberg, J.; Venugopalan, G.; Verkindt, D.; Verma, P.; Verma, Y.; Vermeulen, S. M.; Veske, D.; Vetrano, F.; Vicer'e, A.; Vidyant, S.; Viets, A. D.; Vijaykumar, A.; Villa-Ortega, V.; Vinet, J.-Y.; Virtuoso, A.; Vitale, S.; Vocca, H.; Reis, E. R. G. von; Wrangel, J. S. A. von; Vorvick, C.; Vyatchanin, S. P.; Wade, L. E.; Wade, M.; Wagner, K. J.; Wald, R.; Walet, R. C.; Walker, M.; Wallace, G. S.; Wallace, L.; Wang, J.; Wang, J. Z.; Wang, W. H.; Ward, R. L.; Warner, J.; Was, M.; Washimi, T.; Washington, N. Y.; Watchi, J.; Weaver, B.; Weaving, C. R.; Webster, S. A.; Weinert, M.; Weinstein, A. J.; Weiss, R.; Weller, C. M.; Weller, R. A.; Wellmann, F.; Wen, L.; Wessels, P.; Wette, K.; Whelan, J. T.; White, D. D.; Whiting, B. F.; Whittle, C.; Wilken, D.; Williams, D.; Williams, M. J.; Williamson, A. R.; Willis, J. L.; Willke, B.; Wilson, D. J.; Wipf, C. C.; Wlodarczyk, T.; Woan, G.; Woehler, J.; Wofford, J. K.; Wong, D.; Wong, I. C. F.; Wright, M.; Wu, C.; Wu, D. S.; Wu, H.; Wysocki, D. M.; Xiao, L.; Yamada, T.; Yamamoto, H.; Yamamoto, K.; Yamamoto, T.; Yamashita, K.; Yamazaki, R.; Yang, F. W.; Yang, K. Z.; Yang, L.; Yang, Y.-C.; Yang, Y.; Yang, Yang; Yap, M. J.; Yeeles, D. W.; Yeh, S.-W.; Yelikar, A. B.; Ying, M.; Yokoyama, J.; Yokozawa, T.; Yoo, J.; Yoshioka, T.; Yu, Hang; Yu, Haocun; Yuzurihara, H.; Zadro.zny, A.; Zanolin, M.; Zeidler, S.; Zelenova, T.; Zendri, J.-P.; Zevin, M.; Zhan, M.; Zhang, H.; Zhang, J.; Zhang, L.; Zhang, R.; Zhang, T.; Zhang, Y.; Zhao, C.; Zhao, G.; Zhao, Y.; Zhao, Yue; Zhou, R.; Zhou, Z.; Zhu, X. J.; Zhu, Z.-H.; Zimmerman, A. B.; Zucker, M. E.; Zweizig, J. title: Tests of General Relativity with GWTC-3 date: 2021-12-13 journal: nan DOI: nan sha: 219d444e7a3b9b887b973d144c0ed44b10c3bc19 doc_id: 576868 cord_uid: qwow1x1c The ever-increasing number of detections of gravitational waves (GWs) from compact binaries by the Advanced LIGO and Advanced Virgo detectors allows us to perform ever-more sensitive tests of general relativity (GR) in the dynamical and strong-field regime of gravity. We perform a suite of tests of GR using the compact binary signals observed during the second half of the third observing run of those detectors. We restrict our analysis to the 15 confident signals that have false alarm rates $leq 10^{-3}, {rm yr}^{-1}$. In addition to signals consistent with binary black hole (BH) mergers, the new events include GW200115_042309, a signal consistent with a neutron star--BH merger. We find the residual power, after subtracting the best fit waveform from the data for each event, to be consistent with the detector noise. Additionally, we find all the post-Newtonian deformation coefficients to be consistent with the predictions from GR, with an improvement by a factor of ~2 in the -1PN parameter. We also find that the spin-induced quadrupole moments of the binary BH constituents are consistent with those of Kerr BHs in GR. We find no evidence for dispersion of GWs, non-GR modes of polarization, or post-merger echoes in the events that were analyzed. We update the bound on the mass of the graviton, at 90% credibility, to $m_g leq 1.27 times 10^{-23} mathrm{eV}/c^2$. The final mass and final spin as inferred from the pre-merger and post-merger parts of the waveform are consistent with each other. The studies of the properties of the remnant BHs, including deviations of the quasi-normal mode frequencies and damping times, show consistency with the predictions of GR. In addition to considering signals individually, we also combine results from the catalog of GW signals to calculate more precise population constraints. We find no evidence in support of physics beyond GR. The ever-increasing number of detections of gravitational waves from compact binaries by the Advanced LIGO and Advanced Virgo detectors allows us to perform ever-more sensitive tests of general relativity (GR) in the dynamical and strong-field regime of gravity. We perform a suite of tests of GR using the compact binary signals observed during the second half of the third observing run of those detectors. We restrict our analysis to the 15 confident signals that have false alarm rates ≤ 10 −3 yr −1 . In addition to signals consistent with binary black hole mergers, the new events include GW200115 042309, a signal consistent with a neutron star-black hole merger. We find the residual power, after subtracting the best fit waveform from the data for each event, to be consistent with the detector noise. Additionally, we find all the post-Newtonian deformation coefficients to be consistent with the predictions from GR, with an improvement by a factor of ∼ 2 in the −1PN parameter. We also find that the spin-induced quadrupole moments of the binary black hole constituents are consistent with those of Kerr black holes in GR. We find no evidence for dispersion of gravitational waves, non-GR modes of polarization, or post-merger echoes in the events that were analyzed. We update the bound on the mass of the graviton, at 90% credibility, to m g ≤ 1.27 × 10 −23 eV/c 2 . The final mass and final spin as inferred from the pre-merger and post-merger parts of the waveform are consistent with each other. The studies of the properties of the remnant black holes, including deviations of the quasi-normal mode frequencies and damping times, show consistency with the predictions of GR. In addition to considering signals individually, we also combine results from the catalog of gravitational waves signals to calculate more precise population constraints. We find no evidence in support of physics beyond general relativity. The first three observing runs of Advanced LIGO [1] and Advanced Virgo [2] have led to detections of signals consistent with coming from the three canonical classes of compact binary systems: binary black holes (BBH) [3] , binary neutron stars (BNS) [4] , and neutron star-black holes (NSBH) [5] . 1 These observations had, in particular, a profound impact on fundamental physics as they allowed us to probe the properties of gravity in the highly nonlinear and dynamical regime. These detections subjected Einstein's general relativity (GR), a Deceased, August 2020. b Deceased, April 2021. modeling using analytical (see for instance [34, 35] for reviews) and numerical relativity techniques within GR, e.g. [36] [37] [38] [39] , such null tests of GR have been successful so far. However, the null tests are also sensitive to the physics within GR that is not accounted for in the waveform model that is used (such as eccentricity, presence of exotic compact objects, etc) in addition to beyond-GR physics. For example, some of the tests performed here, which involve the dynamics of the merger remnant, may be interpreted as tests of GR or as a test of the black hole nature of the remnant. However, for uniformity, they will be referred to as tests of GR in this paper. The tests are not completely independent of each other and vary in sensitivity to different types of deviations [40] , and by considering them together we obtain the best overview of how well predictions and observations agree. Theoretically, gravitational waves in a modified theory of gravity may differ from GR broadly in three different ways: generation, propagation, and polarization. Gravitational-wave generation relates the outgoing radiation to the properties of the source, a hard problem even in GR. Propagation of gravitational waves in a modified theory of gravity can differ from that in GR via effects such as dispersion [41] , birefringence [42] , and amplitude damping [43, 44] . An effective-field theoretic approach to the problem can be found in [45] [46] [47] [48] . Tests based on propagation effects target those modified theories which predict generation of gravitational waves to be very close to that of GR, but differ in the way the waves propogate. This can happen in theories like massive graviton theories where the generation effects are suppressed by powers of r/λ g 1 where λ g is the Compton wavelength of the graviton and r the size the binary [49] . A general metric theory of gravity can allow up to six modes of polarization: two tensor, two vector, and two scalar modes [50, 51] . In GR, one only has the two tensor modes referred to as plus and cross. Hence, searching for non-GR modes of polarization is also an effective method to search for violation of GR. Nevertheless, there are alternative theories which predict modified GW generation or propagation but do not predict additional non-GR modes of polarization, e.g. [52, 53] . Hence, confidently detecting signatures of one or more of these three effects would strongly suggest a possible GR violation. While these theoretical insights get reflected in our analysis strategies while searching for possible departures from GR, the organization of this paper instead classifies the tests of GR as consistency tests and parameterized tests. Consistency tests, as the name indicates, search for possible violations of GR by asking how consistent the observed signal is with that of GR and do not invoke any parametrization of the deviation from GR. Consistency tests may be tests of self-consistency of the signal or overall consistency of the signal with the data. We consider one of each type of test: The inspiral-merger-ringdown (IMR) consistency test checks for consistency between the low-and high-frequency parts of the signal, while the residuals test subtracts the best-fit GR waveform from the data and asks whether there is any statistically significant residual power. Parameterized tests, on the other hand, invoke a specific parametrization that is appropriate for searching for possible deviations from GR in terms of certain physical effects. For example, the parameterized tests of post-Newtonian (PN) coefficients are sensitive to the physical effects that appear at different PN orders. Similarly, by parameterizing the dispersion relation, one can look for possible imprints of non-GR propagation effects in the gravitational waveforms. A variant of the null-stream based method allows us to probe non-GR modes of gravitational-wave polarization. The detected coalescences of massive compact objects may involve not only black holes of classical GR, but also different compact objects described by exotic physics, commonly referred to as exotic compact objects (ECOs). They include objects like firewalls [54] , fuzzballs [55] , gravastars [56] , boson stars [57] , AdS black bubbles [58] and dark matter stars [59] . What these objects have in common is the absence of a horizon, causing ingoing gravitational waves (e.g., resulting from merger) to reflect multiple times off effective radial potential barriers, with wave packets leaking out to infinity, potentially, at regular times; these are called echoes [60] [61] [62] . We also perform a series of tests that search for possible GR violation or non-Kerr nature of the merger remnant, specifically in the postinspiral part of the waveform. Ringdown tests [63] [64] [65] [66] probe the consistency of the post-merger dynamics with the predictions for Kerr black holes in GR, while searches for echoes constrain the presence of repeating ringdown signals [60, 61, [67] [68] [69] [70] expected in certain classes of exotic compact objects (ECOs). Tests of GR performed on the data of the previous observing runs have set increasingly stringent limits [10, 11] . The bounds on the parameterized post-Newtonian deformation coefficients have been used to constrain the parameter space of alternatives to GR [71, 72] relying on several assumptions about the underlying GR violation (critiqued in [40, 73] ). The joint detection of gravitational waves and the gamma rays from the binary neutron star merger GW170817 has placed extremely stringent bounds on the speed of gravitational waves [74] , which in turn has had a profound impact on constraining certain classes of modified theories from a cosmological standpoint [75] [76] [77] [78] . This joint detection also has been used to constrain the number of space-time dimensions [9, 79] . Results on the measurement of the properties of the merger remnant from ringdown, bounds on spin-induced quadrupole moment of compact binary constituents and the outcome of the search for echoes have implications for models of black hole mimickers [80] . This paper analyzes events reported during the second half of the third observing run of LIGO and Virgo (O3b) [81] , extending our previous analysis [11] which reported the status of the bounds up to and including the first half of the third observing run (O3a). This paper also provides joint bounds combining the events that occurred during the first three observing runs whenever possible, in addition to deriving limits on possible departures from GR for individual events. The analysis methods used here are largely similar to those used in our O3a analysis [11] , and there are only significant differences for the polarization-based test of GR and the search for post-merger echoes. The polarization test was updated to probe mixed polarization content using a framework based upon the null stream. [82] . The morphology-dependent search TABLE I. Summary of methods and results. This table summarizes the names of the tests performed, the corresponding sections, the parameters involved in the test, and the improvement with regard to our previous analysis. The analyses performed are: RT = residuals test; IMR = inspiral-merger-ringdown consistency test; PAR = parametrized tests of GW generation; SIM = spin-induced moments; MDR = modified GW dispersion relation; POL = polarization content; RD = ringdown; ECH = echoes searches. The last column provides the approximate improvement in the bounds over the previous analyses reported in [11] . This is defined as X GWTC−2 /X GWTC−3 , where X denotes the width of the 90% credible interval for the parameters for each test, using the combined results on all events considered. For the MDR test, some of the bounds have worsened in comparison to GWTC-2. See the corresponding section for details. Note that the high improvement factor for pSEOB is due to the larger number of events from GWTC-2 analysed here compared to [11] . [84, 85] . Besides these changes, the waveform models employed in most analyses have been upgraded to more accurate and complete ones, accounting for more physics, the details of which are discussed in Sec. III. Table I summarizes the tests that are performed, the quantities that are used for the test, the fractional changes with regard to the previous analyses, and the section where details about each test can be found. The paper is organized as follows. Section II discusses the data used in this paper while Sec. III describes the method of extracting astrophysical information about events from the data. Section IV discusses two tests of consistency with GR: examining the residuals left in the data after subtracting the best-fit GR waveform (Sec. IV A) and looking at consistency of the inspiral and postinspiral portions of the waveform with GR (Sec. IV B). Section V discusses two tests of gravitational wave generation, the parameterized tests of GR (Sec. V A) and the test for BBH nature using the spin-induced quadrupole moment (Sec.V B). Section VI discusses tests of gravitational wave propagation looking for non-GR dispersion of gravitational waves. Section VII reports results from the searches for non-GR polarization. Section VIII discusses various tests using the merger remnants, specifically two analyses of ringdown (Sec. VIII A) and a search for the signatures of echoes (Sec. VIII B). Section IX discusses the conclusions. The global network of gravitational wave detectors completed their third observing run in March 2020. O3b adds 35 candidate events with probability of being of astrophysical origin better than 0.5, including the first confident observations of neutron star-black hole systems [5, 81] . The analyses presented here are focused on events from O3b, though the joint bounds that are reported also include events from previous observing runs. Following our O3a analysis [11] , we consider only those events with false alarm rates lower than 10 −3 per year that were confidently observed in two or more detectors as determined by any search pipeline used in the catalog of O3b events [81] . Of the 14 binary black hole mergers and the one neutron star-black hole merger (GW200115 042309) that pass this threshold, nine events are observed with three detectors and six are observed with only two detectors. The median total masses in the detector frame of these analysed events range from ∼ 8-140 M . The LIGO interferometers maintained sensitivities comparable to that in O3a [86] , and the Virgo interferometer achieved a ∼ 20% improvement during O3b. As with previous results, noise subtraction methods [83, [87] [88] [89] were applied to selected events in order to improve parameter estimation. The third gravitational wave transient catalog (GWTC-3) [81] includes details on instrument performance. Three events analysed here were identified in as requiring additional data quality mitigation. One, GW191109 010717 had data quality issues in both LIGO detectors, while the other two (GW200115 042309 and GW200129 065458) were each only affected by noise transients in one interferometer. Details on the noise transient removal techniques can be found in GWTC-3's Table XIV . Appendix A below discusses how these data quality issues can affect analyses in this article. Table II shows selected source properties of events from the O3b observing run that are included in this paper. Similar tables in O2 [10] and O3a [11] analyses provide selected source properties for the other events included in the analyses presented here. Every test detailed in the following sections has its own sub-selection criteria of events from this table based on the physics it explores; these criteria are detailed in the respective sections for each analysis. Detection significance is given by four search pipelines, three of which rely on GR-based templates (PyCBC [90] [91] [92] , MBTA [93] , and GstLAL [94, 95] ) and one that does not (coherent WaveBurst; cWB [96] [97] [98] ). Details on these pipelines, including the two different PyCBC configurations used, is included in Appendix D of GWTC-3 [81] . While making significance determinations using searches based on GR could potentially lead to selection biases disfavoring events that deviate significantly from GR, the use of the minimally modeled search cWB helps alleviate this concern: cWB is sensitive to at least some of the potential chirp-like signals that deviate enough from GR that they would not be detected with high significance by the other three pipelines. However, we are unable to fully exclude the possibility of a hidden population of signals that show significant departures from GR thus evading detection by all of our pipelines. Of the O3b events which satisfy our thresholds, only one (GW200225 060421) was detected with more significance by the cWB pipeline, though the difference in detection significance between the various pipelines for this event is negligible. Many of the tests performed here build upon two BBH waveform families. Consistency tests, as well as tests of gravitational-wave generation and propagation are based on precessing phenomenological (Phenom) models. Phenomenological models are currently constructed following the twistingup technique [100] [101] [102] , where the signal in a co-precessing frame is approximated by an aligned-spin model and is then transformed to the detector's inertial frame by means of frequency (or time) dependent transfer functions. Previous works made use of the frequency-domain PhenomP family, which includes PhenomPv2 and PhenomPv3HM [103] , where version numbers distinguish between different PN-based approximations of the precession dynamics, with v2 (v3) adopting a single (double) spin prescription. The suffix "HM" is appended to indicate that the model includes some higherorder multipole moments. We will employ these models only for spin-induced quadrupole moment tests and otherwise upgrade to the most recent frequency-domain PhenomX family [104, 105] , which includes IMRPhenomXP and IMRPhe-nomXPHM [106] . While the precession prescriptions adopted by PhenomPv3HM and IMRPhenomXPHM are similar, the aligned-spin baseline of PhenomX, including the higher-order multipoles (2, 1), (3, 3) , (3, 2) and (4, 4) , is calibrated to a larger set of numerical simulations with respect to its predecessors. For parameterized tests of PN theory (Sec. V A), as well as for the ringdown test of Sec. VIII A 2, we also employ the spinning effective one-body family (SEOB) [107] [108] [109] [110] . In the EOB framework, the two-body dynamics is mapped onto that of an effective body moving in a deformed black-hole spacetime, the deformation being the symmetric mass ratio. The dynamics and gravitational radiation are obtained solving numerically the Hamilton equations, which are built from the EOB Hamiltonian and radiation-reaction force. The latter combines information from PN theory, gravitational self-force, perturbation theory, and NR simulations [34, [111] [112] [113] [114] . In this work, we will employ approximants belonging to the latest generation of SEOB models, SEOBNRv4 [115] , specifically the aligned-spin model SEOBNRv4HM [116] and the frequency-domain model SEOBNRv4 ROM, a reduced-order-model version of SEOB-NRv4, which allows a significant speed up with respect to the original time-domain version [115, 117] . On top of the leading (2, 2) multipole, the SEOBNRv4HM model also includes the spherical harmonic multipoles (2, 1), (3, 3) , (4, 4) , (5, 5) . For most tests, our choice of waveform model is dictated by the need for computational efficiency. However, for some tests, this is dictated by technical and physical considerations. For instance, we use time-domain waveform models for the ringdown tests, since such tests are most conveniently formulated in the time domain [118] [119] [120] . Most template-based tests presented here rely on the assumption that the signal was generated in a quasi-circular BBH coalescence. It has been shown that signatures of eccentricity may lead to biases in the estimated source parameters [121] [122] [123] [124] . The issue could especially affect the analysis of short-duration signals such as GW190521, where the complementarity between inspiral and merger-ringdown cannot be efficiently leveraged to break degeneracies in current templates [125] [126] [127] [128] [129] . Comprehensive BBH waveform models including precession, higher-order multipole moments and eccentricity are not yet available, and a thorough assessment of the impact of waveform systematics is beyond the scope of this analysis. Work in this direction is crucial, as inaccuracies in waveform templates could lead to detecting false violations of GR [130] . As for higher-order multipole moments, their relative contribution can be estimated by computing their SNR distribution on a reference set of posterior samples [5, [131] [132] [133] . Based on the IMRPhenomXPHM parameter estimation runs presented in GWTC-3 [81] , we find that the SNR distributions for individual higher-order multipoles is consistent with Gaussian noise for all the events reported in Table II , though the sum of several subdominant harmonics does contribute a non-negligible amount of SNR for GW191109 010717. Therefore, we choose to neglect them in the most computationally expensive tests, restricting their inclusion to the residuals, IMR consistency, and ringdown tests. Our event list includes the NSBH candidate GW200115 042309. We do not consider matter effects in its analysis and expect them to be negligible: tidal imprints are suppressed by the relatively extreme mass ratio of the system and a higher SNR would be needed for them to be confidently detected [5, 134, 135] . We perform Bayesian parameter estimation using either the python based package bilby [136, 137] or the software library LALInference [138] , which belongs to the LIGO Scientific Collaboration Algorithm Library Suite [139] . Runs performed with bilby are automated through the package bilby pipe [137] and rely on static nested sampling [140] as available through the package dynesty [141] , while LALInference runs use either nested or Markov-chain Monte Carlo sampling [142] . One of the ringdown tests presented here makes use of pyRing [143] , which relies on cpnest [144] . Power spectral densities are generated through the software BAYESWAVE [83, 145] and match those used in GWTC-3 TABLE II. List of O3b events considered in this paper. The first block of columns gives the names of the events and lists the instruments (LIGO Hanford, LIGO Livingston, Virgo) involved in each detection, as well as some relevant properties obtained assuming GR: luminosity distance D L , redshifted total mass (1 + z)M, redshifted chirp mass (1 + z)M, redshifted final mass (1 + z)M f , dimensionless final spin χ f = c| S f |/(GM 2 f ), and network signal-to-noise ratio SNR. Reported quantities correspond to the median and 90% symmetric credible intervals, as computed in Table IV in GWTC-3 [81] . The final mass and final spin quantities are inferred from analysis of the entire signal and are for the remnant long after the coalescence and ringdown are complete, as described in [99] . The last block of columns indicates which analyses are performed on a given event according to the selection criteria in Sec. II: RT = residuals test (Sec. IV A); IMR = inspiral-merger-ringdown consistency test (Sec. IV B); PAR = parametrized tests of GW generation (Sec. V A); SIM = spin-induced moments (Sec. V B); MDR = modified GW dispersion relation (Sec. VI); POL = polarization content (Sec. VII); RD = ringdown (Sec. VIII A); ECH = echoes searches (Sec. VIII B). Inst. Properties SNR Tests performed [81] , unless otherwise stated. When combining results obtained from multiple events, we employ two methods. The first method relies on the multiplication of the individual likelihoods corresponding to the deformation parameters that are inferred from the data. This method assumes that the deformation parameters take the same value across events [146] . This is a restrictive assumption for all the tests we consider except for the modified dispersion test. In order to address this, wherever possible, we combine the information from the tests for different events hierarchically using a model that does not make this restrictive assumption and hence provides bounds which are qualitatively more robust [147] . We quantify the agreement of our results with GR using several statistical indicators. In Section IV B we present GR quantiles Q 2D GR for joint distributions, which denote the fraction of the likelihood contained within the isoprobability contour passing through the GR value, with a smaller GR quantile indicating a better agreement with GR. In Sections V A and VI, we report instead quantiles on one-dimensional distributions Q GR . When error bars are reported, they denote 90% confidence intervals and, likewise, we show 90% credible regions (intervals) when presenting joint (individual) posterior distributions. Measuring the remnant coherent power in the network data after the subtraction of the best-fit GR template can be used to quantify consistency of GR waveform model with the data. The random noise in different detectors can be taken to be incoherent. The presence of consistent noise in the network after removing the GW signal from the data indicates an inconsistency between the signal present in the data and the GR template used. The residual analysis is designed to detect such discrepancies of the data with GR [6, 10, 148, 149] . A residual data set is obtained by subtracting the waveform corresponding to maximum likelihood parameters from individual detector data with a window size of one second around the trigger time. The window size of one second is used due to the relatively short length of the signal. Then the residual SNR, or SNR 90 , is computed as the 90% credible upper limit on the remnant coherent network SNR in the residual data using the BAYESWAVE pipeline [83, 145, 150] . BAYESWAVE uses a template-independent model to characterize any excess power in the residual compared to the detector noise. We follow the method used in the previous analyses performed in GWTC-1 [10] and GWTC-2 [11] . However, we use the new phenomenological waveform model PhenomX-PHM [104, 105] as the GR waveform model. For each gravitational wave event, in addition to calculating SNR 90 , additional BAYESWAVE runs are done on two hundred randomly selected time segments on a time window of 4096s symmetric around the event time. This allows us to calculate p-values of residual SNRs for individual events, which is equal to the probability of obtaining a background value of SNR 90 higher than that of the event. We perform the analysis on all the events listed in Table II . The results from the residual analysis are summarized in Table III . For each event, we have presented the SNR of the best-fit waveform SNR GR , SNR 90 , fitting factor FF 90 = SNR GR /(SNR 2 90 + SNR 2 GR ) 1/2 , and p-values calculated from the background analysis. To analyze the trends between SNR 90 and SNR GR , in Fig. 1 we present the scatter of SNR 90 and SNR GR . The absence of correlation between SNR 90 and SNR GR in the figure indicates that data is consistent with GR templates and the values of SNR 90 depend purely on the noise levels in the detectors at the detection of individual events. GW191222 033537 shows the highest pvalue = 1.0 with SNR 90 = 4.87 and FF 90 = 0.93. Even though, GW200219 094415 has the lowest fitting factor FF 90 = 0.74 with SNR 90 = 10.23, its p-value = 0.1 is slightly above the lowest p-value = 0.05 which corresponds to the event GW200225 060421. If the left-over coherent network SNR were purely from detector noise, we should expect the SNR 90 p-values to be uniformly distributed within [0, 1]. To demonstrate the consistency of the observed p-values with the noise (null) hypothesis, in Fig. 2 , we present a probability-probability (PP) plot of the p-values 2 . To produce the PP plot, we have considered all the events in GWTC-3 that pass the FAR threshold. The measurement of p-values is subjected to uncertainty due to the finite size of background runs. If N is the total number of background trials around an event, and if n of them produce SNR 90 greater than that of the event, then the likelihood of the estimated p-valuep = n/N is a binomial function, where p is the true p-value [11] . Assuming uniform prior, we can obtain posterior distribution of p-value as a Beta distribution, In Fig. 2 , the light-blue band around the PP curve represents the 90% uncertainty region of the p-value posteriors. The diagonal dashed line denotes the prior hypothesis with surrounding light-gray band representing 90% uncertainty region of the null hypothesis due to the finite number of events [151, 152] . The PP plot is well with in the 90% credible region of the null hypothesis indicating no significant deviation in the residual data from the expected incoherent noise distribution in the individual instruments. The IMR consistency test checks the consistency of the mass and spin of the remnant BH inferred from the low-and highfrequency parts of the signal. To achieve this, we divide the GW signal into two parts in the frequency domain at the cutoff frequency f IMR c which is the dominant mode GW frequency [153, 154] . The mass and spin of the remnant BH are estimated by applying NR-calibrated fits [99, [155] [156] [157] [158] to the median values of the redshifted component masses, dimensionless spins, and spin angles obtained using the full IMR signal and the waveform model IMRPhenomXPHM. The lowand high-frequency regimes roughly correspond to the inspiral and postinspiral, respectively, of the dominant mode of the waveform. To make sure that the two regimes of the signal have enough information, we calculate the SNR of the inspiral and the postinspiral parts of the waveform for each event using their maximum a posteriori parameter values obtained from the full IMR signal. We analyze only those signals which have SNRs greater than 6 in both the inspiral and the postinspiral parts. This constraint was also used in previous studies [10, 11] . We also impose an extra mass constraint (1 + z)M < 100 M as in our previous analysis of GWTC-2 events [11] to ensure enough inspiral signal for heavier BBHs. The SNRs for the inspiral and the postinspiral regimes of the events analyzed are given in Table IV. We independently estimate the posterior distributions of the mass M f and the dimensionless spin χ f of the remnant BH from both the inspiral and the postinspiral parts of the signal. To constrain possible deviations from GR, two fractional deviation andM Table IV Table IV where the color encodes the median redshifted total mass. denotes the cutoff frequency between the inspiral and postinspiral regimes; ρ IMR , ρ insp , and ρ postinsp are the SNR in the full signal, the inspiral part, and the postinspiral part respectively; and the GR quantile Q 2D GR denotes the fraction of the reweighted posterior enclosed by the isoprobability contour that passes through the GR value, with smaller values indicating better consistency with GR. The results are given only for O3b events which satisfy the selection criteria. See Appendix B for the updated results on GWTC-2 events. viation parameters should peak around (0, 0) when the test is applied to a signal from a quasi-circular BBH coalescence in GR, given that we use a waveform model for such signals to analyze the data. The parameter estimation runs employed the IMRPhenomX-PHM waveform with uniform priors on the redshifted com- Probability density ponent masses and spins. These priors translate into nontrivial priors on ∆M f /M f and ∆χ f /χ f . Thus, as in the previous analysis [11] , we reweight the posteriors to obtain uniform priors on the deviation parameters. We provide our results in Fig. 3 , where we show the 90% credible regions of the twodimensional posteriors on the fractional deviation parameters for the O3b events which satisfy our selection criteria. The reweighted posteriors on the fractional deviation parameters ∆M f /M f and ∆χ f /χ f of individual events are interpolated on a grid with bounds [−2, 2] for both the parameters, and the interpolated posteriors are then multiplied to obtain the combined posteriors. Here we assume the same deviation for all events to obtain the combined results. As shown in gray in Fig. 3 , the combined posteriors on the fractional deviation parameters of GWTC-3 events are consistent with the GR prediction with ∆M f /M f = −0.02 +0.07 −0.06 and ∆χ f /χ f = −0.06 +0.10 −0.07 . The two-dimensional GR quantile values Q 2D GR for the events are given in Table IV . Q 2D GR is defined as the fraction of the posterior enclosed by the isoprobabilty contour that passes through (0, 0), the GR value. Smaller values indicate better consistency with GR. The GR quantile of the combined distribution is 79.6% which is similar to the value obtained for GWTC-2 (78.7%). Among the O3b events, GW200225 060421 has the lowest Q 2D GR value of 1.3% and GW200224 222234 has the highest value of 20.7%. We can also combine the results hierarchically, as discussed in Sec. III B of our previous analysis [11] . Fig. 4 presents the results where the fractional mass (blue) and spin (red) deviation parameters for events from multiple observing runs are plotted with ∆M f /M f = 0.03 +0.14 −0.13 and ∆χ f /χ f = −0.05 +0.37 −0.38 , which are consistent with the expected values in GR. Treating ∆M f /M f and ∆χ f /χ f independently, we find that the Gaussian model parameters are constrained to (µ, σ) = (0.04 +0.08 −0.07 , 0.05 +0.10 −0.04 ) and (−0.04 +0. 12 −0.12 , 0.19 +0.17 −0.13 ) for ∆M f /M f and ∆χ f /χ f respectively, with 90% credibility. These bounds are not significantly different from the ones reported in GWTC-2 [11] , except for that of σ for ∆χ f /χ f . It peaks significantly away from zero as shown in Fig. 5 due to GW190814 whose updated posteriors (see Appendix B for more details related to the updated GWTC-2 results) show marked deviation from GR. We also show the posteriors excluding GW190814 which peak at σ = 0. Deviations from GR, such as additional fields or highercurvature corrections, may alter the binary's binding energy and angular momentum, and its energy and angular momentum flux [15, 16, 19, 20, 22, 23, [159] [160] [161] . This in turn would result in modifications to the binary motion and, hence, to the GW signal emitted by the system. A practical approach to quantifying such effects entails introducing a finite number of parameters that encapsulate possible deviations of a waveform from its GR prediction. We will focus here on parametrizations of the frequency-domain GW phase evolution since observations are in general most sensitive to it (as opposed to changes in the amplitude). Small modifications to the GW phase could accumulate for events with many detectable GW cycles and thus parametrized tests initially focussed on the inspiral part of the waveform, whose duration in the detector band grows for low mass binaries. The inspiral can be treated perturbatively within the post-Newtonian framework [34, [162] [163] [164] [165] [166] [167] [168] [169] [170] [171] [172] , which expands observables in powers of v/c, with each O([v/c] 2n ) being referred to as of nPN order. With the intrinsic parameters of the bi-nary given, the coefficients at different orders of v/c in the PN series are uniquely determined, and so is the perturbative expansion of the early-inspiral phasing within GR. Treating such PN coefficients as measurable parameters of the waveform is therefore a sensible consistency test of GR [173] [174] [175] [176] [177] [178] [179] [180] . While these parameterized waveforms could capture a wide variety of beyond-GR effects, the abrupt onset of waveform modifications, possible when nonperturbative phenomena such as dynamical scalarization are at play, may not be fully captured by them [181, 182] . However, in the spirit of null tests, these differences may still appear as apparent violations of GR. This approach can be applied by directly modifying coefficients in a specific waveform model that encodes PN information [32] or by adding corrections that correspond to deformations of a given inspiral PN coefficient at low frequencies and tapering the corrections to zero at a specific cutoff frequency [9] . Corrections are applied in both cases at the level of the aligned-spin phasing; however, the first method can be leveraged to perform parametrized tests with precessing phenomenological templates, as these automatically inherit non-GR corrections introduced in the aligned-spin phase by virtue of the twisting-up construction [10, 11] . Here we present results obtained with the second method, which we apply to the frequency domain model SEOB-NRv4 ROM [115, 117] , a reduced-order model of the timedomain aligned-spin approximant SEOBNRv4. We do not include results obtained with the first method, for which an upgrade to the precessing IMRPhenomXP model is under development. Due to time constraints, these results will be presented elsewhere. Past analyses [10, 11] showed good consistency between the two approaches, despite the differences in the waveform models being used and the physics content included. Dedicated studies would be needed to thoroughly assess the effect of waveform systematics on parametrized tests across parameter space and quantify the effects of specific approximations, such as the omission of precession and higher-order multipole moments. Fractional deviations are applied to the full phase as corrections scaling with f (−5+n)/3 at each n/2-th PN order. Following previous works [10, 11] , we reweight the posteriors to reparametrize the results as fractional deviations applied to the nonspinning terms of a 3.5PN TaylorF2 phase [183] , which is obtained by applying the stationary phase approximation [184] to time-domain post-Newtonian waveforms: Here,f = GM(1 + z) f /c 3 , with M(1 + z) being the redshifted total mass of the binary, ϕ c , t c are the coalescence phase and time, and η the symmetric mass ratio. This parametrization has the advantage of avoiding potentially singular behavior of the deviation coefficients, which might occur as a result of cancellations between the nonspinning and spin-dependent phasing coefficients. Phasing corrections to the inspiral phase are tapered off at the same cutoff frequency used in previous analyses, i.e., f PAR c = 0.35 f 22 peak , where f 22 peak is the GW frequency of the (2,2)-mode at the peak of the amplitude as defined in the model SEOBNRv4. We introduce the following parametric deviations to GW inspiral phasing: where each δφ i represents the fractional deviation from the GR PN coefficient at the i/2-th PN order, following the parametrization adopted in previous analyses [7, 9, 10, 185, 186] . The subscript l is used to denote coefficients of logarithmic-inf terms. We do not present bounds for the 2.5PN non-logarithmic term, as this is degenerate with the coalescence phase, as can be seen from Eq. (4). As predicted in GR, the coefficients corresponding to −1PN and 0.5PN are identically zero, so we parametrize δφ −2 and δφ 1 as absolute deviations, with a prefactor equal to the 0PN coefficient (3/128η); all other coefficients represent fractional deviations from the GR value. As detailed in Sec. I, we consider all binaries that meet the significance threshold of FAR < 10 −3 yr −1 and impose the additional requirement that SNR ≥ 6 in the inspiral regime, as defined with respect to f PAR c . Eligible events are summarized in Table V . The parametrization presented above recovers GR in the limit δφ i → 0, thus consistency with GR can be claimed if 0 is included within a given confidence interval and, in what follows, we will report 90% credible intervals for the posteriors of δφ i . We adopt uniform priors on δφ i that are symmetric about zero and compute their posterior distributions using LALInference. As in previous analyses, we only allow the coefficients δφ i to vary one at a time. It has been shown that this procedure is effective at picking up the deviations from GR that modify more than one PN coefficient [187] [188] [189] . Allowing multiple deviations in the same template is bound to produce less informative posteriors, due to correlations among different parameters and also with GR coefficients. Such correlations can be effectively reduced with alternative choices of the deformation parameters, which can be guided by a principal component analysis [190, 191] , thus increasing the effectiveness of multiparameter tests. It was also shown that synergies between third-generation detectors and space-based interferometry might dramatically improve the precision of multiparameter tests [73] . In Fig. 6 , we present the 90% upper bounds on the deviation coefficients obtained from the combined distribution of events from GWTC-3, under the assumption that deviations take the same value for all the events. While the combined bounds are fully consistent with GR, we do note that for a number of events δφ i = 0 falls outside the 90% credible interval of the deviation coefficients distributions. We find that the addition of an extra degree of freedom can enhance the weight of secondary modes observed in GR parameter estimation runs, as can be seen for GW191216 213338 and GW200316 215756, where a secondary mode in mass ratio present in the GR parameter estimation runs dominates the results when inspiral deviation parameters are included. This is partly due to the singular behavior of the parametrization employed here, and indeed the application of reweighting alleviates the problem. Still, the presence of a secondary mode remains and so do the broad posteriors for the deviation coefficients. Appendix A contains a more extended discussion on the impact of noise properties and waveform systematics on our bounds. We find that there is no uniform improvement over the GWTC-2 results [11] across all deviation coefficients. This is consistent with the modest improvement factor due to the increased number of events, which can be estimated to be ∼ 1.2. We find that this factor is comparable to fluctuations in the final bounds determined by individual events. The most striking difference with respect to the GWTC-2 analysis is the new constraint obtained for δφ −2 , for which we obtain an upper bound |δφ −2 | ≤ 7.3 × 10 −4 at the 90% credible level. This result improves upon the GWTC-2 bound by a factor ∼ 2. The improvement is driven by the inclusion of GW200115 042309, due to its long duration. The presence of a non-zero −1PN term can be associated to the emission of dipolar radiation, which is forbidden in GR but can be excited in alternative theories of gravity and is related to energy and momentum being transferred from the binary to additional fields. This result is less stringent than the one obtained with combined measurements of binary pulsars [192] or with the observation of GW170817 [9] . We also computed hierarchically combined posteriors for all the deviation coefficients introduced above, where we allow the deviation coefficients to take independent values for each event. These are shown in Fig. 7 , plotted against the combined posteriors employed to obtain the upper bounds of Fig. 6 . All results are consistent with the GR prediction with at least 90% credibility. In Fig. 8 , we show the joint distribution of the mean µ and standard deviation σ of the population-marginalized posterior for the deviation coefficients of Eq. (5). All distributions are consistent with the GR prediction for which µ = σ = 0, with deviations occurring at −1PN being the most tightly constrained. In Table VI , we report the medians, 90% credible intervals, and GR quantiles Q GR = P(δφ i < 0) for both the hierarchical and the joint-likelihood approaches. Values of Q GR significantly different from 50% indicate that the null hypothesis falls in the tails of the combined distribution. In the hierarchical analysis the most constrained parameter is δφ −2 = −0.05 +0.99 −1.25 × 10 −3 at the 90% credible level, while deviations in the 3.5PN order coefficient are the least constrained, with δφ 7 = 0.14 +1.05 −1. 16 . For the majority of the PN coefficients, the hierarchical analysis of GWTC-3 data obtains tighter constraints than the ones obtained with GWTC-2 events [11] . Results for the shifts to inspiral phase can also be mapped onto constraints on specific theories, in particular via the parametrized post-Einstein (ppE) framework [71, 177] . For instance, bounds on the coupling constant of Einstein-dilaton-Gauss-Bonnet and dynamical Chern-Simons gravity were obtained using GWTC-2 events [72, 193, 194] . However, the upper bounds reported here depend on the parametrization being used and on specific details of the analysis, such as the frequency at which non-GR corrections are being tapered off. The priors we impose on the deviation coefficients are not designed to suit any specific theory and, depending on the theory that is being considered, different sampling parameters and prior bounds might be preferable to the ones adopted here. Furthermore, in alternative theories of gravity multiple inspiral coefficients will be subject to deviations from GR; our bounds refer to single-coefficient deviations, that might capture at once deviations at several PN orders, so the mapping would be ambiguous. Likewise, one would also need a robust estimate of the error caused by neglecting currently unknown higher-order PN corrections and deviations in the merger-ringdown phase, which will also differ from the GR one [195, 196] . Finally, it is not clear whether some of these theories, such as Lovelock, Chern-Simons or Horndeski gravity, of which Einsteindilaton-Gauss-Bonnet gravity is a special case, would admit at all a well-posed initial value problem in their most general form [197] , although well-posed formulations are possible in the weak-coupling limit [198] [199] [200] . Spinning objects have quadrupole and higher contributions to the multipole decomposition of their gravitational field due to their rotational deformations. Following the no-hair conjecture, the spin-induced multipole moments take unique values for black holes given their mass and spin [201] [202] [203] . Gravitational waveforms describing spinning compact binary systems encode information about these spin-induced multipole moment effects. The leading order term can be schematically represented as, Here Q is the quadrupole moment scalar and is the leading order term in the gravitational wave phase at 2 PN order. m and χ are the mass and the dimensionless spin of the compact object. Along with this leading-order effect, we have included higherorder PN terms that appear through the inspiral phase [167, 204] of gravitational waveform. While Kerr BHs have κ = 1 [201] [202] [203] , compact stars have a value of κ that differs from the BH value, determined by the star's mass and internal composition. Numerical simulations of spinning neutron stars show that the value of κ can vary between ∼2 and ∼14 for these systems [205] [206] [207] . Moreover, for currently available models of spinning boson stars, κ can have values ∼10-150 [208] [209] [210] [211] . More exotic stars like gravastars can even take negative values for κ [212] . Hence, an independent measurement of κ from gravitational-wave observations can be used to distinguish black holes from other exotic ob-jects [213] [214] [215] [216] . However, to fully understand the nature of compact objects, one may also include effects such as the tidal deformations that arise due to the external gravitational field [217] [218] [219] [220] and tidal heating [221] [222] [223] [224] [225] [226] along with the spininduced deformations, an extensive study of these effects is not in the scope of this paper. For a spinning compact binary system, the coefficients κ i , i = 1, 2 represent the primary and secondary components' spin-induced quadrupole moment parameters. The correlation of κ i with the masses and spin parameters of the binary are evident from Eq. (6), which makes the simultaneous estimation of κ 1 and κ 2 hard. The higher-order terms present at the 3PN order help break this degeneracy, but are not enough to Fig. 6 . For general constraints, we also provide the mean µ and standard deviation σ of the inferred hyperdistribution. For δφ i and µ, we report the median as well as the 90%-credible intervals, while for σ we only present upper bounds. For the restricted method, the null hypothesis can occasionally fall in the tail of the distribution, while the hierarchical analysis places it in the bulk of the inferred distribution. FIG. 8. Joint distribution for the hyperparameters µ and σ of the GR deviation coefficients for the parametrized tests on generic GW modifications of Sec. V A. Contours mark 90% credible regions and they all include µ = σ = 0, which corresponds to the GR prediction. These results were obtained with a pipeline based on SEOBNRv4 ROM. The φ −2 -contour has been rescaled by a factor of 200 to improve visibility, as deviations for this term are the most tightly constrained. Warm (cold) colors refer to deviations occurring at higher (lower) PN orders. give reasonable constraints with our current detector sensitivities. However, a combination of these parameters can be measured [11, 213, 227, 228] . For this reason, we introduce the symmetric and anti-symmetric combinations of κ i , For binary black holes, κ s = 1 and κ a = 0. We measure the parameterised deviations κ s = 1 + δκ s assuming κ a = 0. This assumption restricts our study to binaries consisting of compact stars with identical spin-induced deformations. If the data supports κ 1 κ 2 , expect a significant offset away from zero in δκ s measurements. Further studies are required for such situations. The method employed here is the same as in previous analysis [11] . We perform a Bayesian analysis with LALInference using a nested sampling algorithm to estimate the posteriors on δκ s . The parametrized deviations δκ s are introduced in the inspiral phase of the IMRPhenomPv2 waveform model. We analyse the events listed in Table II passing the selection criterion. Along with the FAR < 10 −3 yr −1 criteria, we consider two additional conditions for selecting the events for this test. First, we select inspiral-dominated events having an inspiral network SNR ≥ 6. Since the test relies on at least one of the binary's components having nonzero spin, we also drop events whose effective inspiral spin parameter measurements include zero at the 68% credible level. Given the component masses m i and the dimensionless spins χ i = S i ·L/m 2 i pointing parallel to the orbital angular angular momentum axis, the effective inspiral spin parameter χ eff is defined as, χ eff = (m 1 χ 1 + m 2 χ 2 )/(m 1 + m 2 ) [229] . To compute the combined bounds we include events reported in [11] satisfying both the above selection criteria. This gives us a total of 13 events considering the entire GWTC-3. In Fig. 9 , the posterior distributions on δκ s for the events listed in Table II are derived assuming a uniform prior on δκ s ranging between [−500, 500]. Individual events constrain positive values of δκ s more strongly than negative ones. This is primarily because of how these parameters are correlated with the effective inpsiral spin parameter of the binary system [228] . As most of the events we observe have small but positive χ eff , We also consider a case where the analysis is restricted to only positive δκ s as is well motivated in the case of neutron stars [205, 206, 215] and boson stars [208] , in this case the event provides the tightest upper limits is GW191216 213338, with 90% credible bounds of δκ s < 10.65. We show the combined posterior distribution on δκ s from all the GW events passing the selection criteria in Fig. 10 . The red curve draws the posterior distribution obtained by multiplying the likelihoods of each individual signal. In contrast, the population-marginalized posterior from the hierarchical analysis is shown in the blue curve. Dotted lines show the 90% symmetric credible intervals, and a dashed line marks the BBH value (δκ s = 0). We estimate the combined symmetric 90% bound on δκ s considering GWTC-3 events to be δκ s = −16.0 +13.6 −16.7 and, conditional on positive values, δκ s < 6.66 from the joint likelihood analysis. With 90% credibility, we find δκ s = −26.3 +45.8 −52.9 from the hierarchical analysis. The generic population results constrain δκ s < 51.85 when we restrict to positive prior region. Also, we find the hyperparameters to be consistent with the Kerr BBH hypothesis with 90% credible bounds with µ = −26.8 +26.3 −34.1 and σ < 41.8. Compared to the previous bounds reported in [11] , µ = −24.6 +30.7 −35.3 and σ < 52.7, the σ estimate improves, meaning tighter constraints on δκ s , while the peak of the distribution is shifted more towards the negative prior region. The shift in the peak or µ omits the BBH value with the 90% credibility and can be associated to the poor δκ s constraints on the negative side of the prior region from the individual events, emerging from waveform degeneracies at δκ s < 0 with a certain region of the spin parameter space. A future study employing waveform models including higher harmonics may help break those degeneracies and hence to improve our overall parameter estimation [228, 230] . Moreover, a more generic approach has been recently proposed [230] that uses a hierarchical mixturelikelihood formalism to estimate the fraction of events in the population that deviated from BBH nature. With the increased number of detections in the future, it would be more natural to employ generic approaches that considers the population to be comprised of BBH and non-BBH subpopulations. The combined log Bayes factor of log 10 B Kerr δκ s 0 = 0.9 is obtained supporting the BBH hypothesis over the hypothesis of all events being non-BBH. This changes to log Bayes factor of log 10 B Kerr δκ s > 0 = 2.2 if we only allow δκ s ≥ 0. The findings here are all consistent with the results reported in GWTC-2 [11] although the combined constraints are not directly compatible due to the different selection of events. GR predicts that GWs propagate nondispersively and hence they are described by the dispersion relation E 2 = p 2 c 2 , where E and p are the energy and momentum of the wave. Detection of dispersion of GWs can be seen as a signature of modifications to GR. For example, some of the Lorentz violating theories of gravity predict a modified dispersion relation [45, [231] [232] [233] [234] . We use a parameterized model [41, 49] for dispersion of GWs that helps search for the presence of dispersion using the data without referring to the details of the modified theory. Our parameterized dispersion relation reads [41] where A α and α are two phenomenological parameters characterizing dispersion. The modified dispersion relation causes frequency modes of GWs to propagate at different speeds, changing the overall phase morphology of the GW that are observed with respect to the GR predictions. This can be incorporated in the waveform as frequency-dependent corrections to its phase evolution [10, 41] . Here we assume that the waveform obtained in the local wave zone [235] of the system is consistent with GR [10] . For different choices of α, the modified dispersion leads to a deviation in the GR phasing formula. For example, α = 0 with A α > 0 corresponds to the dispersion effect of a massive graviton with mass m g c 2 = √ A 0 [49] . We choose to test the dispersion relation for a set of eight discrete values of α between 0 and 4 with a step of 0.5 excluding α = 2. When α = 2, the speeds of all the frequency components are modified in the same way; therefore, the GW signal remains unchanged from the GR prediction except for an overall change in the time of arrival of the signal. Our method is identical to the previous analyses performed in GWTC-1 and GWTC-2 [10, 11] , except for the use of a more up-to-date IMRPhenomXP [105] waveform model as opposed to the PhenomPv2 [236] waveform employed in GWTC-1 and GWTC-2 [10, 11] . We perform parameter estimation using the nested sampling algorithm [140] as implemented in the LALInference package [138] and obtain bounds on the phenomenological parameters A α for each event. As in the case of preceding analyses, we perform the sampling for A α < 0 and A α > 0 separately [10, 11] , and then combine the posterior to produce the joint A α posterior. We choose uniform priors for the phenomenological parameters A α . However, while computing the bound on the graviton mass m g , which is derived from A 0 , we re-weight the posteriors such that the prior on m g is uniform. Propagation effects are independent of the source properties. Therefore we can combine the results from individual events to compute overall constraints over the phenomenological dispersion parameters. We obtain the combined posterior distributions of A α by multiplying the likelihoods from individual events and weighting the product with the prior. We perform the analysis on the 12 BBH candidate events in the catalog that are listed in Table II . Though we analyzed GW191109 010717, the posteriors obtained were too wide to be informative, and following the study regarding this event reported in Appendix A, which finds that nonstationarities in the detector noise could dominate over the signal, we exclude this event from further analysis. Analysis of another BBH event, GW200316 215756 has sampling issues and is thus excluded from the analysis. Further, we do not include NSBH event GW200115 042309 in this analysis due to the computational constraints. Nonetheless, this is among the closest events in the catalog and would have a negligible impact on the joint bounds. Fig. 11 shows the violin plots of joint posteriors on the phenomenological parameters A α for various values of α, which are obtained by combining posteriors from analysis of individual events. The red violin plots represent the posteriors obtained from all 43 selected events (31 events from pre-O3b and 12 O3b events). For some of the α values, the posteriors show biases with respect to the previous results [11] due to the inclusion of O3b events. We have identified the events GW200219 094415 and GW200225 060421 as having the strongest impact in biasing the combined posterior. These are the events with the lowest residual SNR p-values among all the O3b events (see Table III ). GW200225 060421 shows p-value of 0.05 with fitting factor FF 90 = 0.86 and GW200219 094415 has p-value = 0.1 with FF 90 = 0.74. These events require detailed analysis to understand the reasons for the observed deviations, which we leave for follow-up work. For comparison, in Fig. 11 , we also plot the combined posteriors from all the events exclud- ing GW200219 094415 and GW200225 060421 (blue violin plots). These are consistent with the GWTC-2 [11] results (gray plots in the background) and show an average improvement of 1.3 over the previous results, which is in agreement with the Gaussian expectation for improvement from 41 events compared to 31 of GWTC-2 [11] . In Fig. 12 , we present the scatter plot of 90% credible upper bounds on |A α |, for A α > 0 and A α < 0 separately. In the figure, red-filled diamond markers represent the GWTC-3 bounds. We also show the bounds from the analysis excluding the events GW200219 094415 and GW200225 060421 in the blue diamond markers. For quantitative comparison, we list |A α | bounds, including bounds on the graviton mass m g , in Table VII. To demonstrate the level of bias in the posteriors with respect to the GR hypothesis, we included the GR quantiles Q GR = P(A α < 0) in Table VII. The updated 90% credible bound on the graviton mass obtained by combining posteriors of 43 GWTC-3 events is m g ≤ 1.27 × 10 −23 eV/c 2 , which is 2.5 times better than the Solar System bound of 3.16 × 10 −23 eV/c 2 [237] . Compared to the previous GWTC-2 bound 1.76 × 10 −23 eV/c 2 [11] , the improvement is a factor of 1.4. Measuring the polarization content of GWs is a way of constraining possible deviations from GR, as the theory allows only two of the six polarization states predicted by generic metric theories of gravity [50, 51] . Assuming M generic polarization modes, the frequency-domain strain datad( f ) measured by a network of D detectors can be written as the combination TABLE VII. Results for the modified dispersion analysis (Sec. VI). The table shows 90%-credible upper bounds on the graviton mass m g and the absolute value of the dimensionless phenomenological parameterĀ α = A α /eV 2−α . Q GR = P(A α < 0) denotes the quantiles corresponding to GR hypothesis. The < and > labels denote the bounds on |Ā α | for A α > 0 and A α < 0 respectively. We also included bounds computed from GWTC-2 [10, 11] for comparison. of a signals( f ) and noiseñ( f ), or alternatively, as wheres( f ) = Fh( f ), F ∈ R D×M are the beam pattern functions of the detectors andh( f ) ∈ C M are the signal's polarization modes. We could interpret the gravitational-wave signal as a geometric projection on the subspace spanned by the basis vectors of F . By projecting the data on the subspace orthogonal to these vectors, one can then construct null streams, i.e., linear combinations of the data containing no information on the signal [238, 239] . Given D detectors, it is possible to construct at most D − M null streams. The projection operation can be formalized through the introduction of a null operator P [240] where I is the identity matrix and † denotes conjugate transpose. The quantities F depend on the sky location of the signal, as well on the polarization angle and event time and, by construction, Ps( f ) = 0. At least M + 1 detectors are needed to apply the null stream method in the most generic case, although for specific sky locations less detectors will suffice to test certain polarization hypotheses [241, 242] . The beam pattern functions of the breathing and longitudinal scalar modes are not linearly independent, and thus the maximum number of independent polarization modes is five [243, 244] . Consequently, past analyses [7, 9, 11, 245] tested only pure polarization hypotheses, as these are fully characterized by two polarisation modes at most, and in this case it is possible to construct a null stream with the strain measured by three detectors. In this work, we use a method that allows tests of mixed polarization states with 2 and 3 detectors [246] . This enables all our events to be used to compute combined Bayes factors, while the previous analysis [11] was restricted to 3-detector events. The method builds upon an effective antenna pattern functionF ∈ C D×L that is constructed from a subset of L < M polarization modes. For each hypothesis to be tested, the relevant polarization state is projected into the chosen basis: thus, one orthogonalizes the data with respect to a smaller subspace spanned by the basis modes, rather than the assumed polarization modes. Each polarization modeh m can be rewritten as a linear combination of the basis modes, plus an additional orthogonal component with C m , C ⊥ m ∈ C. We perform the null projection with respect to the subspace spanned by the component of the beam pattern vectors parallel to the basis mode(s). Therefore, the method is sensitive to any component of a given polarization hypothesis that is parallel to the chosen basis modes(s). The subspace removed by the null projection does not need to coincide with the polarization subspace of the hypothesis being tested. We will conduct analyses employing either one (L = 1) or two (L = 2) basis modes. The L = 2 parameterization allows more freedom in the choice of the basis modes, but at the cost of a weaker distinguishability between different polarization hypotheses. The subspaces spanned by the beam pattern function vectors for different hypotheses, in fact, will generally have a larger overlap in the L = 2 than in L = 1 case. The polarization content is constrained to be a linear combination of the basis modes and, therefore, the L = 1 analysis is expected to produce more stringent results, due to the strongest constraints imposed on the signal. On the other hand, the L = 2 analysis will be able to capture orthogonal components missed by the L = 1 analysis. Right ascension, declination and polarization angle are free parameters shared by both analyses. Additionally, we marginalize over the relative amplitude and phase with respect to the basis mode(s). While this implies the L = 2 analysis has more degrees of freedom than the L = 1 one, the polarization content of GR can be represented by both methods. In fact, in the quadrupolar approximation, h + and h × only differ in terms of a relative amplitude and phase, which are marginalized over in the computation of the evidence, and therefore only one basis mode is sufficient to capture both. In the L = 2 analysis, we choose one tensorial mode and one non-tensorial mode as the basis, to better capture possible extra degrees of freedom predicted by alternative theories of gravity. All events are analyzed with both models, to check consistency between the two formulations. The nature of null projection prohibits the comparison of model evidences with different number of basis modes [246] ; therefore, we conduct the analyses with one and two basis modes independently. We reanalyze here all of the events from the first three observing runs with a FAR below our 10 −3 yr −1 threshold. The pipeline employed here has been optimized to analyze the polarization content based on excess power in the data. Some events from the third observing run, namely GW190425 081805, GW190720 000836, GW190828 065509, GW191129 134029, GW200115 042309, GW200202 154313, and GW200316 215756, do not have a strong enough timefrequency track for the pipeline to capture them and we exclude them from our analysis. We test whether the data support a number of non-GR polarization hypotheses, which we denote by T (tensorial), V (vector) and S (scalar) or combinations of these terms. In the L = 1 case, we choose the tensorial plus mode h + as the basis for all hypotheses involving tensor modes, the vector mode x as the basis for the pure vector and vector-scalar hypotheses, and the scalar breathing mode as a basis for the scalar hypothesis. In the L = 2 analyses, following the notation introduced above, we project mixed polarizations of the type Tξ on a basis including h + and the relevant non-tensorial ξ mode, VS polarizations on a vector-scalar basis and TVS polarizations on the h + and scalar-breathing mode. The basis modes chosen for each polarization hypothesis and the corresponding free parameters are summarized in Table VIII. We present combined log 10 Bayes factors assuming the events are independent from each other. Table IX shows the combined log 10 Bayes factors of the 2-detector and 3-detector events for the L = 1 analysis, with the last row showing the combined log 10 Bayes factor for all eligible GWTC-3 events. The pure-scalar, pure-vector and vector-scalar hypotheses are significantly disfavored, while any mixed hypothesis involving tensor modes (i.e., tensor-scalar, tensor-vector, and tensor-vector-scalar) cannot be ruled out conclusively. Table X shows the combined log 10 Bayes factors of the 3detector events for the L = 2 analysis. There are no available data for O1, due to the fact that only two interferometers were operational at the time (LIGO Hanford and LIGO Livingston). In this case, the last row shows the combined log 10 Bayes factor for O2 and O3 events. Mixed hypotheses in this case are more strongly disfavored than the pure vector hypothesis (the pure scalar hypothesis cannot be tested here, due to the fact that the longitudinal and breathing modes for interferometers are not linearly independent). This is particularly evident for the TVS hypothesis, which has the largest number of free parameters. All mixed hypotheses will be penalized by a higher Occam factor, due to the increased prior volume in the Bayesian evidence integral, as each mode is characterized by a relative amplitude and phase with respect to the basis modes. The number of free parameters necessary to represent each hypothesis, depending on the number of basis modes employed, is reported in the last column of Table VIII . If the polarization content of the signals is purely tensorial, mixed-mode hypotheses including a tensorial component will be also able to represent the data, as the signal space of the pure tensorial hypothesis is a subspace of that of the mixedmode hypotheses including a tensorial component. In this case, we expect to see a larger penalization of mixed hypotheses in the L = 2 analysis, as the same polarization hypothesis will correspond to a larger parameter space than in the L = 1 analysis, as can be seen from Table VIII . This expectation is confirmed when comparing the results reported in Tables IX and X. On the other hand, if we consider the vector-scalar hypothesis, the additional parameters in the L = 2 analysis will help to fit the tensorial component of the data: this explains why we see a less negative log 10 Bayes factor in the L = 2 analysis than in the L = 1 one for this hypothesis. A similar argument applies to the purely scalar and vectorial hypotheses. These results support the conclusion that the population of events observed is consistent with the pure tensorial hypothesis, as predicted by GR, in line with the conclusion of past analyses of GWTC-1 and GWTC-2 data [10, 11] . The highly distorted BH remnant formed from the merger emits gravitational radiation which is referred to as ringdown. The late-ringdown waveform can be expressed as a superposition of quasi-normal modes (QNM) with a complex frequency [247, 248] . The real part of the complex frequency is the oscillation frequency and the imaginary part is the inverse damping time of the mode. According to GR, for astrophysical BHs, the frequency and damping times are completely determined by the mass and spin of the remnant BH [201] [202] [203] 249] . In fact, at the current sensitivity, electric-like BH charges have been shown to not leave a detectable imprint on ringdown measurements [250] . The relationship between frequency and remnant parameters, detection of multi-mode ringdown signals offers a unique test of the BH nature of the merger remnant [63, 251] and could be used to distinguish among different classes of ECOs [13] . The study of the QNM spectrum, and the post-merger waveform in general, contains a wealth of information about the remnant BH. The spectrum of radiation emitted during the ringdown is usually expressed in terms of spheroidal harmonic basis functions with spin-weight −2 denoted by −2 S mn . The in- x, ×, b, l, x, and y represent the plus mode, cross mode, scalar breathing mode, scalar longitudinal mode, vector x mode, and vector y mode respectively. The first column shows the polarization hypothesis being tested, the third column reports the number of basis modes, and the last column reports the number of free parameters that are marginalized over in the computation of the evidence. dices ( , m) represent the angular decomposition of the modes, whereas the index n denotes various tones of the spectrum start-ing with n = 0. A schematic decomposition of the post-merger signal reads [11] , where A mn denotes the amplitude of the mode, t 0 is the start time of the ringdown model, and z is the redshift of the source. The frequency and the damping time of a mode characterized by the three indices are denoted by τ mn and f mn , respectively, while χ f is the final spin. The polar and azimuthal angles (θ, φ), measured relative to the final spin axis, describe the direction to the observer. These coordinates assume the spin of the black hole to be along the θ = 0 direction. The contribution of counter-rotating perturbations is ignored, since it's expected to be negligible in the post-merger regime of the signals under consideration. We approximate the spheroidal harmonics with spherical harmonics, leading to a feature, called mode-mixing of QNMs [111, 252, 253] , which mixes modes that have the same m index but different indices. Mode mixing, especially of = |m| = 2 mode, is expected to be negligible for the modes that we consider here [118] . We present two analyses of ringdown: the time-domain ringdown analysis pyRing [254, 255] , based on damped sinusoids, and the parametrized ringdown analysis pSEOB, based on the SEOBNRv4HM waveform model [120] . 1. The pyRing analysis pyRing employs a time-domain approach. The three types of templates used in the analysis are Kerr 220 , Kerr 221 , and Kerr HM . Kerr 220 has only the fundamental modes, i.e., ( , |m|, n) = (2, 2, 0), Kerr 221 has the fundamental modes and its first overtones (both ±m, with the same damping time and frequencies of opposite sign). Instead, Kerr HM includes the fundamental mode and higher moments (HMs) with ≤ 4, without overtones. The Kerr HM template also takes into account the effect of mode-mixing due to the use of spherical harmonics instead of spheroidal harmonics to describe the ringdown. While in the Kerr 220 and Kerr 221 models the amplitudes and phases are left free to vary, in the Kerr HM model, quasi-circular, aligned-spin NR fits are used to compute the mode amplitudes, frequencies, and damping times. The model calibration assumes that the remnant BH originated from the quasi-circular coalescence of progenitor BHs with spins aligned with the orbital angular momentum. As other templates account for precession, this assumption does not affect the overall efficiency of the analysis. The progenitors' masses and spins are sampled with uniform priors. In contrast to the GWTC-2 results [11] , where they were independently measured, the remnant mass and spin are now also obtained through NR fits from the progenitor parameters [157] . This increases the coherence of the measurement, leading to tighter constraints on the remnant parameters. The model amplitudes are calibrated against NR simulations at 20M after the peak of the = |m| = 2 complex strain, where M is the total mass of the progenitor binary. The other two template types remain unchanged from previous analysis [11] . pyRing uses a reference time t 0 , which is computed from an estimate of the peak of the strain (h 2 + + h 2 × ) from the full IMR analyses (with the approximant IMRPhenomXPHM) assuming GR. The sky location is fixed to coincide with the maximum likelihood value inferred from the full IMR analysis. When assuming the Kerr 221 template, we fit the data starting at t 0 [255, 256] . To take into account the NR calibration of the template, the start time of the analysis is instead set to 15GM f (1 + z)/c 3 after t 0 when employing the Kerr HM template, whereM f is the median IMR value of the remnant mass. This choice extrapolates the model outside its nominal validity region, however we have verified that this does not induce appreciable biases [257] . In the Kerr 220 case, due to the greater flexibility of this template, the analysis starts 10GM f (1 + z)/c 3 after t 0 , which is when a linearised ringdown description is expected to dominate the signal for our analysis [257, 258] . Both sky locations and start times at each detector, together with all configuration data required to reproduce the analysis, are released in [259] . We report results where: i) the remnant parameters are constrained relative to the prior, and ii) the Bayesian evidence favors the presence of a signal over pure Gaussian noise when using our most sensitive template (Kerr 221 ). Estimates of the remnant parameters obtained for the five events from O3b that pass the above criteria, are reported in Table XI. There is agreement between the remnant properties obtained from the three waveform templates and the IMR analysis. The contribution of overtones or HMs during ringdown is quantified by log Bayes factors (log 10 B HM 220 , log 10 B 221 220 ), which are reported in Table XI . (Note, however, that the Bayes factors depend sensitively upon the chosen priors [118] .) When comparing a fit including all QNMs included in Kerr HM , versus the same model including only the = |m| = 2, n = 0 mode, we observe no strong evidence for the presence of HMs. Additionally, we search for evidence of HMs augmenting the Kerr 220 , Kerr 221 templates with each of the ( , |m|, n) = (3, 3, 0), (3, 2, 0), (2, 1, 0), (2, 0, 0) modes separately. Following [260] , to increase the sensitivity to these modes we assume the amplitudes of modes with opposite m signs to be related by the non-precessing symmetry A ,−|m|,n = (−1) A * ,|m|,n , while also imposing an amplitude hierarchy on the higher angular modes A ,m,n /A 2,2,0 < 0.9, as expected for quasi-circular binaries of moderate mass ratio. Under this set of assumptions, we find no significant evidence for the presence of higher modes in all events except for GW191109 010717, which shows weak evidence for the presence of the (3, 2, 0) and (2, 1, 0) modes in addition to the (2, 2, 0) one at times smaller than t 0 + 5M f . However, as discussed below, non-stationary noise around the event time does not allow to draw reliable inference from this signal. Moreover for the parameter space under consideration, the templates signalling such evidence are composed only of fundamental n = 0 modes and thus not reliable before ∼ t 0 + 10M f [257, 258] . Comparing the Kerr 220 and Kerr 221 analyses, with both templates starting at the peak, we observe weak evidence for the presence of overtones only for the loudest among these signals (for example GW200224 222234 shows such evidence). The estimates of the remnant parameters get closer to the full IMR waveform estimates when including overtones, in agreement with NR predictions. The exception to the above discussion is GW191109 010717, which shows an overestimation of the remnant mass and spin compared to IMR analyses. As discussed in Appendix A, such discrepancies can be attributed to possible nonstationarities in the detector noise and we will exclude this event from any combined statement made below. To investigate possible modifications to the ringdown spectrum of the remnant BH, we add parametrised deviations to the Kerr 221 template in the frequency and damping time with respect to their GR values for the n = 1 mode. This parametrization can almost fully cover the two-tone parameter space and has a number of desirable features that facilitate both sampling and interpretation [118] . Parameter estimation is performed TABLE XI. The median, and symmetric 90%-credible intervals, of the redshifted final mass and final spin, inferred from the full IMR analysis (IMR) and the pyRing analysis (Sec. VIII A 1) with three different waveform models (Kerr 220 , Kerr 221 , and Kerr HM ). A positive value of log 10 B HM 220 indicates support for HM in the data, and a positive value of log 10 B 221 220 shows support for the presence of the first overtone. A positive value of log 10 O modGR GR quantify the level of disagreement with GR. The catalog-combined (including GWTC-2 events) log odds ratio is negative (−0.90 ± 0.45). Redshifted The lower bound on δτ 221 prevents issues due to the finite time resolution in the waveform sampling. [11] . If GR provides an accurate description of the ringdown emission, we expect to observe posterior distributions of the deviation parameters to be centered around zero, together with a Bayesian evidence disfavouring the addition of non-GR parameters. The inferred values of the frequency deviation parameters are consistent with GR for all events analysed, while weak constraints can be extracted on the damping times deviations from single events. The damping time estimation of low-SNR events is more sensitive to violations of the Gaussianity and stationarity hypotheses compared to the frequency estimation [11] . Additional studies investigating this behaviour will be required in the future to properly derive joint posteriors on this parameter when combining many weak events. The posterior distribution of δτ 221 often tends to rail towards the lower prior bound −0.9 for events with low SNR in the ringdown regime, as the data show little evidence for the first overtone. To combine the set of measurements for all 21 available events we make use of a hierarchical analysis [11] . The single events posteriors used to derive this joint bound are the marginalised δf 221 posteriors obtained when allowing both the frequency and the damping time of the 221 mode to deviate from the GR predictions. We obtain a constraint on the frequency deviation equal to δf 221 = 0.01 +0. 27 −0.28 , overlapping with the GR predicted value for a Kerr BH, and show its posterior probability distribution in Fig. 13 . The corresponding hyperparameter values are: µ = 0.01 +0. 18 −0.18 , σ < 0.22. Although GW191109 010717 is excluded from the combined analysis, we note that even though the mass and spin estimates coming from this event show some tension with the ones coming from an IMR analyses, the parametrised deviations do not indicate preference for additional parameters required to describe the ringdown emission. We do not allow to obtain informative constraints on δτ 221 . The single event odds ratios log 10 O modGR GR values, computed following a procedure similar to our previous analysis [11] , are reported in Table XI O3b events, −0.09, corresponds to GW200129 065458 and does not signal significant tension. By considering all the GWTC-3 events that passed our selection criteria (including previous GWTC-2 results), we find a combined log odds ratio of −0.90 ± 0.44, at 90% uncertainty, favouring the hypothesis that GR gives an accurate description of the observed ringdown signals. Finally, as an agnostic test of the consistency of the ringdown emission with GR predictions, a single damped sinusoid (DS) template is used to fit the data. In this case we are not assuming an underlying Kerr metric, nor that the object emitting the signal is a BH, thus the frequency, damping time, and complex amplitude are considered as free parameters without imposing any predictions from GR. We adopt uniform priors on the frequency, damping time, log of the magnitude, and the phase of the complex amplitude. The fit starts at 10GM f (1+z)/c 3 after t 0 . The results on the frequency and damping time obtained in this case are reported in Table XII. The resulting values agree with the measured values obtained from IMR analyses assuming GR and with those obtained from the additional pSEOB ringdown test discussed below, except for GW191109 010717, where we find an overestimation of the frequency and damping time from the pyRing analysis with respect to the full IMR analyses, which is compatible with the overestimation of the remnant mass and spin. The pSEOBNRv4HM ringdown analysis [11, 120] , which uses parameterized spinning EOB waveforms with higher modes calibrated to non-precessing NR simulations [116, 261] , measures QNM frequencies within the framework of a complete IMR binary black hole waveform model. It makes full use of gravitational wave signal modelling and hence the complete SNR of the signal. In this approach, the ringdown start-time is built into the model, based on calibrations to NR, and may hence be considered complementary to the post-merger timedomain ringdown analysis described in the previous section. In the SEOBNRv4HM model, starting from estimates of the initial binary's masses and spins, NR fits [156, 262] are used to predict the mass and spin of the remnant object, which are then used to predict the ringdown frequencies and damping times [64, 263] . Thus, the frequency and damping time of the ( , ±m, 0) QNM, ( f m0 , τ m0 ) are functions of the initial masses and spins: In the parameterized version of the SEOBNRv4HM model used here (pSEOBNRv4HM), deviations in the frequency and damping time are described through fractional deviations, (δf m0 , δτ m0 ), from the corresponding GR predictions as: We use LALInference [138] to stochastically sample over the parameter space of {δf m0 , δτ m0 }, along with the set of GR parameters, and finally reconstruct ( f m0 , τ m0 ) using Eqs. (14)- (17) . Here, we keep our analysis identical to the one presented in previous analysis [11] and restrict ourselves to fractional deviations of the least-damped dominant QNM, i.e., (δf 220 , δτ 220 ), keeping the other QNMs fixed at their nominal GR values. The analysis is performed on events from Table II with an SNR ≥ 8 in the preand post-inspiral regimes, following Table IV. A reasonable SNR in the inspiral is required to break the degeneracy between the fundamental ringdown frequency deviation parameter and the remnant mass. The detector-frame total mass threshold criterion used (due to computational limitations) in the previous analysis [11] was relaxed in this analysis. In reporting joint constraints, we include events from past analyses. These include the 3 events from O3a [11] , and five events from first two observing runs and O3a, reported for the first time in [120] . The final list of events that were used for this analysis is listed in Table XIII . The results of the analysis is summarised in Fig. 14 . The left panel of Fig. 14 shows the 2D posteriors (along with the marginalised 1D posteriors) of the frequency and damping time for all the events listed in Table XIII . The contours are colored by the median redshifted total mass (1 + z)M of the corresponding binary. We also show the 90% credible bounds on the fractional deviations from GR in the right panel color coded by the median redshifted mass of the binary. We specifically highlight the bounds from two events, GW150914 and GW200129 065458. GW150914 previously provided the strongest bounds with this method [120] . However, GW200129 065458, similar in source properties to GW150914, but with an SNR of 26.5 [81] , provides currently provides the strongest single event bound, δf 220 = −0.01 +0.08 −0.08 ; δτ 220 = 0.11 +0.23 −0.23 . The joint constraints are reported using two methods: multiplying posteriors (given a flat prior on the deviation parameters) and hierarchically combining events. The joint bounds from these two methods read by combining hierarchically. The numbers in the square brackets are the hyper-parameter estimates. There is a significant improvement from results previously published in [11] (filled/unfilled upward/downward triangles in the right plot of Fig. 14 using the hierarchical method and δf 220 = 0.03 +0.07 −0.06 ; δτ 220 = 0.13 +0. 18 −0.18 using the restricted method. We note a couple of points about the joint posteriors. First, though we perform single-event analyses over GW191109 010717 and GW200208 130117, we do not include them in the computation of the joint bounds (hierarchical or simple combination). The posterior on δf 220 show multimodalities for these two events. For GW200208 130117, the secondary of two modes is consistent with the GR prediction δf 220 = 0, while for GW191109 010717, none of the modes are. Follow-up investigations with synthetic signals in segments of data immediately adjacent to the event suggests the possibility of noise systematics not accounted for. The same study rules out, within our statistical uncertainties, any systematic bias due to missing physics in the SEOBNRv4HM waveform model. We also note that the joint posterior distribution on δτ 220 in the left plot of Fig. 14 does not include the GR prediction at the 90% credible level. Although insufficient to claim a violation of GR, this apparent deviation definitely warrants further investigation. The trend of overestimating the combined damping time is consistent with what is observed on an event-by-event analysis, where the posterior on δτ 220 , although consistent with 0 is biased towards positive values. Hence a combination of information across multiple events is expected to reduce statistical uncertainties and make this bias more prominent. One possible reason might be a prior on (δf 220 , δτ 220 ) which is asymmetric around 0 with greater support for positive values. This is because, since ( f 220 , τ 220 ) are strictly positive quantities, the priors on (δf 220 , δτ 220 ) are strictly greater than −1. However, the upper prior boundary is free to be as large as is required for the posterior to not rail against it and it usually greater than 1. For events with moderately high SNRs analysed with this method, the effect of the prior on the final posterior can be non-negligible. We also note that while the posteriors on the fractional deviation show more support towards positive values, the frequency and damping time reconstructed using Eqs. (16) and (17) are consistent with those predicted using estimates of initial masses and spins from [81] and NR fits [158] . This gives us more confidence in the measured QNMs, while also pointing to the possibility that correlations among the remnant parameters may be responsible for the apparent deviation. Further, as has been argued in [11] , imperfect noise modelling can also lead to overestimation of damping time [120] . Finally, we can not rule out the statistical uncertainties of working with a sample of just 12 events. Mergers of certain classes of exotic compact objects that do not have a horizon can cause ingoing gravitational waves (e.g., resulting from merger) to reflect multiple times off effective radial potential barriers, with wave packets leaking out to infinity at regular times; these are called echoes [60] [61] [62] . Previous analysis [11] employed a morphology-dependent method [264] using the waveform template proposed in [67] to search for GW echoes. Here, however, we employ a method that is independent of the morphology of the signal [84, 85] . We make use of the BAYESWAVE pipeline [83, 145, 150] to search for echoes and compute the Bayes factors for the signal versus noise hypothesis. We further analyze the background around each event to quantify the significance of the Bayes factors. Our method employs combs of decaying identical sine-Gaussians as the basis functions or the generalized wavelets [84, 85] . The sine-Gaussians are parameterized by an amplitude A, a central frequency f 0 , a damping time τ, and a reference phase φ 0 . In addition to these parameters, basis functions include three extra parameters which characterize the way in which the sine-Gaussians are arranged in the wavelet; the central time of the first echo t 0 , the time separation between sine-Gaussians ∆t, a phase difference between them ∆φ, an amplitude damping factor γ, and a widening factor w. To perform the search, we use uniform priors for all the wavelet parameters except the damping time τ and the amplitude A. The central frequency, f 0 is sampled uniformly from the interval between the lower cut-off frequency and half the sampling rate of the analysis, [30, 1024] Hz. Also, ∆t, φ 0 , ∆φ and γ are sampled uniformly with respective ranges, ∆t ∈ [0, 0.7] s, φ 0 ∈ [0, 2π], ∆φ ∈ [0, 2π], γ ∈ [0, 1], and w ∈ [1, 2]. The damping time τ is sampled such that the corresponding the quality factor Q = 2π f 0 τ is distributed uniformly in the interval [2, 40] , which ensures τ to be within the time scales set by masses of the events (∼ [3 × 10 −4 , 0.2] s). The wavelet amplitude A is sampled based on signal-to-noise ratio as described in [83] . To construct background distributions for the log Bayes factors log 10 The foreground run is instead performed on a 4 s time interval starting at t event + 3τ 220 , since we want to start analyzing at a time that is safely beyond the plausible duration of the ringdown of the remnant object. τ 220 is a conservatively long estimate for the decay time of the 220 mode in the ringdown, obtained from the fitting formula for τ 220 (M f , χ f ) [64, 265] . We take the upper bound of 90% credible interval of τ 220 distribution computed from final mass and spin samples for the event. This is typically of the order of a few milliseconds. The complementary empirical cumulative distribution function of background distributions of log 10 B S N are used to quantify the search outcome via p-values for each event. We compute the p-value for log 10 B S N for each event, which is the fraction of background log 10 B S N above the log Bayes factor of the event. For all the events, the signal versus noise Bayes factor B S N are within the corresponding background distributions, and the corresponding p-values are tabulated in Table XIV . If echoes are not present in the data, we expect the p-values to follow a uniform distribution between [0, 1]. Fig. 15 plots the pvalue versus the cumulative fraction of events and the dotted dash line shows the prediction if no signal is present with corresponding 90% uncertainty regions marked in light-color bands. We follows the recipe described in Sec. IV A to make the PP plot. As can be seen, the measurement is consistent with the absence of echoes within 90% credible region. We conclude that we find no statistically significant evidence for echoes from the morphology-independent search we carried out. As the methodology employed here is different from that of our previous analysis [11] , and relies of the p-values, one cannot have a fair comparison of the results between the two. Gravitational-wave observations provide a unique tool to test fundamental physics. The strongly gravitating, highly dynamical and radiative spacetime associated with the late inspiral, merger and ringdown of compact binaries facilitates tests of general relativity in a regime that is unaccessible otherwise. Binary black holes and binary neutron star mergers observed in the past observing runs already set limits on possible deviations from GR [3, 6, 7, 9-11, 79, 99, 242, 259, 266-269] . Here we discuss a pool of tests aimed at unearthing deviations from GR using the events detected during the second part of the third observing run of advanced LIGO and advanced Virgo. We perform ten tests of GR on the 15 events that have a false alarm rate less than 10 −3 yr −1 . These tests are the same ones as in the previous analysis [11] , except with the following updates. Our search for post-merger echoes is morphologyindependent in this paper and the method to test for non-GR polarization modes is refined to address mixed polarizations as opposed to scalar-only, vector-only, and tensor-only hypotheses as was the case in [11] . Furthermore, some of the tests rely on more up-to-date waveforms; in the residuals and inspiral-merger-consistency tests, we account for higher order multipole moments for all the events from the second part of the third observing run. We subtract the maximum-likelihood GR waveform from the data to verify the consistency of the residuals with detector noise, thereby showing the consistency of the signals in the data with GR. Independent estimates of the mass and spin of the merger remants, from the inspiral and postinspiral parts of the waveform for different events show mutual consistency. The fractional changes in the final mass and spin from this test, assuming they take the same values for all the events and combining all the events analyzed so far, are constrained Tests aimed at looking for parametrized departures from GR in the post-Newtonian phasing coefficients all find consistency with GR within the statistical uncertainties. The most wellconstrained parameter is the absolute value of the −1PN coefficient, which is bound to ≤ 7.3 × 10 −4 at 90% credibility, assuming its value is the same for all the events. As certain modified theories of gravity predict dispersion of GWs, we searched for this effect and found no evidence for dispersion. The bound on the graviton mass is updated to m g ≤ 1.27 × 10 −23 eV/c 2 , at 90% credibility. A general metric theory of gravity admits up to six modes of GW polarization. We searched for non-GR polarization modes and found no signature of such modes. Analyses to measure the spin-induced quadrupole moments of the binary components found no signatures of exotic compact objects. Further, tests for deviations from GR in the ringdown of the remnant black hole were carried out using two independent methods and the frequency deviation parameters are constrained to δf 221 = 0.01 +0. 27 −0.28 and δf 220 = 0.02 +0.07 −0.07 , at 90% credibility, by hierarchically combining the results from the events that are analyzed. We also found no evidence for post-merger echoes from the merger remnant from our morphology-independent search. Future observing runs with improved detector sensitivities will provide a larger catalog of compact binary observations and events with larger SNR. These observations will enable us to carry out more stringent tests of GR in parts of the parameter space that are not covered here. Developing accurate waveform models that cover the diverse physics that will be revealed by the sources is an important step towards this goal. Finally, devising new tests or improving the existing ones, optimizing their sensitvity to predictions from specific modified theories of gravity, can play a very important role in constraining beyond-GR physics using future GW observations. consequence of specific approximations inherent to each waveform model (such as neglecting precessional effects, higher order multipole moments, etc.) or lack of a uniform coverage in the parameter space of NR waveforms, which are used to train or calibrate the waveform models. Extreme spins, highly unequal masses and nearly edge-on inclinations are all elements that can enhance such differences [285] [286] [287] . Another aspect to consider in this context is the presence of waveform degeneracies, where two or more templates provide a good fit to the signal: in this case, posteriors can show a complicate bi(multi)-modal structure, with different modes clearly separated in parameter space. The addition of non-GR parameters complicates the picture, increasing the dimensionality of the likelihood surfaces to be explored and possibly enhancing such multimodalities. Multi-modal features might appear also in the distributions for deviation parameters, with some of the modes having consistent support away from zero. It has been pointed out that this can be a direct consequence of a known degeneracy between source parameters and deviation coefficients [120] . In this sense, one needs to be extremely cautious in classifying support for values away from GR as violations from GR, without thorough cross-comparisons of multiple analyses [40] which might be individually prone to systematic biases related to the partial inclusion of beyond-GR effects [288] . The pSEOBNRv4HM model, an IMR model where ringdown complex frequencies deviations are free to vary, is one of the analyses which commonly suffers from such degeneracies [11, 120] . One example of degeneracy is due to the strong correlation between the fundamental ringdown frequency and the mass of the remnant BH. Such degeneracy is broken when the mass is constrained independently from other phases of the coalescence. This considerations led to the requirement of considering only events with SNR > 8 in the pre-merger regime for this analysis. While additional selection criteria can help in minimizing these effects, it is important to carefully interpret the results of each test keeping in mind the assumptions underlying the analysis of the signal. An additional important aspect of data analysis is the impact of transient non-gaussian noise or glitches affecting the data around the time of the event analysed. Glitches can lead to systematic deviations in the null parameter that is measured and mimic false deviations from GR [11, 289] . These issues will affect both low and high mass systems in different ways, depending on the amplitude and duration of the noise transient [290] , and might exacerbate waveform systematics effects, which tend to be more severe for short signals. Furthermore, since the parameterized models we employ invoke parameters over and above the GR source parameters, these additional parameters can capture any residual noise artifacts that are left after methods such as deglitching [83, 291] are employed to mitigate non-stationarities or non-Gaussianities. As we detect more and more signals, including high-mass ones, these considerations become of increasing importance while testing for deviations from GR. Already in the first half of the third observing run, a few events have shown deviations which were understood to be an effect of incomplete models of the noise background (e.g., in the ringdown regime) [11] . Gaussian noise fluctuations alone are expected to cause deviations from GR for ∼ 1 in ten events with our choice of credible intervals, though the results of our hierarchical analyses should be robust against these effects. As an example, we next discuss how statistically significant deviations seen in some of the analyses, as mentioned in the main text, may be understood. Among the different events we analysed, GW191109 010717 is the only event which led to statistically significant deviations in three of our null tests (i.e., modified dispersion tests, Sec. VI, and two ringdown tests, Sec. VIII A). GW191109 010717 is the heaviest system in our event list and hence has the shortest signal duration. It is also the only event considered in this work for which both Livingston and Hanford frames required mitigation, due to the presence of glitches overlapping with the inspiral track of the signal (see Table XIV of GWTC-3 [81] ). Such noise artifacts belong to the category of 'slow scattering' glitches [292] , which are due to light scattering and lead to long duration arches (∼ 2 − 2.5s) in time-frequency spectograms, as discussed in [293] (see also Sec. IIIB of GWTC-3 [81] for a discussion of data quality of O3b events). Data below 30 (40) Hz were affected in Livingston (Hanford), with noise directly overlapping the time of the trigger, and glitches were subtracted with BayesWave [83, 291] . The same glitch subtraction algorithm was applied only to another event in our selection list, GW200115 042309. However, GW200115 042309 has a long inspiral compared to GW191109 010717, which lasts for less than a second in detector band, meaning the glitch overlaps with a significant portion of the signal. GR parameter estimation runs [81] already show evidence of a multimodal likelihood surface for this event. Previous investigations have shown that BayesWave deglitching is not expected to affect parameter estimation [291, 294] nor tests of GR [289] . However, these works did not consider situations in which two detectors are affected by the type of glitch that impacted GW191109 010717. Tests of injections overlapping scattered light in one detector have been previously shown to mimic GR violations [289] . Injections performed around the trigger time in detector noise could partially reproduce the deviations seen in the actual analyses of the event, indicating noise properties might be the main explanation behind our results. Further studies would be required to assess the reliability of tests of GR in similar situations, as well as the sensitivity of different analyses to residual noise features. These investigations are outside the scope of this paper and, as a cautionary measure, we drop GW191109 010717 from combined statements in the context of ringdown and modified dispersion tests. Table XV ). The gray distribution corresponds to the joint posterior of GWTC-2 events. O3a (O1 and O2) events are plotted with solid (dot-dashed) traces. bias leads to the marginalized GW190814 ∆χ f /χ f posterior (blue solid) in the right side panel of Fig. 16 peaking significantly away from the GR value. GW190503 185404, and GW190814 also have prominent bimodal and multimodal posteriors which can be seen as additional peaks away from GR value in the marginalized ∆M f /M f and ∆χ f /χ f posteriors. This significant bias from the GR value for GW190814 is the reason why we see the σ posterior of ∆χ f /χ f in Fig. 5 peaking away from zero for GWTC-2 and, in turn, for GWTC-3. Considering the change in Q 2D GR values with respect to our previous analysis [11] , we have already explained why we see significant differences for GW170814 and GW190503 185404 with absolute differences of 13.0 and 41.1 percentage points, respectively. The next largest absolute change is 2.3 percentage points for GW170818 and for most events the change is at most 0.5 percentage points. To summarize, we changed our calculation of the reweighted posteriors in three ways. We corrected the cases where the mismatch of priors used in the inspiral and postinspiral runs led to the prior on the deviation parameters peaking well away from zero, recomputed the prior distribution on the deviation parameters for the O1/O2 events, and made the limits on the deviation parameters uniform across all events. With these changes, we note that Q 2D GR changes appreciably only for two events: GW170814 and GW190503 185404. We use the results from these new reweighted posteriors to obtain the combined results. Virgo Collaboration) Virgo Collaboration Virgo Collaboration) Virgo Collaboration) Virgo Collaboration) Instrument Science Authors) PyCBC software Virgo Collaboration) Virgo Collaboration) Virgo Collaboration) LAL-Suite software 24th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, AIP Conf. Proc pyRing: a time-domain ringdown analysis software CPNest: an efficient python parallelizable nested sampling algorithm Data release for Tests of General Relativity with GWTC-2, LIGO-P2000438 Determining the final spin of a binary black hole system including in-plane spins: Method and checks of accuracy Wiseman Gravitational waves. Numerical relativitydata analysis. Proceedings, 9th Edoardo Amaldi Conference, Amaldi 9, and meeting Teukolsky Data release for Tests of General Relativity with GWTC-3 Virgo Collaboration) Analyses in this paper made use of NumPy [270] , SciPy [271] , Astropy [272, 273] , IPython [274] , qnm [275] , PE-Summary [276] , and GWpy [277] ; plots were produced with Matplotlib [278] , and Seaborn [279] . Posteriors were sampled with Stan [280] , CPNest [144] , PyMultinest [281, 282] , Bilby [136, 137] , and LALInference [138] . Power spectral densities are generated through the software BAYESWAVE [83, 145] . This material is based upon work supported by NSF's LIGO Laboratory which is a major facility fully funded by the National Science Foundation. The authors also gratefully acknowledge the support of the Science and Technology Facilities Council (STFC) of the United Kingdom, the Max-Planck-Society (MPS), and the State of Niedersachsen/Germany for support of the construction of Advanced LIGO and construction and operation of the GEO600 detector. Additional support for Advanced LIGO was provided by the Australian Research Council. The authors gratefully acknowledge the Italian Istituto Nazionale di Fisica Nucleare (INFN), the French Centre National de la Recherche Scientifique (CNRS) and the Netherlands Organization for Scientific Research (NWO), for the construction and operation of the Virgo detector and the creation and support of the EGO consortium. The authors also gratefully acknowledge research support from these agencies as well as by the Council of Scientific and We would like to thank all of the essential workers who put their health at risk during the COVID-19 pandemic, without whom we would not have been able to complete this work. Appendix A: Effect of waveform systematics and noise artifacts on the testsIn the context of null tests of GR, the posterior distributions on deformation parameters are sensitive to every assumption that goes into the analysis. These include assumptions about the relevant physics included in the waveforms employed and detector noise model, among others. Hence, any deviation from GR seen in the posteriors, statistically significant or not, could be due to the breakdown of one or more of the above assumptions. We discuss below some of these features and discuss the specific case of GW191109 010717 as an example. The increasing number of events detected results in a large variety of systems observed, spanning regions of parameter space where different approximants lead to visibly different estimates of source parameters [81, 135, 283, 284] . Such discrepancies are expected for non-standard events, and are a direct In this section, we revisit the IMR consistency test results of GWTC-2 events which are summarized in Table IV of our previous analysis [11] . Here we describe the main reasons for this reanalysis.First, for some events the parameter estimation analyses of the inspiral and the postinspiral parts used different prior bounds. This is not necessarily problematic but it can lead to the 2D distribution of the prior on the fractional deviation parameters (∆M f /M f , ∆χ f /χ f ) peaking away from (0, 0). Such priors are undesirable, since we do not want to prefer a GR deviation a priori. The prior distributions on the deviation parameters for GW170823 and GW190503 185404 peaked significantly away from (0, 0), so we reanalyzed these events using the same priors for masses and spins in the inspiral and postinspiral analyses. The GR quantile value for GW190503 185404 (94.3%) is significantly higher than its previous value (53.2%). This can be attributed to the fact that the new prior for this event peaks at zero whereas the old prior peaked close to the peak of the posterior (well away from zero). The new and old posteriors peak at almost the same place, causing the reweighted posterior to shift further away from (0, 0).Second, in our GWTC-1 analysis [10] , the prior distributions on the fractional mass and spin parameters of O1 and O2 events were computed only using the prior range on component masses, and not accounting for the additional constraints on the mass priors. This was discontinued for O3a events in GWTC-2 [11] , where prior samples were generated considering the full set of priors. However, the GWTC-1 priors were used for O1/O2 events. To maintain uniformity, we recomputed the priors for O1/O2 events which were then used to reweight the posteriors. The old prior for the event GW170814 favored fractional mass deviation parameters further away from zero compared to the new prior which pushed the portion of the posterior with less probability closer to the origin. This is likely the reason why the GR quantile value of GW170814 in our previous analysis [11] is significantly higher (22.9%) than the current value (9.9%).Third, we change the limits of the fractional deviation parameters, ∆M f /M f and ∆χ f /χ f . As can be seen from Fig. 3 of the GWTC-2 analysis [11] , the 90% credible regions of the posteriors on (∆M f /M f , ∆χ f /χ f ) for a few of the GWTC-2 events such as GW190814 were not closed within the range of the deviation parameters for which they were calculated. The ranges of deviation parameters were earlier set to [−1.5, 1.5] for ∆M f /M f and [−1, 1] for ∆χ f /χ f . We now set the ranges of both deviation parameters to [-2,2] , which encloses all the 90% credible regions we find. This change in the ranges of deviation parameters has at most a small effect on the GR quantiles, with the largest absolute difference of 0.5 percentage points for GW190828 063405.The new results obtained with these three changes are given in Table XV and the posteriors are shown in Fig. 16 . The three events whose contours do not enclose the origin are GW170823 (orange dot-dashed), GW190503 185404 (orange solid), and GW190814 (blue solid). Additionally, GW170814, GW170818, and GW190828 063405 show some small multimodal structures. The possible reasons for the high Q 2D GR values for GW170823 and GW190814 have already been discussed in our previous study [11] . Specifically, GW170823 is the event with the lowest SNR among the events in Table XV and has a relatively high redshifted mass, while GW190814 has a low SNR in its postinspiral part due to its low redshifted mass leading to significant bias in its final spin measurements compared to the relatively high SNR inspiral regime. This