key: cord-0748809-965j5tpz
authors: Manso, Taciana; Folch, Géraldine; Giudicelli, Véronique; Jabado-Michaloud, Joumana; Kushwaha, Anjana; Nguefack Ngoune, Viviane; Georga, Maria; Papadaki, Ariadni; Debbagh, Chahrazed; Pégorier, Perrine; Bertignac, Morgane; Hadi-Saljoqi, Saida; Chentli, Imène; Cherouali, Karima; Aouinti, Safa; El Hamwi, Amar; Albani, Alexandre; Elazami Elhassani, Merouane; Viart, Benjamin; Goret, Agathe; Tran, Anna; Sanou, Gaoussou; Rollin, Maël; Duroux, Patrice; Kossida, Sofia
title: IMGT(®) databases, related tools and web resources through three main axes of research and development
date: 2021-12-07
journal: Nucleic Acids Res
DOI: 10.1093/nar/gkab1136
sha: 35aad7eaadc44dc0ba091dde6f95a8cf9b4bab0d
doc_id: 748809
cord_uid: 965j5tpz

IMGT®, the international ImMunoGeneTics information system®, http://www.imgt.org/, is at the forefront of the immunogenetics and immunoinformatics fields with more than 30 years of experience. IMGT® makes available to the scientific community databases and tools pertaining to the adaptive immune response, based on the IMGT-ONTOLOGY. We focus on the recent features of the IMGT® databases, tools, reference directories and web resources, within the three main axes of IMGT® research and development. Axis I aims at understanding the adaptive immune response through the identification and characterization of the immunoglobulin (IG) and T cell receptor (TR) genes in jawed vertebrates. It is the starting point of the two other axes, namely the analysis and exploration of the expressed IG and TR repertoires based on comparison with IMGT reference directories in normal and pathological situations (Axis II) and the analysis of amino acid changes and functions of 2D and 3D structures of antibody and TR engineering (Axis III).

The adaptive immune response appeared with the jawed vertebrates (or Gnathostomata), 450 million years ago.
It is characterized by a remarkable immune specificity and memory, which are properties of the B and T cells owing to the extreme diversity of their antigen receptors, immunoglobulins (IG) or antibodies and T cell receptors (TR) (1). In humans and other mammals, an IG consists of two identical light chains (Kappa (IGK) or Lambda (IGL)) and two identical heavy chains (IGH) (2), while a TR consists of two chains, either Alpha (TRA) and Beta (TRB), or Gamma (TRG) and Delta (TRD) (3). Each IG and TR chain comprises a variable domain (V-DOMAIN), which determines the specificity for the antigen, and a constant region (C-REGION). The V-DOMAIN results from the genomic DNA rearrangement of variable (V), diversity (D) and joining (J) genes for IGH, TRB and TRD chains (V-D-J-REGION) and of V and J genes for IGK, IGL, TRA and TRG chains (V-J-REGION) (Supplementary Figure S1). Additional mechanisms occurring during the rearrangements (N diversity, somatic hypermutations for the IG) contribute to the extreme diversity of the IG and TR (theoretically 10^12 different IG and TR per individual, a figure limited only by the number of B and T cells that an organism is genetically programmed to produce).

IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org) (4), was created in 1989 in order to characterize the genes and alleles involved in the IG and TR synthesis of vertebrates. IMGT® is an integrated knowledge system for sequences, genes and structures of the IG or antibodies, TR and major histocompatibility proteins (MH) of the adaptive immune responses, as well as of other proteins of the IG superfamily (IgSF) and MH superfamily (MhSF) of vertebrates and invertebrates. IMGT® comprises 7 databases, 17 online tools (Figure 1A) and >20 000 pages of Web resources. The accuracy and the consistency of the IMGT® data are based on IMGT-ONTOLOGY (5, 6), the first ontology for immunogenetics and immunoinformatics, and on the IMGT Scientific chart rules.
IMGT-ONTOLOGY includes the IMGT structured terminology and the annotation rules and is composed of seven axioms. The IDENTIFICATION axiom provides the standardized keywords for the identification of nucleotide and protein sequences and of 3D structures. The DESCRIPTION axiom comprises the IMGT standardized labels for the description and the delimitation of constitutive motifs within sequences and structures. The CLASSIFICATION axiom defines the criteria for the classification of IG and TR genes and alleles, on which the standardized nomenclature is based. The NUMEROTATION axiom includes the IMGT unique numbering and its graphical 2D representation, the IMGT Collier de Perles. The LOCALIZATION axiom characterizes the localization of IG and TR genes. The ORIENTATION axiom defines the orientation of genomic instances (chromosome, locus and gene) of DNA strands. The OBTENTION axiom specifies the biological and methodological origins of the IMGT data (5, 6).

IMGT® comprises in particular databases specialized in nucleotide sequences (IMGT/LIGM-DB) (7), genes and alleles (IMGT/GENE-DB) (8), amino acid sequences and 2D structures (IMGT/2Dstructure-DB) and 3D structures (IMGT/3Dstructure-DB) (9), and therapeutic monoclonal antibodies (IG, mAb) and other proteins for clinical applications (IMGT/mAb-DB) (4). These IMGT databases, the related tools and Web resources are described in this manuscript through the three main axes of IMGT research and development: the identification and characterization of IG and TR genes and knowledge of their genomic organization (Axis I), the analysis and exploration of the expressed IG and TR repertoires in normal and pathological situations (Axis II) and the analysis of adaptive immune proteins from antigen receptor to amino acid changes (Axis III) (Figure 1B).

IG and TR chains are encoded by polymorphic multigene families located on different chromosomes.
In humans and other mammals, there are seven main loci for IG and TR: three for IG (IGH, IGK and IGL) (2, 10) and four for TR (TRA, TRB, TRD and TRG) (3). The V, D, J and constant (C) IMGT gene names were assigned according to the concepts of the CLASSIFICATION axiom (5, 6), were approved by the Human Genome Organization (HUGO) Nomenclature Committee (HGNC) for human (11) in 1999 and were endorsed by the WHO IUIS Nomenclature Subcommittee for IG and TR (12). The characterization of genes and alleles for the seven loci of human (Homo sapiens) and mouse (Mus musculus) was published in 2001 and 2005. The organization of the genes within these loci was deduced and built from the complete annotation of the genomic nucleotide sequences and contigs integrated in the IMGT nucleotide sequence database IMGT/LIGM-DB (7) from the European Nucleotide Archive (ENA) (13) and GenBank (14). IMGT genes and alleles are managed in the IMGT gene database IMGT/GENE-DB (8) and displayed in IMGT Repertoire (IMGT Web resources) and IMGT tools (http://www.imgt.org/IMGTposters/Poster-10th-Biocuration-Conference2017.pdf).

With the introduction of genome assemblies, which have become available in NCBI Assembly (15) and Ensembl (16), IMGT® developed a new approach and new concepts in order to decipher complete IG and TR loci. First, IMGT® defines conserved genes that flank the IG and TR loci, designated as 'IMGT bornes'. IMGT bornes are genes coding for proteins other than IG or TR, which are conserved among species. They are located either upstream of the first IG or TR gene (IMGT locus 5prime borne) or downstream of the last IG or TR gene (IMGT locus 3prime borne) of the IMGT locus. If the IMGT bornes are identified and are at most 10 kb away from the closest IG or TR genes, they are included in the locus genomic nucleotide sequences available through IMGT/LIGM-DB.
These IMGT bornes make it possible to set a standardized delimitation of the locus whatever the species, and they are helpful for comparative genomics. However, such conserved non IG or TR genes could not be systematically defined (n.d.) up to now, as for example for the IGH locus. In the absence of an IMGT borne, the limit of the locus is artificially set at 10 kb 5' upstream of the first IG or TR gene and at 10 kb 3' downstream of the last IG or TR gene. TRB is an example of a locus with delimited IMGT bornes and can be accessed on the page http://www.imgt.org/IMGTrepertoire/LocusGenes/bornes/bornesTRB.html.

IMGT/LIGM-DB provides standardized and detailed immunogenetics annotations for IG, TR and MH nucleotide sequences from human and other vertebrate species (7). IMGT/LIGM-DB includes sequences from different steps of IG and TR synthesis and therefore integrates: (i) large germline (non-rearranged) genomic DNA (gDNA) sequences, which may involve a complete locus from several hundred kilobases to one (or more) megabase(s); (ii) rearranged gDNA sequences resulting from the recombination of V and J genes or of V, D and J genes; and (iii) rearranged V-J-C and V-D-J-C complementary DNA (cDNA) sequences. Most of the IMGT/LIGM-DB nucleotide sequences come from ENA and from GenBank and keep the same accession numbers, to facilitate interoperability with the generalist nucleotide databases. More recently, with the extraction of IG and TR loci nucleotide sequences from NCBI genome assemblies, IMGT® created new IMGT/LIGM-DB accession numbers starting with 'IMGT' followed by 6 digits. IMGT/LIGM-DB sequences are annotated according to IMGT-ONTOLOGY concepts of the DESCRIPTION axiom (5, 6), with IMGT labels (http://www.imgt.org/ligmdb/label) and IMGT qualifiers (http://www.imgt.org/ligmdb/qualifier.action).
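The locus delimitation rule described above (bornes within 10 kb of the closest IG or TR gene are included; otherwise the limit is placed 10 kb beyond the first or last gene) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the IMGT implementation: the `Feature` class, the `delimit_locus` function, the forward-strand 1-based coordinates and the gene names are all hypothetical, and treating a borne located more than 10 kb away like a missing borne is our reading of the rule rather than something the text states.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

FLANK_BP = 10_000  # the 10 kb distance given in the text


@dataclass(frozen=True)
class Feature:
    """A gene on the forward strand, 1-based inclusive coordinates (assumption)."""
    name: str
    start: int
    end: int


def delimit_locus(
    genes: Sequence[Feature],
    borne_5prime: Optional[Feature] = None,
    borne_3prime: Optional[Feature] = None,
    seq_length: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (start, end) of the locus for a list of IG or TR genes."""
    if not genes:
        raise ValueError("at least one IG or TR gene is required")
    first = min(g.start for g in genes)  # start of the most 5' gene
    last = max(g.end for g in genes)     # end of the most 3' gene

    # 5' side: include the borne if it lies within 10 kb of the first gene,
    # otherwise fall back to an artificial limit 10 kb upstream.
    if borne_5prime is not None and first - borne_5prime.end <= FLANK_BP:
        start = borne_5prime.start
    else:
        start = first - FLANK_BP

    # 3' side: same rule, mirrored.
    if borne_3prime is not None and borne_3prime.start - last <= FLANK_BP:
        end = borne_3prime.end
    else:
        end = last + FLANK_BP

    # Never run past the ends of the underlying sequence.
    start = max(1, start)
    if seq_length is not None:
        end = min(seq_length, end)
    return start, end


if __name__ == "__main__":
    genes = [Feature("GENE_A", 50_000, 50_500), Feature("GENE_B", 120_000, 120_400)]
    near = Feature("BORNE_5P", 42_000, 45_000)  # hypothetical 5' borne, 5 kb away
    print(delimit_locus(genes, borne_5prime=near))  # (42000, 130400)
```

In this toy case the 5' borne is close enough to be included, while the missing 3' borne leads to the artificial 10 kb limit downstream of the last gene.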
In order to delimit and annotate a complete IG or TR locus extracted from genome assemblies, a specific IMGT label and a set of IMGT qualifiers have been created for its description (Table 1). The IMGT/LIGM-DB data are accessible via a user-friendly interface described previously (7). IMGT/LIGM-DB can be queried by accession number, by IMGT-ONTOLOGY concepts (IDENTIFICATION or keywords, CLASSIFICATION, DESCRIPTION or labels, OBTENTION), or by bibliographical references. For each nucleotide sequence, IMGT/LIGM-DB provides 'View details', which displays an IMGT/LIGM-DB entry according to nine topics: annotations, IMGT flat file, coding regions with protein translation, catalogue and external references, sequence in IMGT/LIGM-DB dump format, sequence in FASTA format, sequence with three reading frames, EMBL flat file, and a direct link to IMGT/V-QUEST (17). As of September 2021, IMGT/LIGM-DB contains 196,516 entries from 358 species, and 48,682 IG and TR nucleotide sequences are fully annotated. Weekly releases of IMGT/LIGM-DB flat files can be downloaded directly from the IMGT web site (http://www.imgt.org/download/LIGM-DB/) and from ENA (http://ftp.ebi.ac.uk/pub/databases/imgt/LIGM-DB/).

The curated IG and TR genes are entered and managed in IMGT/GENE-DB (8) with all IMGT identified alleles, which highlight the potentially high polymorphism of these genes. Each allele is characterized by its IMGT reference allele sequence defined for the coding label V-REGION (with gaps according to the IMGT numbering (18)), D-REGION, J-REGION and C-REGION (or C exons) (with gaps for C-DOMAIN according to the IMGT numbering (19)) of the V, D, J and C genes, respectively. An IMGT allele reference sequence is identified by its IMGT/LIGM-DB accession number, IMGT gene and allele names, species, allele functionality and IMGT label.
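The five identifiers of an IMGT allele reference sequence listed above can be read from a FASTA header with a few lines of Python. The sketch below is hedged: it assumes a pipe-separated header whose first five fields are accession number, gene and allele name, species, functionality and label, which should be checked against the files actually downloaded. The `AlleleRef` type, the `parse_header` function and the example header values are hypothetical. The locus is taken from the first three letters of the gene name, and the gene type is derived from the label rather than from the name, since a name such as IGHD is ambiguous on its own.

```python
import re
from typing import NamedTuple, Optional

LOCI = {"IGH", "IGK", "IGL", "TRA", "TRB", "TRD", "TRG"}  # the seven main loci
GENE_TYPE_BY_LABEL = {"V-REGION": "V", "D-REGION": "D", "J-REGION": "J", "C-REGION": "C"}
# Accession numbers created by IMGT for loci extracted from genome assemblies:
# 'IMGT' followed by 6 digits, as stated in the text.
IMGT_ACCESSION_RE = re.compile(r"IMGT\d{6}")


class AlleleRef(NamedTuple):
    accession: str
    gene: str
    allele: str
    species: str
    functionality: str
    label: str
    locus: str
    gene_type: Optional[str]
    from_assembly: bool


def parse_header(header: str) -> AlleleRef:
    """Parse the first five pipe-separated fields of a reference header (assumed layout)."""
    fields = [f.strip() for f in header.lstrip(">").split("|")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    accession, name, species, functionality, label = fields[:5]

    gene, sep, allele = name.partition("*")  # e.g. 'TRBV9-9*01' -> 'TRBV9-9', '01'
    if not sep or not allele:
        raise ValueError(f"no allele number in {name!r}")

    locus = gene[:3]
    if locus not in LOCI:
        raise ValueError(f"unrecognized locus in {name!r}")

    return AlleleRef(
        accession=accession,
        gene=gene,
        allele=allele,
        species=species,
        functionality=functionality,
        label=label,
        locus=locus,
        gene_type=GENE_TYPE_BY_LABEL.get(label),  # None for labels not listed above
        from_assembly=IMGT_ACCESSION_RE.fullmatch(accession) is not None,
    )


if __name__ == "__main__":
    # Hypothetical header, for illustration only.
    print(parse_header(">IMGT000123|TRBV9-9*01|Homo sapiens|F|V-REGION|1..10|10 nt"))
```

Keeping the accession check separate from the name parsing makes it easy to distinguish entries that share ENA or GenBank accession numbers from those created by IMGT from genome assemblies.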
IMGT allele reference sequences make up the IMGT reference directories that are used by IMGT sequence analysis tools and by IMGT databases and IMGT Web resources for sequence comparison. From the IMGT/GENE-DB Query page, searches can be performed by IMGT-ONTOLOGY concepts (IDENTIFICATION or keywords, LOCALIZATION, and CLASSIFICATION), by LOCALIZATION IN GENOME ASSEMBLIES or by IMGT/GENE-DB direct links. IMGT/GENE-DB provides full access to characterized genes and alleles, displaying an IMGT/GENE-DB entry according to six topics: IMGT gene name and definition, Chromosomal localization, IMGT reference alleles, Annotated IMGT/LIGM-DB cDNA and rearranged genomic DNA sequences, Annotated IMGT/3Dstructure-DB structures, and External links. The section 'LOCALIZATION IN GENOME ASSEMBLIES', created in 2015, provides the localizations of the genes and alleles, and IMGT labels, in the reference genome assemblies available at NCBI. For each gene, its orientation in the locus is mentioned, and the allele identified in the sequence of the assembly is indicated with its characteristics. The 'IMGT/GENE-DB direct links' allow querying [...]

[Table 1, partial] IMGT label IMGT-LOCUS-UNIT: gDNA of an immunoglobulin (IG) or T cell receptor (TR) IMGT locus unit from chromosome genomic assembly, that starts at the 5 prime (5') end of the most 5' IG or TR GENE-UNIT in the locus and ends at the 3 prime (3') [...]

With the development of new high throughput sequencing technologies for the analysis of IG and TR repertoires, new potential alleles are highlighted by inference from expressed repertoires, particularly in human. Inferred alleles are not systematically integrated within the IMGT databases, because the sequences are not mapped. However, IMGT® can accept inferred alleles if and only if they are validated by the Working Group (WG) Inferred Allele Review Committee (IARC), within the Adaptive Immune Receptor Repertoire (AIRR) community. IARC ensures that IMGT data quality requirements are met.
Nevertheless, reference sequences of inferred alleles are replaced by the corresponding germline DNA sequence once they are characterized (20).

An overview of IMGT® annotated data is compiled and knowledge pages are made available in the IMGT Web resources 'IMGT Repertoire' (http://imgt.org/IMGTrepertoire/), the global ImMunoGeneTics Web Resource for IG, TR and MH of human and other vertebrate species. IMGT Repertoire includes seven organized sections: Locus and genes, Proteins and alleles, 2D and 3D structures, Probes and RFLP, Taxonomy, Gene regulation and expression, Genes and clinical entities. Novel IMGT Repertoire (IG and TR) pages were created in the Locus and genes section, focusing on the 'Locus descriptions', including Locus bornes, Locus in genome assembly and Locus gene order. As of September 2021, IMGT Repertoire covers 80 species. For each gene analyzed, there are >200 different information fields available in IMGT databases and web pages. IMGT Repertoire therefore bridges the gap between curated data resulting from Axis I and IMGT databases and tools (Table 2). IMGT® has recently performed the biocuration of the IG and TR loci of several veterinary species, which are useful for biotechnological applications that can also be applied to human medicine (21-27). IMGT biocuration makes it possible to understand the gene characterization and the genomic organization of IG and TR, which provides a better understanding of the adaptive immune response.

The analysis of the expressed IG and TR repertoires has become an essential step for the study and the understanding of the adaptive response in normal (infectious diseases, vaccination) and pathological situations (autoimmune diseases, cancers), especially since the advent of high throughput sequencing (HTS) over a decade ago.
Basically, this analysis relies on the comparison of the expressed V-DOMAIN with the reference sequences of IG and TR genes and alleles. The dedicated and widely used IMGT tools for IG and TR V-DOMAIN nucleotide sequence analysis are IMGT/V-QUEST (17) and its high throughput version IMGT/HighV-QUEST (28, 29). The IMGT/V-QUEST reference directories used by both tools for sequence comparison are built from the IG and TR gene and allele data of the species managed in IMGT/GENE-DB and in the IMGT Web resources. They comprise one sequence per V-REGION, D-REGION and J-REGION of the functional, ORF and in-frame pseudogene V, D and J genes and alleles, respectively. V-REGION sequences are gapped according to the IMGT unique numbering (18). Table 3 summarizes the IMGT/V-QUEST reference directories per species and locus available for V-DOMAIN analysis. The classical functionalities of the IMGT/V-QUEST and IMGT/HighV-QUEST tools have been described previously (17, 28-30), and the main results deduced by the tools from alignments with the IMGT reference directories are listed in Table 4. It should be noted that the V-DOMAIN analysis based on the IMGT/V-QUEST directories has been extended with two new advanced functionalities: one related to antibody engineering, for the analysis and annotation of scFv (sequences comprising 2 IG or TR V-DOMAIN covalently linked by a linker) (30), and a second one related to clinical applications, with the identification of sequences that could be assigned to stereotyped subsets 2 and 8 of Chronic Lymphocytic Leukemia (CLL), which are associated with an unfavourable prognostic outcome (34, 35). The characterization of the IMGT clonotypes (AA) and the evaluation of profiles for clonal diversity and expression (36), performed by the statistics module of IMGT/HighV-QUEST, and the subsequent statistical analysis (37) also rely on the results deduced from the alignment with the IMGT/V-QUEST reference directory sets.
IMGT reference directory sets are used by other external tools dedicated to IG and TR analysis based on sequence comparison, such as IgBLAST (38) and MiXCR (39). The IMGT/V-QUEST reference directory sets are regularly enriched with the results of Axis I, whether through the integration of a new species or the upgrade of existing repertoires. Each update gives rise to a new IMGT/V-QUEST reference directory release (see http://www.imgt.org/IMGT_vquest/data_releases). Links to the IMGT/V-QUEST reference directory sets per species, locus and gene type are available in 'IMGT reference directory in FASTA format' (IG and TR) from http://www.imgt.org/vquest/refseqh.html#VQUEST and from the IMGT/V-QUEST Welcome page.

Considering the great complexity of the immune proteins, their interactions with the antigens and the large number of published sequences, their classification and detailed annotation are very difficult tasks, especially at the structural level. Therefore, a specialized 3D immune protein database was established to identify the genes and alleles encoding these proteins through alignment against the amino acid IMGT reference directory, provided by Axis I. Since 2001, IMGT/3Dstructure-DB (9) has provided IMGT annotations and contact analysis for structural data of immune proteins. Since 2008, amino acid (AA) sequences of mAb and fusion proteins for immune applications from the World Health Organization (WHO) International Nonproprietary Names (INN) programme (40, 41) have been incorporated in IMGT/2Dstructure-DB, a section of IMGT/3Dstructure-DB. To bring together information about therapeutic proteins and to facilitate their access, IMGT/mAb-DB was made available online in 2010. IMGT/mAb-DB extends 2D and 3D annotations with a unique resource on mAbs and relevant therapeutic metadata. Figure 2 provides a schematic representation of the whole procedure.
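For readers who reuse the reference directory sets in FASTA format in their own pipelines, as external tools do, the following Python sketch reads FASTA records, counts them per species, locus and label, and removes gap characters. It is an illustration under assumptions rather than an IMGT utility: the pipe-separated header layout (accession, gene and allele name, species, functionality, label), the use of dots for IMGT gaps, the function names and the toy records in `EXAMPLE` are all hypothetical and should be checked against the downloaded files.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ungap(sequence: str) -> str:
    """Remove gap characters (assumed to be dots) from a gapped sequence."""
    return sequence.replace(".", "")


def summarize(lines: Iterable[str]) -> Counter:
    """Count reference sequences per (species, locus, label)."""
    counts: Counter = Counter()
    for header, _seq in read_fasta(lines):
        fields = [f.strip() for f in header.split("|")]
        if len(fields) < 5:
            continue  # skip headers that do not follow the assumed layout
        name, species, label = fields[1], fields[2], fields[4]
        counts[(species, name[:3], label)] += 1  # locus = first three letters
    return counts


# Toy records with made-up accessions, names and sequences.
EXAMPLE = """\
>X00001|IGHV1-1*01|Homo sapiens|F|V-REGION
cag...gtg
cagctg
>X00002|IGHV1-1*02|Homo sapiens|F|V-REGION
cag...gtc
>X00003|IGHJ1*01|Homo sapiens|F|J-REGION
acttgg
"""

if __name__ == "__main__":
    for key, n in sorted(summarize(EXAMPLE.splitlines()).items()):
        print(key, n)
```

Counting per species, locus and label gives a quick consistency check after each new reference directory release, before the sequences are passed to a downstream aligner.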
The IMGT/3Dstructure-DB structural data are extracted from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) (42) and annotated according to the IMGT Scientific chart rules based on the IMGT-ONTOLOGY concepts (5, 6, 43). IMGT/3Dstructure-DB integrates the IMGT/DomainGapAlign tool (44), which aligns the AA sequences per domain, creates gaps according to the IMGT unique numbering and highlights differences with the closest reference genes and alleles found in the IMGT reference directory. 3D structure analysis includes chain annotation, paratope/epitope description of IG/antigen and TR/pMH complexes, and contact analysis.

The IMGT/2Dstructure-DB data include AA sequences of immune proteins, which are retrieved from the WHO-INN programme (41) and from the Kabat database (45). The AA sequences are analysed with the IMGT® standardized criteria of the IDENTIFICATION, DESCRIPTION, CLASSIFICATION and NUMEROTATION axioms (5, 6), and the V, C and G domain sequences are numbered according to the IMGT unique numbering (18, 19, 44). Amino acid sequences from the WHO-INN programme have been provided since 2008 (IMGT entry type INN). This programme provides, in biannual lists, names for pharmaceutical substances that are recognized worldwide. The IMGT INN data include mAb, fusion proteins for immune application (FPIA), composite proteins for clinical applications (CPCA) and related proteins of the immune system (RPI). The INN name, INN number, common name, commercial name, and Proposed and Recommended lists are available for each entry, along with the IMGT receptor description, the target and the molecule species. Recently, AA sequences of CAR-T (chimeric antigen receptor T cell) and TR, also from WHO-INN, were made available in IMGT/2Dstructure-DB, after IMGT experts translated the nucleotide sequences and analysed them according to standardized IMGT information on chains and domains.
IMGT/2Dstructure-DB and IMGT/3Dstructure-DB share the same interface, through which amino acid sequences and 3D structures of immunological proteins can be queried and analysed. Their algorithms have recently been revised and are now more robust and efficient. Around 100 new structures are automatically retrieved from PDB per month. As of September 2021, IMGT/3Dstructure-DB and IMGT/2Dstructure-DB contain 7,657 entries: 6,533 PDB, 788 INN and 336 KAB.

IMGT/mAb-DB provides a unique resource on mAbs and other therapeutic proteins. This database facilitates access to the therapeutic proteins present in IMGT/2Dstructure-DB and IMGT/3Dstructure-DB. The database is updated [...] The gene name of the target is linked to HGNC or VGNC pages, which assign standardized names and unique symbols to genes for human or vertebrate loci, respectively (11). Other therapeutic metadata such as 'Company', 'Clinical trials' and 'Authority decisions' are also accessible in the result table. The field of therapeutic monoclonal antibody engineering holds real promise in medicine (46-48). The rich, precise and standardized information available via IMGT/mAb-DB provides a unique and useful resource to the scientific community.

IMGT® provides the scientific community with a huge amount of knowledge and curated data in the field of immunogenetics, from genome to proteome, through IMGT databases, IMGT tools and IMGT Web resources, which represent >20 000 HTML pages. To our knowledge, the richness of the website is still unmatched in 2021. IMGT metadata in the IMGT databases, tools and Web resources are based on IMGT-ONTOLOGY, the first ontology in immunogenetics and immunoinformatics.
IMGT research and development rely on three main axes, which correspond to the deciphering of the IG and TR loci, genes and alleles in the genomes of jawed vertebrates (Axis I), the exploration of the expressed IG and TR repertoires (Axis II), and the analysis of the 2D and 3D structures of the adaptive immune proteins (Axis III). We focused on the most recent data integrated in IMGT/LIGM-DB and IMGT/GENE-DB, on the extraction of the complete IG and TR loci from genome assemblies and on the creation of terminology and new concepts for their annotation. A new section in IMGT/GENE-DB was created to provide links between genes and alleles of the IG and TR loci and their localization in genome assemblies (for interoperability with genome sites). IMGT tools and IMGT reference directories for the analysis of expressed IG and TR repertoires are regularly updated. Given the importance of chemical interactions in antibody specificity, affinity and half-life, IMGT/2Dstructure-DB, IMGT/3Dstructure-DB and IMGT/mAb-DB provide an integrated and standardized approach for the description of new engineered antibody formats. This approach can be used for the construction and expression of engineered antibodies towards targeted and customized therapy in the context of personalized medicine. The three IMGT axes are heavily interconnected and there is a constant flow of information among them.

IMGT® continues its standardization efforts and is improving its application of the FAIR principles (49) in order to enhance the quality, findability, accessibility, interoperability and reusability of IMGT data and metadata. To be Findable, IMGT databases use unique and persistent identifiers (IMGT/LIGM-DB, IMGT/2Dstructure-DB, IMGT/3Dstructure-DB and IMGT/mAb-DB) and are described with rich metadata based on IMGT-ONTOLOGY and the IMGT Scientific chart rules. To be Accessible, IMGT data and metadata are freely available for academics.
In addition, IMGT/GENE-DB can be dynamically queried through HTML direct links. To be Interoperable and Reusable, IMGT data and metadata have links to their sources and related databases, and all IMGT sequence data are available in FASTA format, which is accepted by many bioinformatics programs, and are described with their relevant attributes. Furthermore, the IMGT download sections for the IMGT reference directories make it possible to track new releases and facilitate the extraction and the reuse of the data by external tools.

IMGT® is freely available online for academic and nonprofit use at http://www.imgt.org/. All the databases and tools referred to in this article are accessible from the IMGT® webpage.

Supplementary Data are available at NAR Online.

We are very grateful to Marie-Paule Lefranc, IMGT® founder in 1989, for her great expertise, daily assistance and continuous contribution to the research and development axes at IMGT®. We thank Gérard Lefranc and all members of the IMGT® team for their expertise and constant motivation. IMGT® is a registered trademark of CNRS. IMGT® is a member of the Confederation of Laboratories for Artificial Intelligence Research in Europe (CLAIRE), https://claire-ai.org/network/. IMGT® is a member of the International Medical Informatics Association (IMIA), https://imia-medinfo.org/wp/, and a member of the Global Alliance for Genomics and Health (GA4GH), https://www.ga4gh.org/.

IMGT® is currently supported by the Centre National de la Recherche Scientifique (CNRS), the Ministère de l'Enseignement Supérieur, de la Recherche et de l'Innovation (MESRI), the University of Montpellier, and the French Infrastructure Institut Français de Bioinformatique (IFB) ANR-11-INBS-0013.
IMGT ® is Immunoglobulin and T cell receptor genes: IMGT® and the birth and rise of immunoinformatics The Immunoglobulin FactsBook The T Cell Receptor FactsBook IMGT ® , the international ImMunoGeneTics information system® 25 years on IMGT-Kaleidoscope, the formal IMGT-ONTOLOGY paradigm the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes IMGT/3Dstructure-DB and IMGT/DomainGapAlign: a database and a tool for immunoglobulins or antibodies, T cell receptors, MHC, IgSF and MhcSF Immunoglobulins or Antibodies: IMGT ® Bridging Genes, Structures and Functions Genenames.org: the HGNC and VGNC resources in 2021 WHO-IUIS Nomenclature Subcommittee for immunoglobulins and T cell receptors report The European Nucleotide Archive in 2019 Assembly: a resource for assembled genomes at NCBI QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains Inferred Allelic Variants of Immunoglobulin Receptor Genes: A System for their Evaluation, Documentation, and Naming 2020) IMGT ® Biocuration and Comparative Analysis of Bos taurus and Ovis aries TRA/TRD Loci IMGT ® Biocuration and Comparative Study of the T cell Receptor Beta Locus of Veterinary Species Based on The T Cell Receptor (TRB) Locus in Tursiops truncatus: From sequence to structure of the Alpha/Beta Heterodimer in the Human/Dolphin Comparison Genomic analysis of a second rainbow trout line (Arlee) leads to an extended description of the IGH VDJ gene repertoire Topology and expressed repertoire of the Felis catus T cell receptor loci Standardized IMGT ® Nomenclature of Salmonidae IGH Genes, the Paradigm of Atlantic Salmon 
and Rainbow Trout: from The T cell receptor (TRA) locus in the rabbit (Oryctolagus cuniculus): Genomic features and consequences for invariant T cells IMGT/HighV-Quest: the IMGT ® web portal for immunoglobulin (IG) or antibody and T cell receptor (TR) analysis from NGS high throughput and deep sequencing IMGT ® tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS IG and TR single chain fragment variable (scFv) sequence analysis: a new advanced functionality of IMGT/V-QUEST and IMGT/HighV-QUEST IMGT standardized criteria for statistical analysis of immunoglobulin V-REGION amino acid properties IMGT/JunctionAnalysis: the first tool for the analysis of the immunoglobulin and T cell receptor complex Immunogenetics Sequence Annotation: the Strategy of IMGT based on IMGT-ONTOLOGY. Stud Stereotyped B-cell receptors in one-third of chronic lymphocytic leukemia: a molecular classification with implications for targeted therapies Higher-order connections between stereotyped subsets: implications for improved patient classification in CLL IMGT/HighV QUEST paradigm for T cell receptor IMGT clonotype diversity and next generation repertoire immunoprofiling IMGT/StatClonotype for Pairwise Evaluation and Visualization of NGS IG and TR IMGT Clonotype (AA) Diversity or Expression from IMGT/HighV-QUEST IgBLAST: an immunoglobulin variable domain sequence analysis tool MiXCR: software for comprehensive adaptive immunity profiling Antibody nomenclature: from IMGT-ONTOLOGY to INN definition World Health Organization (WHO) (2016) In: International Nonproprietary Names (INN) for biological and biotechnological substances (a review). INN Working Document 05.179. 
World Health Organization The protein data bank T cell receptor/peptide/MHC molecular characterization and standardized pMHC contact sites in IMGT/3Dstructure-DB IMGT/DomainGapAlign: the IMGT ® tool for the analysis of IG, TR, MH, IgSF, and MhSF domain amino acid polymorphism Kabat database and its applications: 30 years after the first variability plot HER2 signaling and resistance to the anti-EGFR monoclonal antibody cetuximab: a further step toward personalized medicine for patients with colorectal cancer Advances in antibody engineering for rheumatic diseases Developments in therapy with monoclonal antibodies and related proteins The FAIR Guiding Principles for scientific data management and stewardship