key: cord-0865047-eqlp5on2
authors: Chakraborty, Chiranjib; George Priya Doss, C.; Zhu, Hailong; Agoramoorthy, Govindasamy
title: Rising Strengths Hong Kong SAR in Bioinformatics
date: 2016-03-09
journal: Interdiscip Sci
DOI: 10.1007/s12539-016-0147-x
sha: 0be2160c6cfc01fe13a42a9a601103694ce58a5f
doc_id: 865047
cord_uid: eqlp5on2

Hong Kong’s bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we consider the educational measures, the landmark research activities, the achievements of bioinformatics companies and the role of the Hong Kong government in establishing bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

Concerning Hong Kong's economic development, Tony Fu-Lai Yu, an economics scholar from the University of New South Wales, Australia, once wrote: 'From barren island, Hong Kong changed into the mart of East Asia'. What puzzled economists is that Hong Kong is only a small region, with approximately 6 million people living in an area of around 1064 km² [1]. Today, the population of this city has grown to more than 7 million [2]. Hong Kong, a special administrative region (SAR) of the People's Republic of China (PRC), is situated in the southern part of China and is enclosed by the Pearl River Delta and the South China Sea (http://www.censtatd.gov.hk/FileManager/EN/Content_810/geog.pdf).
Hong Kong is one of the world's leading financial and trading centres and is also well known as a free trade zone with low taxation. The area came under Chinese control in 221 BC, during the Qin dynasty [3], and remained so until after the Anglo-Chinese War of 1839-1842 (also known as the First Opium War), when it came under British rule. The territory of Hong Kong was initially restricted to Hong Kong Island, and the British then extended its boundaries in two phases: to the Kowloon Peninsula in 1860 and to the New Territories in 1898. The area was occupied by the Japanese in 1941 after the Battle of Hong Kong; the occupation ended after 3 years and 8 months [6], when Japan surrendered at the end of the Second World War in 1945. Hong Kong underwent its first peaceful and democratic transfer of power in 1997, when China regained sovereignty [4, 5]. Hong Kong has now become a major global financial centre and one of East Asia's 'Economic Tigers'. The territory is economically and technically in an excellent position to assist developing nations worldwide [7]. During the past two decades, the Hong Kong SAR government has been actively pursuing growth in high-tech industries such as telecommunications, biotechnology, information technology and electronics to help develop this economic leader of Asia into a global technological giant. In Hong Kong, computational biology has adopted bioinformatics as a research and development (R&D) area in its own right. Researchers in the life sciences, biochemistry, molecular biology and computer science have all used bioinformatics as a tool to solve their particular research problems.
In this paper, we discuss how the economic growth of Hong Kong and its population age structure have helped to establish computational biology in the territory and to lay its foundations. To this end, we identify the learning activities, landmark research, activities of bioinformatics companies and role of the government in Hong Kong in establishing computational biology as a strength. We have tried to be comprehensive and apologise to any researcher or research group that has not been included in this report.

Age Structure Assist the Establishment of Computational Biology

Hong Kong has a free-market economy which is highly dependent on international trade and finance. The USA is Hong Kong's second-largest export partner after mainland China, which is an important factor in the Hong Kong economy (http://www.tid.gov.hk/english/trade_relations/mainland/trade.html). During the past decade, the GDP of Hong Kong has grown steadily, with growth now at about 6.0 % (Fig. 1). The territory has one of the highest per capita incomes in the world. This dynamic economic environment provides an added advantage in high-tech R&D areas such as computer science and computational biology [8, 9]. The economy of any nation depends on its age structure, which presents various challenges: the age structure of a population is directly related to that society's productivity and economy [10]. Hong Kong has an advantage in that the percentage of its population that is of working age is one of the highest in the world. The working-age population (15-64 years) accounts for about 74.8 % of the total, and the median age in the territory is 43.4 years. The older age group (65 years and over) comprises about 13.5 % of Hong Kong's population (Fig. 1). It has been observed that a higher proportion of older people leads to a critical shortage in manpower and tends to undermine productivity.
The healthcare costs of an older population are noted to be economically unproductive and are also a burden for society [10]. Factors such as a highly competitive business environment, the knowledge-based economy, the best-performing economy in Asia in terms of investment and the vibrancy of the working-age group assist Hong Kong's rapid growth into a strong economic force in East Asia and its emergence as a global technological giant able to emphasise its pioneering success in computational biology.

The initial step in establishing bioinformatics was the commencement of the Hong Kong government's innovation and technology development programme in 1998-1999, which provided funding of about $5 billion through the Innovation and Technology Fund (ITF). Four types of project are financed through the ITF programmes: innovation and technology support, university-industry collaboration, general support and small entrepreneur research assistance, all involving innovation or technology that should modernise industry (http://www.gov.hk/en/about/abouthk/factsheets/docs/technology.pdf). In the late 1980s, especially 1988, work on bioinformatics in Hong Kong was begun by a group of researchers in the life sciences. Two projects are of particular note: first, the use of microcomputers in histopathology [11] and second, the development of the commercial Microbact 24E (MB24E) microsystem for the identification of common clinical isolates of Enterobacteriaceae and non-fermenting Gram-negative bacilli [12]. In 1998, two more bioinformatics papers by different research groups were published in the journal Bioinformatics. Smith and Xue, from the Biochemistry Department of the Hong Kong University of Science and Technology (HKUST), developed a method for summarising and presenting the information contained in a set of aligned sequences, so that patterns within the sequences can be identified and represented in a more accurate and graphical form [13]. Chau et al.
[14], from the Department of Applied Biology and Chemical Technology at the Hong Kong Polytechnic University (PolyU), developed a software package entitled 'TLCQA' for low-cost analysis of thin-layer chromatography images. Computational biology had thus already spread to the different disciplines within Hong Kong.

In both programmes, students learn fundamental subjects such as an introduction to programming, an introduction to molecular biology and genetics, biocomputing, and theories and algorithms in bioinformatics. The M.Sc. programme includes some additional mandatory courses, such as systems biology and genome informatics, and a research project carrying 3 credits. Two other courses related to computational biology are available in the CUHK's undergraduate programme leading to a Bachelor of Engineering degree in computer science: 'Algorithms for Bioinformatics' (course code: CSCI3220) and 'Topics in Bioinformatics and Computational Biology' (course code: CSCI5050). Course CSCI3220 covers sequence alignment and multiple sequence alignment using different approaches, such as the Markov property, recursive functions, dynamic programming, FASTA and BLAST, as well as typical clustering algorithms for microarray analysis. Course CSCI5050 covers molecular biology, data mining, data processing, sequence alignment and biological networks. The HKUST does not have a separate programme for computational biology. However, there is one course available in the postgraduate programme of the Department of Electronic and Computer Engineering, 'Introduction to Bioinformatics Algorithms' (course code: ELEC 5810), an introductory course on computational biology at the molecular level. The CityU, founded in 1984 as the City Polytechnic of Hong Kong, has been a fully accredited university since 1994.
The Department of Computer Science offers a course entitled 'Computational Biology and Bioinformatics' (course code: CS4465), which aims to introduce students to the concepts and techniques used in computational biology and to develop the practical skills required to solve problems in this domain.

Research activity in the area of computational biology is increasing day by day in Hong Kong (Fig. 3). Computational biology scientists in Hong Kong are currently inclined towards the areas of database development, sequence analysis and genome analysis. However, other prominent areas of computational biology, such as structural bioinformatics, protein networking, gene networking, drug development bioinformatics, systems biology, algorithm development and bioinformatics tools, are also included in the publication counts shown in Fig. 4a, which give a rough indication of the number of publications originating from Hong Kong in those areas. Using the same keywords, we searched for publications in four spans of 4 years (1997-2000, 2001-2004, 2005-2008 and 2009-2012) and found a significant increase in the number of published papers originating in Hong Kong on different areas of bioinformatics (Fig. 4b). The number of publications provides a rough indication of the expansion in the field. Note, however, that some bioinformatics publications cannot be retrieved using these keywords. Genome analysis and sequence analysis are the main research priorities in Hong Kong, as is clear from Fig. 4.

Hong Kong scientists have also been involved in the development of various bioinformatics tools.
Some advanced tools include PriVar, a tool for analysing next-generation sequencing data, mutations and linkage analysis [20]; COPE, a tool for genome assembly using k-mer frequencies [21]; SOAP3, a rapid graphics processing unit (GPU)-based parallel alignment tool [22]; FetalQuant, a tool that uses maximum likelihood to estimate the foetal DNA concentration from maternal plasma DNA [23]; and GBOOST, a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies [24]. Scientists from the Hong Kong Bioinformatics Centre are also working in this area. The most essential tools developed by these scientists include ABMapper (a tool for multi-location searching and splice-junction mapping) [25], ViralFusionSeq (a tool that uses soft-clipping, read-pair analysis and targeted de novo assembly to discover and annotate human viral integration events and reconstruct fusion transcripts) [26] and Alns (a searchable and filterable sequence alignment format) [27]. Further examples of Web servers or tools developed and maintained in Hong Kong are given in Table 3.

Some examples of databases are YY1TargetDB (a Web-based YY1 target loci database) [28], BSRD (a comprehensive bacterial sRNAs database) [29] and PcarnBase (a searchable database for the brain coral Platygyra carnosus) [30]. Further examples of biological databases developed and maintained in Hong Kong are given in Table 4.

Respiratory diseases are one of the foremost health problems in Hong Kong and a major burden [31]. Respiratory diseases and related viruses are thus important research areas in the territory, and much landmark research has been published in well-known journals such as Nature and Science. Computational biology researchers are currently investigating the H5N1 virus, strains of which cause H5N1 avian influenza and may be transferable to humans [32]. This virus is highly pathogenic and has caused a pandemic 'bird flu' incident in the Hong Kong SAR.
Several researchers have reported the emergence of multiple genotypes of the H5N1 virus in terrestrial poultry, leading to an outbreak of avian influenza in chickens in retail markets in Hong Kong, and the molecular changes in the virus associated with this event [33]. One study published in Nature investigated the long-term evolution and dynamics of transmission of swine influenza A virus (SwIV) using a data set of more than 650 SwIV isolates and more than 800 swine sera from 12 years of systematic monitoring in Hong Kong, including the H1N1/2009 virus that caused a human pandemic [34]. Another study characterised a reassortant progeny of H1N1/2009 with swine viruses [35]. A study published in Nature on the origins and evolutionary genomics of swine-origin H1N1 influenza carried out phylogenetic analysis of the gaps in genetic surveillance and applied evolutionary analysis to estimate the timescale of the origins, finding that a reassortment of swine influenza lineages may have occurred years before the virus appeared in humans [36].

[Table fragment; caption and preceding rows not recovered]
MicroRNA-218: miR-218 is related to cell cycle progression and apoptosis in colon cancer; it slows cell cycle progression and promotes apoptosis [52]
SNP detection: rapid and precise SNP detection algorithm that helps to analyse next-generation sequencing data [43]
SNPs and CNVs at genome-wide scale: study identified 79 genes disrupted by CNVs in affected individuals and recognised a de novo DKK4 duplication [53]
CD14 gene polymorphism: study explored the relationship between periodontitis and single polymorphic sites in two genes, DEFB1 and CD14 [54]
Pathogen
Salmonella spp.: study explored the occurrence and antimicrobial resistance of Salmonella in meat [55]
Mycobacterium tuberculosis: rapid detection of mycobacteria and rapid detection of drug resistance [56]
Rat noroviruses: whole genome sequences [57]
Enterobacter cloacae subsp. cloacae strain ENHKU01: whole genome sequence [58]
HIV-1 CRF07_BC variants: detection of drug-resistant mutations [59]
Bacillus macauensis ZFHKF-1: draft genome sequence [60]
Influenza B virus: structural basis for RNA binding and homo-oligomer formation [61]
Human coronavirus NL63: study deals with disease spectrum and genetic diversity [62]

Scientists in Hong Kong are actively collaborating with those in other countries such as the USA, UK, Singapore, Japan, Taiwan and India, and some important research has resulted. One study, a collaboration between researchers in the USA, Singapore and Hong Kong, deals with a unique data set arising from surveillance of swine influenza at a Hong Kong abattoir from 1998 to 2010 and may advance understanding of the prevalence of influenza and decrease the occurrence of influenza in Hong Kong [37]. Another study, a collaboration between US and Hong Kong scientists, aimed to understand the dissimilarity of two new Dehalococcoides mccartyi strains. In this study, Lee et al. [38] performed a comparative genomics analysis via a microarray and concluded that the observed functional incongruence between the activity and core genome phylogenies of D. mccartyi strains is probably caused by a horizontal shift in significant reductive dehalogenase- [39, 40]. The authors of this paper are themselves performing collaborative research. Three of the authors (George, Chakraborty and Zhu) have jointly published research findings on the effects of deleterious non-synonymous SNPs on the binding adaptability of flavopiridol with cyclin-dependent kinase 7 (CDK7), a cell-cycle regulatory protein [41].
Another example of collaboration between India and Hong Kong is this review paper concerning computational biology in the Hong Kong SAR; bioinformatics scientists from both countries have critically analysed the activity in and the status of computational biology in the territory. Eminent Hong Kong researchers in the field of computational biology include Professor Tsui Kwok Wing Stephen, one of the authors of this review and the director of the Hong Kong Bioinformatics Centre who was a member of the International HapMap Consortium developing the haplotype map of the human genome; two papers on that work were published in Nature in 2005 and 2007 [16, 17] . Professor Tsui's group worked on genomic-sequence variations and the epidemiology of severe acute respiratory syndrome (SARS) [42] . Dr. Jun Wang from the Department of Biochemistry at the HKU is working on computational and transcriptional genomics, structural bioinformatics, SNPs and copy number variation (CNV). He developed an SNP detection algorithm for next-generation sequencing data [43] and also worked on a genome-wide association study on alleles in the FGFR2 gene that is associated with risk of breast cancer [44] . Ascertaining the dynamic nature of gene regulation is a significant challenge in systems biology, an area of research interest in Hong Kong. The corresponding author of this review is working on reconstructing dynamic gene regulatory networks for human cancer, to unravel the dynamic mechanism of cancer development [45] . Dr. Yip Yuk Lap Kevin from the School of Life Sciences at CUHK is researching the use of computational methods to study biological and medical phenomena and networks, and has published work in the journals Nature and Bioinformatics [46, 47] . 
Other eminent Hong Kong researchers have been working in the field of computational biology for decades; Table 5 outlines their research activities to provide a bird's-eye view. We have also attempted to determine the proportion of total research in Hong Kong that is related to computational biology, finding that publications in the area of bioinformatics comprise about 9 % of total publications (Fig. 5).

Hong Kong relies on imported medical devices to satisfy the territory's rising demand for advanced health care. A major focus of the Hong Kong SAR government, therefore, is assisting companies to fulfil the need for medical products. Recent trends in Hong Kong's population have shown a shift of disease types towards malignant neoplasms, heart disease, pneumonia, cerebrovascular disease, etc. There is a resulting increase in the demand for high-tech products and services that use biotechnology and bioinformatics tools. In response to the outbreak of SARS, many biotechnological companies were established to develop diagnostic kits. One such project involves the development of a biochip by Dr. Yu Cheung-Hoi Albert of the HKUST, which was highlighted in Science magazine [48]. This 'lab-on-a-chip' rapidly diagnoses emerging pathogens, including influenza viruses; Yu subsequently established the company Hai Kang Life to manufacture these DNA chips. Cluster Technology Limited (ClusterTech) is a computing technology company in Hong Kong which provides solutions for bioinformatics research that combine high-performance computing and cloud computing technology; several other companies have also been established in this area.
To assist with world-class infrastructure for such companies, the Hong Kong Science and Technology Parks Corporation (HKSTPC) was established by the Hong Kong SAR government in May 2001 to provide core competency in manufacturing technology, biotechnology, information technology (including computational biology), environmental technology, management systems, etc. The HKSTPC manages three industrial estates and mentors technology-based companies through its incubation programme which assists with business development for potential new entrepreneurs. The Hong Kong SAR government plays a very supportive role in promoting innovation and the development of technology. The government is extremely keen to create an environment which promotes such innovation and development and demonstrates a particular interest in biotechnology, thus fostering favourable conditions for computational biologists. The government supports the innovation and technology support, university-industry collaboration, general support and small entrepreneur research assistance programmes. At the beginning of 2012, for example, 2746 projects were supported from a fund of HK$6.4 billion, with many of the funded projects being related to biotechnology (around 10 %) and information technology (around 20 %), jointly comprising 30 % of the total funding in the territory. The Hong Kong SAR government generally supports fundamental research, applied R&D, technology transfer and technological entrepreneurship, through which it fosters a culture of innovation and a technological environment in the territory. In the past few years, Hong Kong has seen tremendous growth in bioinformatics research, which has been extended to different fields of biology. However, compared with other developed countries around the world, Hong Kong still has some way to go in this field. 
Table 5 Some well-known researchers, their affiliations and areas of research interest (column heading: Prominent scientists/researcher name)

Research in the life sciences with the aid of computational biology is occurring on a large scale in Hong Kong and is still gaining impetus. Collaborative research activities between bioinformaticians and laboratory biologists at both a national and an international level are also increasing rapidly. Such collaborative research groups analyse biological data ranging from genes to proteins, attempt to solve important biological questions together with laboratory scientists, make various predictions and validate different hypotheses. Despite this positive outlook, however, several challenges exist. More national centres for bioinformation, similar to the National Center for Biotechnology Information (NCBI) in the USA, are needed to provide a central depository and information source for biological data and tools.

During the past few years, development of computational biology has been initiated in Hong Kong through various activities. However, many challenges lie ahead for computational biologists and the Hong Kong SAR government if the territory is to be established as a globally important bioinformatics research centre. The government needs to act to encourage more research projects, teaching modules and conferences and the provision of research grants and support to attract larger numbers to the field of computational biology. Research must be sustained as a key area of investment to improve efficiency and competitiveness in meeting Hong Kong's needs, such as in the medical sector. To identify and support the existing researchers in the field of computational biology, more professional societies and journals indexed in the Science Citation Index should be initiated to drive Hong Kong computational biology towards new heights. This will benefit both national scientists and those in neighbouring Asian countries.
Finally, venture capital would be extremely beneficial in the development of bioinformatics companies to assist Hong Kong to become a world-famous centre of computational biology.

Entrepreneurship and the economic development of Hong Kong
The top 10 of everything 2007. Hamlyn, London
The dynasties of China: a history
The lion's share: a short history of British imperialism
Hong Kong under Chinese rule: the economic and political implications of reversion
Hong Kong and the cold war: Anglo-American relations 1949-1957
Improbable art: the creative economy and sustainable cluster development in a Hong Kong industrial district
The Hong Kong advantage
A tale of two cities: factor accumulation and technical change in Hong Kong and Singapore
Population aging and productivity in Asian countries. Japan, Asian Productivity Organization
Use of microcomputer for histopathology: system using IBM PC and dBaseIII
Evaluation of the Microbact-24E bacterial identification system
A major component approach to presenting consensus sequences
TLCQA: quantitative study of thin-layer chromatography
Bioinformatics research in the Asia Pacific: a 2007 update
A haplotype map of the human genome
A second generation human haplotype map of over 3.1 million SNPs
The International Cancer Genome Consortium (2010) International network of cancer genome projects
A mutation in Ihh that causes digit abnormalities alters its signalling capacity and range
PriVar: a toolkit for prioritizing SNVs and indels from next-generation sequencing data
COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly
SOAP3: ultra-fast GPU-based parallel alignment tool for short reads
FetalQuant: deducing fractional fetal DNA concentration from massively parallel sequencing of DNA in maternal plasma
GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies
ABMapper: a suffix array-based tool for multi-location searching and splice-junction mapping
ViralFusionSeq: accurately discover viral integration events and reconstruct fusion transcripts at single-base resolution
Alns: a new searchable and filterable sequence alignment format
YY1TargetDB: an integral information resource for Yin Yang 1 target loci
BSRD: a repository for bacterial small regulatory RNA
PcarnBase: development of a transcriptomic database for the brain coral Platygyra carnosus
The burden of lung disease in Hong Kong: a report from the Hong Kong Thoracic Society
Virology: bird flu in mammals
Emergence of multiple genotypes of H5N1 avian influenza viruses in Hong Kong SAR
Long-term evolution and transmission dynamics of swine influenza A virus
Reassortment of pandemic H1N1/2009 influenza A virus in swine
Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic
Inferring patterns of influenza transmission in swine from multiple streams of surveillance data
Isolation of two new Dehalococcoides mccartyi strains with dissimilar dechlorination functions and their characterization by comparative genomics via microarray analysis
Permanent genetic resources added to
Permanent genetic resources added to Molecular Ecology Resources Database
Extrapolating the effect of deleterious nsSNPs in the binding adaptability of flavopiridol with CDK7 protein: a molecular dynamics approach
Coronavirus genomic-sequence variations and the epidemiology of the severe acute respiratory syndrome
A fast and accurate SNP detection algorithm for next-generation sequencing data
A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer
Reconstructing dynamic gene regulatory networks from sample-based transcriptional data
Architecture of the human regulatory network derived from ENCODE data
Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions
Biotechnology. Lab-on-a-chip maker looks to put Hong Kong on biotech map
Genome-wide association study in a Chinese population identifies a susceptibility locus for type 2 diabetes at 7q32 near PAX4
Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies
Meta-analysis followed by replication identifies loci in or near CDKN1B, TET3, CD80, DRAM1, and ARID5B as associated with systemic lupus erythematosus in Asians
MicroRNA-218 inhibits cell cycle progression and promotes apoptosis in colon cancer by downregulating BMI1 polycomb ring finger oncogene
Genome-wide copy number variation study in anorectal malformations
Clinical application of human β-defensin and CD14 gene polymorphism in evaluating the status of chronic inflammation
First detection of oqxAB in Salmonella spp. isolated from food
Rapid identification of mycobacteria and rapid detection of drug resistance in Mycobacterium tuberculosis in cultured isolates and in respiratory specimens
Complete genome sequences of novel rat noroviruses in Hong Kong
Complete genome sequence of the endophytic Enterobacter cloacae subsp. cloacae strain ENHKU01
Identification of drug resistant mutations in HIV-1 CRF07_BC variants selected by nevirapine in vitro
Genome of Bacillus macauensis ZFHKF-1, a long-chain-forming bacterium
Structural basis for RNA binding and homo-oligomer formation by influenza B virus nucleoprotein
Human coronavirus NL63 in children: epidemiology, disease spectrum, and genetic diversity
GWAS3D: detecting human regulatory variants by integrative analysis of genome-wide associations, chromosome interactions and histone modifications
IPGWAS: an integrated pipeline for rational quality control and association analysis of genome-wide genetic studies
IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
SEQanswers: an open access community for collaboratively decoding genomes
IGG3: a tool to rapidly integrate large genotype datasets for whole-genome imputation and individual-level meta-analysis
EpiRegNet: constructing epigenetic regulatory network from high throughput gene expression data for humans
RNASAlign: RNA structural alignment system
FastPval: a fast and memory efficient program to calculate very low P-values from empirical distribution
DSHIFT: a web server for predicting DNA chemical shifts
mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines
RedoxDB: a curated database for experimentally verified protein oxidative modification
GWASdb: a database for human genetic variants identified by genome-wide association studies
Automated identification of medically important bacteria by 16S rRNA gene sequencing using a novel comprehensive database, 16SpathDB
OpenADAM: an open source genome-wide association data management system for Affymetrix SNP arrays
T3DB: an integrated database for bacterial type III secretion system
An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database)
SOX9 induces and maintains neural stem cells
SOX9 governs differentiation stage-specific gene expression in growth plate chondrocytes via direct concomitant transactivation and repression
Genome-wide association study in Asian populations identifies variants in ETS1 and WDFY4 associated with systemic lupus erythematosus
ITGAM is associated with disease susceptibility and renal nephritis of systemic lupus erythematosus in Hong Kong Chinese and Thai
MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample
Structural alignment of RNA with triple helix structure
General techniques for comparing unrooted evolutionary trees
A decomposition theorem for maximum weight bipartite matchings with applications to evolutionary trees
The NGS WikiBook: a dynamic collaborative online training effort with long-term sustainability
Complete genome sequence of Bacillus subtilis strain QB928, a strain widely used in B. subtilis genetic studies
Bird flu outbreak prediction via satellite tracking
Transfer learning in heterogeneous collaborative filtering domains
Attenuation of transcriptional bursting in mRNA transport
Stripe formation in bacterial systems with density-suppressed motility
Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
SNP and gene networks construction and analysis from classification of copy number variations data
Error margin analysis for feature gene extraction
Evolution- and structure-based computational strategy reveals the impact of deleterious missense mutations on MODY 2 (maturity-onset diabetes of the young
Computing the protein binding sites
A highly accurate heuristic algorithm for the haplotype assembly problem

Acknowledgments
This work was supported by the Research Grants Council of Hong Kong (212111 and 212631), Faculty Research Grant (12-13/061) and partially supported by the National Natural Science Foundation of China (61134013 and 91029301). The authors take this opportunity to thank the management of Galgotias University and VIT University for providing the facilities and encouragement to carry out this work.

Conflict of interest
The authors declare no conflicts of interest.