The human genetic epidemiology of COVID-19

Niemi, Mari E. K.; Daly, Mark J.; Ganna, Andrea

doi:10.1038/s41576-022-00478-5

Review Article
Published: 02 May 2022

Nature Reviews Genetics volume 23, pages 533–546 (2022)

Abstract

Human genetics can inform the biology and epidemiology of coronavirus disease 2019 (COVID-19) by pinpointing causal mechanisms that explain why some individuals become more severely affected by the disease upon infection by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus. Large-scale genetic association studies, encompassing both rare and common genetic variants, have used different study designs and multiple disease phenotype definitions to identify several genomic regions associated with COVID-19. Along with a multitude of follow-up studies, these findings have increased our understanding of disease aetiology and provided routes for management of COVID-19. Important emergent opportunities include the clinical translatability of genetic risk prediction, the repurposing of existing drugs, exploration of variable host effects of different viral strains, study of inter-individual variability in vaccination response and understanding the long-term consequences of SARS-CoV-2 infection. Beyond the current pandemic, these transferrable opportunities are likely to affect the study of many infectious diseases.

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus emerged at the end of 2019 and spread rapidly across the world, with the WHO announcing a global pandemic on 11 March 2020. This new betacoronavirus had not been seen before, but it is related to the severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) coronaviruses¹. We now know that SARS-CoV-2 uses the human ACE2 receptor for viral entry², initially infecting and replicating in epithelial cells in the nasopharynx and subsequently gaining access to the distal alveolar space^3,4. The virus is recognized by immune cells through pattern-recognition receptors, prominently by members of the Toll-like receptor group such as TLR3 and TLR7, which promote the synthesis of type I interferons^5,6,7, and by cytoplasmic RNA sensors retinoic acid-inducible gene I (RIGI; also known as DDX58) and interferon-induced helicase C domain-containing protein 1 (IFIH1; also known as MDA5) inducing type I/III interferon responses^8,9. Secreted type I interferons signal via interferon receptors (IFNARs) to switch on Janus kinase 1 (JAK1) and tyrosine kinase 2 (TYK2) and, consequently, promote the expression of interferon‐stimulated genes such as oligoadenylate synthetase 1 (OAS1), OAS2 and OAS3 (ref.¹⁰). Severe forms of coronavirus disease 2019 (COVID-19) involve a dysregulation of the immune response that results in insufficient or delayed type I interferon response^11,12. Eventually, sustained hyperinflammation results in increased immune infiltration in the lungs, reduction in alveolar lacunar space, cell death by apoptosis and lung fibrosis^13,14.

COVID-19 manifests with a wide range of symptoms and degrees of severity. Although most cases are now known to be asymptomatic or mild, some patients develop a severe form of the disease that results in acute respiratory distress syndrome and consequent multi-organ complications^15,16. Disease severity is correlated with several risk characteristics, including older age, male sex and smoking; clinical comorbidities such as being obese or immunocompromised¹⁷; and clinical biomarkers such as autoantibodies to type I interferons, cytokines and inflammation markers¹⁸. In the early days of the pandemic, it was already noted that these clinical factors did not fully explain the variability in COVID-19 disease severity between individuals: severe cases were observed among young individuals without apparent pre-existing conditions, sometimes clustering in families¹⁹, suggesting a role for human genetics as a risk factor.

Finding host genetic factors for infection susceptibility and disease severity is important because it leads to a better understanding of the viral infection and of the pathophysiological changes that occur owing to disease, and to the discovery of potential drug targets. It can also shed light on the causal relationships between risk factors, biomarkers and disease outcomes, and can inform prevention strategies. Well-known examples of successful human genetic studies of infectious diseases include the identification of the CCR5Δ32 mutation, which protects against HIV infection^20,21, and of the protection against Plasmodium falciparum infection (malaria) in individuals who are heterozygous carriers of a sickle cell allele of the haemoglobin-β (HBB) gene^22,23,24. We refer readers to the Review by Kwok et al.²⁵ for a broader overview of human genetic influences on infectious diseases.

Compared with the study of other common complex diseases, studying the human genetics of infectious disease poses additional challenges, including uneven exposure to the virus within a population, the differential treatment of patients with severe disease under a pandemic emergency and the implementation and uptake of vaccination programmes. Nonetheless, the existing worldwide expertise in the generation and analysis of human genetic data has allowed for rapid large-scale studies in host genetics of COVID-19. In this Review we provide an overview of current study designs enabling discovery of human genetic variation associated with COVID-19, with a focus on large-scale population-based association studies, the genetic discoveries made so far and what we have learnt in terms of biology and public health impact. Finally, we outline some of the key challenges ahead for the field in this evolving pandemic and beyond.

Study designs for COVID-19 host genetics

Many types of study have contributed to host genetic investigations for COVID-19 during the pandemic.

Clinical studies

Clinical studies collect deep and disease-relevant phenotypic information and typically focus on patients with severe COVID-19 (refs^26,27,28,29). Most are of small to medium size with up to a few thousand patients and were initiated after the emergence of SARS-CoV-2 specifically to study COVID-19. However, one of the largest clinical studies, GenOMICC/ISARIC²⁸, predated the pandemic and was already studying the genetics of critical illness due to infection. These researchers were able to rapidly harness existing clinical study and recruitment frameworks for the study of COVID-19. Clinical studies are well positioned to study disease severity once appropriate controls are also collected, and can be used to investigate how genetic risk factors affect a patient’s clinical trajectory after infection. To investigate the genetic bases of COVID-19, these studies generally invest in whole-exome sequencing (WES) and/or whole-genome sequencing (WGS) data generation and analysis.

Biobank and cohort studies

Existing biobank and cohort studies can be used to study COVID-19 given a large enough sample size and sufficient infection rate within the population. These studies typically identify COVID-19-positive cases through linkage with electronic health records or questionnaires. Individuals who are not COVID-19 positive or who tested negative can be used as controls. These studies can provide a more representative sample of patients with COVID-19 than clinical studies, although participants enrolling in biobank and cohort studies are often not fully representative of the general population. For some of the established epidemiological cohorts, participants have been extensively recontacted for the collection of longitudinal information about COVID-19 symptoms³⁰. With few exceptions (for example, the UK Biobank and DiscovEHR collaboration³¹) most of these studies use genotyping microarrays and are not well suited to study variants with population frequency below 0.1%.

Direct-to-consumer genetic companies

Direct-to-consumer genetic companies have engaged in COVID-19 research to an unprecedented extent. For example, 23andMe³² and AncestryDNA^33,34, two of the largest companies in this space, have designed surveys allowing collection of detailed self-reported information. Given the large number of customers, these companies were well powered to identify new common genetic variants associated with various COVID-19 phenotypes, including vaccination side effects³⁵ and specific COVID-19 symptoms³⁶. The disadvantage of such studies is that COVID-19-positive status is self-reported and severe cases are under-represented, although SARS-CoV-2 PCR test results and hospitalization for COVID-19 are presumed to be quite reliably self-reportable.

COVID-19 phenotypes

Most of the host genetic studies for COVID-19 have focused on identifying variation in the genome that is associated with susceptibility to infection, disease severity and disease-related symptoms.

Susceptibility to infection

Susceptibility to infection is typically defined as being COVID-19 positive given exposure to the virus. This is the most challenging phenotype to collect because viral exposure is difficult to trace. Roberts and colleagues from the AncestryDNA Science Team³⁷ have best attempted to capture susceptibility by comparing COVID-19 negative and positive individuals who had a housemate with a confirmed COVID-19 diagnosis. The COVID-19 Host Genetics Initiative (HGI)³⁸ used a simpler approach, comparing individuals who are COVID-19 positive versus population controls and named this phenotype ‘reported SARS-CoV-2 infection’. Despite the suboptimal choice of the control group, probably including controls who had not been exposed to the virus, the results overlapped with those from AncestryDNA.

Disease severity and progression

Disease severity is often captured by comparing individuals who are COVID-19 positive and have been hospitalized or admitted to an intensive care unit (ICU) with those who have less severe disease or are asymptomatic but still positive for the virus. Hospitalization, admission to an ICU and requirement for respiratory support represent ad hoc definitions of severity that are robust enough to be captured across studies with heterogeneous designs. The COVID-19 HGI³⁸ and the GenOMICC/ISARIC study²⁸, in their main analyses, used population controls instead of individuals who are COVID-19 positive with non-severe disease. This can result in misclassification because some controls might turn out to be cases if exposed to the virus. Nonetheless, this approach is more powerful than using individuals who are COVID-19 positive with non-severe disease as controls because population controls are widely available, especially within biobank studies³⁸. In support of the usefulness of population controls, the results have been shown to be robust when a more appropriate control definition is used³⁸.

Disease-related symptoms

Some genetic studies have focused on a single symptom (for example, loss of taste and smell³⁶) or on a combination of symptoms that can be used to detect undiagnosed COVID-19 cases³⁹. Such study designs were particularly valuable in the absence of widespread testing, as at the beginning of the pandemic.

Complexity in the phenotype definitions

In addition to some of the limitations described above, there are several layers of complexity when studying infectious diseases such as COVID-19 (Fig. 1). First of all, although SARS-CoV-2 has spread rapidly, not all individuals in any population have been exposed at the time of study recruitment. Furthermore, this level of exposure is clearly time dependent throughout the pandemic. There are also large differences in socio-economic and demographic factors that contribute to viral exposure, such as ethnicity, job and age. When the whole population has not yet been exposed to the virus, those identified as cases or controls are not a random sample owing to the selection biases currently present in the population in question⁴⁰. Ongoing vaccination programmes are also shifting the rates and demographics of infection, and there are large differences in epidemic management and inequalities between vaccination programmes across countries. The severity of the disease, as captured by hospitalization or ICU admission, is also dependent on the health practice in different countries, which might have also varied in different phases of the pandemic. Finally, different viral strains can affect infection susceptibility and COVID-19 disease severity. Host genetics can influence all of these stages from the socio-economic factors contributing to the chance of exposure, through infection and the development of initial symptoms, to progression to severe disease.

**Fig. 1: Schematic of the disease progression trajectory for individuals exposed to SARS-CoV-2.**

Genetic findings

Genetic association studies can identify genomic regions linked to infection susceptibility and disease, but these studies are also susceptible to various biases that may arise during sample collection, data generation and processing. Furthermore, such findings require additional analyses and functional follow-up to pinpoint the specific variants and genes that directly affect the observed phenotypes. We next discuss the current findings primarily from the largest genetic studies for SARS-CoV-2 infection and COVID-19 disease. In Table 1 we summarize the key evidence for some of the most robust and interpretable associations and report our confidence for the suspected causal gene.

Table 1 Genetic loci associated with SARS-CoV-2 infection susceptibility and COVID-19 severity, including the putative causal genes

Rare variants

There is an extensive literature on rare variants that cause inborn errors of innate immunity that can result in severe, idiosyncratic outcomes from common infectious diseases. We refer readers to the work by Casanova and Abel⁴¹ for further details on the topic. These rare variants have typically been discovered by studying small family pedigrees and individuals with extreme phenotypic manifestations. By contrast, well-powered population-based WES and WGS studies have been lacking, and more widely available genotyping microarray data are not as useful for this purpose (for further information, see the sections covering common variants), as such variants can be extremely rare and specific to individual families. Sequence data have the advantage of capturing variants that have usually occurred in relatively recent generations or de novo and may have large effects on a disease outcome. Typically, variants with large effects remain at low frequency in the population or are purged owing to selective pressure. Rare non-synonymous coding variants are of particular interest because they can readily point to the causal gene and, thus, reveal potential therapeutic targets.

Van der Made and colleagues⁴² published one of the first studies on rare variants in the context of COVID-19 severity. They searched for rare non-synonymous and possibly damaging variants in a group of genes with known associations with immunodeficiencies. Their analyses on data from two families with affected males (brother pairs) pointed to X chromosome variants in the TLR7 gene, which is involved in the pathogen recognition pathway and innate and adaptive immunity. This finding has been replicated by Fallerini et al.⁷ in 561 individuals and Asano et al.⁵ in a larger sample of 1,533 individuals. Both studies and a further follow-up study by Mantovani et al.⁴³ performed functional investigations highlighting the role of TLR7 loss of function in impaired type I interferon responses.

A larger case–control study was conducted by Zhang et al.⁴⁴ by comparing exome sequence data from 659 patients with life-threatening COVID-19, including children, with data from 534 individuals with mild or asymptomatic COVID-19. They focused on 13 candidate genes previously associated with monogenic immunological disorders or that are involved in these pathways and concluded that at least 3.5% of patients with life-threatening COVID-19 pneumonia had genetic defects in some of these genes implicated in the type I interferon pathway.

Because the aforementioned studies focus on candidate genes instead of using a hypothesis-free genome-wide approach that requires a larger sample size and a more stringent significance threshold, the results need to be carefully scrutinized and replicated. Only the TLR7 association reached exome-wide significance in unpublished work by the COVID-19 HGI WES/WGS working group, which now includes up to 23,000 cases and 500,000 controls (G. Butler-Laporte, personal communication). A smaller study of 7,491 patients who were critically ill and 48,400 controls did not identify any significant rare variant associations⁴⁵. This and two other studies^31,46 have not been able to replicate the rare variant associations with the 13 immune genes reported by Zhang et al.⁴⁴, despite substantially larger sample sizes. These differences may be partially due to different definitions of COVID-19 severity, age distribution and in silico versus experimental validation of non-synonymous variants^47,48. The power of rare variant discovery in COVID-19 will be improved by increased sample sizes of WES and WGS data sets, which in time may provide conclusive associations. To summarize, TLR7 is currently the only gene consistently replicated for association of rare non-synonymous variants with severe COVID-19, although it is expected that more findings will be confirmed as studies increase in power.

Common variant: introduction

Although rare variant studies for COVID-19 are still in their nascent phase, there is now robust, replicated evidence for multiple loci harbouring common variants associated with infection susceptibility and disease severity. These studies have mainly used microarray-based genotyping technology, which is scalable and cost-effective. Genotyping microarrays are designed to capture the more common variation across the genome using a sparse number of genetic markers in coding and non-coding regions, followed by statistical imputation of the remaining known sites of genetic variation, both common and rare. Genome-wide association studies (GWAS) using genotype data are powerful for capturing associations for variants with population frequency >0.1% that typically have mild to moderate effects on the phenotype^49,50. Owing to the relatively quick and cheap generation of genotype data, GWAS have proved an important starting point for distinguishing between the genetic variants that affect susceptibility to SARS-CoV-2 infection and those increasing the risk of developing a severe form of COVID-19 disease once infected.
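The dependence of GWAS power on allele frequency and effect size can be illustrated with a rough calculation. The sketch below is not taken from any of the studies cited here: it assumes a log-additive model, a 1-degree-of-freedom test, the conventional genome-wide significance threshold of 5 × 10⁻⁸ and a standard large-sample approximation for the non-centrality parameter. The function name `gwas_power` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def gwas_power(n_cases, n_controls, maf, odds_ratio, alpha=5e-8):
    """Approximate power of a 1-df additive test in a case-control GWAS.

    Assumption: small effects and a log-additive model, so that the
    non-centrality parameter is roughly
        2 * maf * (1 - maf) * N * phi * (1 - phi) * log(OR)^2,
    where N is the total sample size and phi the fraction of cases.
    """
    n_total = n_cases + n_controls
    case_fraction = n_cases / n_total
    beta = math.log(odds_ratio)
    ncp = 2 * maf * (1 - maf) * n_total * case_fraction * (1 - case_fraction) * beta ** 2

    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    root = math.sqrt(ncp)
    # Probability that |Z| exceeds the critical value when Z ~ N(root, 1)
    return normal.cdf(root - z_crit) + normal.cdf(-root - z_crit)


if __name__ == "__main__":
    # Hypothetical study: 10,000 cases, 100,000 controls, odds ratio 1.2
    for maf in (0.2, 0.01, 0.001):
        power = gwas_power(10_000, 100_000, maf, 1.2)
        print(f"MAF={maf}: approximate power={power:.3g}")
```

Under these assumptions, power for a modest odds ratio drops steeply as the allele frequency falls towards 0.1%, which is in line with the reliance on sequencing rather than microarrays for rare variants.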

Common variant: infection susceptibility

We have previously mentioned how current genetic studies can only imprecisely capture susceptibility to SARS-CoV-2 infection. Nonetheless, well-powered analyses clearly point to a group of loci that are associated with COVID-19 disease, but are not specific to disease severity. The COVID-19 HGI has recently formalized this observation, by developing a Bayesian framework to assign posterior probability for a variant to belong to either disease severity or susceptibility to infection⁵¹. Briefly, by contrasting effect sizes in severe COVID-19 with those seen in COVID-19 populations with severe cases removed, one can analytically distinguish those variants involved in susceptibility to infection (equal in the two groups when compared with controls) and those specifically involved in severe progressions that manifest uniquely or much more substantially in the severe group.
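The logic of contrasting effect sizes between severe and non-severe cases can be sketched with a much simpler test than the Bayesian framework of ref.⁵¹. The example below is a frequentist stand-in, not the HGI method: it compares two log odds ratios with a z-test and assumes the two estimates are independent, which is not strictly true when both analyses share controls. The function names, labels and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def compare_effects(beta_severe, se_severe, beta_nonsevere, se_nonsevere):
    """z-test for a difference between two log odds ratios.

    Assumption: the two estimates are independent (shared controls would
    induce a correlation that is ignored here).
    """
    diff = beta_severe - beta_nonsevere
    se_diff = math.sqrt(se_severe ** 2 + se_nonsevere ** 2)
    z = diff / se_diff
    p = 2 * NormalDist().cdf(-abs(z))  # two-sided P value
    return z, p


def label_variant(beta_severe, se_severe, beta_nonsevere, se_nonsevere, alpha=0.05):
    """Crude label: similar effects in both groups look like susceptibility,
    a clearly larger effect in severe cases looks like severity."""
    _, p = compare_effects(beta_severe, se_severe, beta_nonsevere, se_nonsevere)
    if p < alpha and abs(beta_severe) > abs(beta_nonsevere):
        return "severity-like"
    return "susceptibility-like"


if __name__ == "__main__":
    # Hypothetical variant with equal effects in both case groups
    print(label_variant(0.10, 0.02, 0.10, 0.02))
    # Hypothetical variant with a much larger effect in severe cases
    print(label_variant(0.50, 0.05, 0.05, 0.03))
```

A non-significant difference does not prove that the effects are equal, so the "susceptibility-like" label here also covers variants for which the comparison is simply underpowered; a posterior probability, as in the Bayesian approach, expresses that uncertainty more directly.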

The strongest signal within the susceptibility group of loci is the ABO (histo-blood group ABO system transferase) gene, which was initially identified by the Severe Covid-19 GWAS group²⁶. The ABO alleles determine an individual’s blood group by enzymatically catalysing the production of A and B antigens in human cells. There is now robust evidence that ABO is associated with susceptibility to SARS-CoV-2 infection, with both Shelton et al.³² and HGI³⁸ reporting similar effect sizes for the infection susceptibility and disease severity phenotypes. The data suggest that individuals with O blood group, who have neither A nor B antigens, are protected against the viral infection (odds ratio (OR) ≈0.90). This result is consistent with several observational studies that found that blood group A was associated with infection susceptibility⁵². The exact mechanism is, however, unclear. It has been suggested that this association can be attributed to protective effects exerted by anti-A IgG antibodies and not the blood group itself⁵³. Others have shown that the ABO variant associates with higher levels of CD209 protein, which has been shown to directly interact with the spike protein of SARS-CoV-2 (ref.⁵⁴). Nonetheless, the association between ABO and susceptibility to infection adds to an extensive list of evidence linking blood type with infectious diseases⁵⁵, including the recent observation by Shelton et al.³² that blood group O appeared to be a risk-increasing factor for influenza symptoms in the years before the COVID-19 pandemic.

A second infection susceptibility locus is ACE2, which is worth mentioning because the gene encodes a key protein involved in the viral entry pathway of SARS viruses^2,3,4. GWAS by Horowitz et al.⁵⁶ and COVID-19 HGI⁵¹ point to a protective variant (rs190509934) 60 bp upstream of the ACE2 gene. This variant, which is rare among individuals of European ancestry (0.2% in the Genome Aggregation Database (gnomAD)) but more common in South Asians (2.7%), was associated with a 39% reduction in ACE2 expression in liver tissues.

A third infection susceptibility signal lies in the 3p21.31 locus and is independent of the largest signal for severe COVID-19 disease, which is in the same region (Fig. 2). This rather surprising proximity has caused the susceptibility signal to be overlooked in some studies. Roberts et al.³⁴ were the first to highlight the presence of a susceptibility signal in 3p21.31, and the COVID-19 HGI³⁸ later showed that there are several independent signals (r² ≈ 0) associated with SARS-CoV-2 infection susceptibility, all located within the gene body of SLC6A20, which encodes an amino acid transporter protein that is known to functionally interact with the SARS-CoV-2 receptor ACE2 (ref.⁵⁷). We discuss some of the functional work that has been done to decipher this locus in more detail in Box 1.
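The statement that two signals are independent (r² ≈ 0) refers to the squared correlation between alleles at two variants. As a minimal illustration, r² can be computed from haplotype and allele frequencies using the standard definition for biallelic variants; the function name `ld_r2` and the frequencies below are hypothetical and do not describe the actual 3p21.31 variants.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles A and B at two biallelic sites.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a variant is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical independent variants: haplotype frequency equals the product
    print(ld_r2(p_ab=0.06, p_a=0.2, p_b=0.3))
    # Hypothetical variants in perfect LD: the two alleles always co-occur
    print(ld_r2(p_ab=0.2, p_a=0.2, p_b=0.2))
```

When the haplotype frequency equals the product of the allele frequencies, D is zero and so is r²; when the two alleles always travel together, r² reaches 1.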

**Fig. 2: Genetic association patterns in the chromosome 3p21.31 region from COVID-19 HGI meta-analysis.**

In addition to the three loci highlighted above, there are additional loci that can be linked to SARS-CoV-2 infection susceptibility and for which we describe the potential causal genes in Table 1.

Box 1 The 3p21.31 locus

The 3p21.31 locus has sparked great interest in the genetics community owing to the complexities of deciphering the causal genes for the two strong, yet independent, genetic signals for coronavirus disease 2019 (COVID-19) in this region. This region of the genome is characterized by a haplotype block spanning 49.4 kb with variants in high linkage disequilibrium (LD) (r² > 0.98) (Fig. 2) and a longer haplotype block of up to 333.8 kb with weaker linkage disequilibrium (r² > 0.32)⁹⁷, both of which were derived from Neanderthals⁹⁷. Hundreds of alternative haplotypes exist at this locus in modern humans, but alleles with strongest association to COVID-19 localize to the 49.4 kb Neanderthal haplotype block. This shorter haplotype exists at strikingly different frequencies in different populations: it is more common among people of European or South Asian ancestry, and reaches the highest frequency among Bangladeshi populations (63% carry at least one copy of the risk haplotype) while it is almost absent in East Asian and African populations (≤2%)⁹⁷. The authors who reported these findings have speculated that such a peculiar frequency pattern might indicate selection in the past⁹⁷. In the present, the length of the haplotype poses challenges for the identification of the causal variant and the target gene. We describe the independent signal for SARS-CoV-2 infection susceptibility in the section ‘Common variant: infection susceptibility’ in the main text, and focus in this Box on the second signal that is associated with COVID-19 disease severity. Common variants in 3p21.31 show by far the strongest association with COVID-19 disease severity across every genome-wide association study (GWAS).
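A carrier frequency such as the 63% quoted above can be converted into an approximate haplotype frequency. The sketch below assumes Hardy–Weinberg equilibrium, which is an assumption made here for illustration and not a claim from ref.⁹⁷; the function names are hypothetical.

```python
import math


def haplotype_freq_from_carrier_freq(carrier_freq):
    """Haplotype frequency p implied by the fraction of carriers.

    Assumption: Hardy-Weinberg equilibrium, so non-carriers have frequency
    (1 - p)^2 and carriers have frequency 1 - (1 - p)^2.
    """
    if not 0 <= carrier_freq <= 1:
        raise ValueError("carrier_freq must lie between 0 and 1")
    return 1 - math.sqrt(1 - carrier_freq)


def carrier_freq_from_haplotype_freq(p):
    """Inverse relation: fraction of people with at least one copy."""
    return 1 - (1 - p) ** 2


if __name__ == "__main__":
    p = haplotype_freq_from_carrier_freq(0.63)
    print(f"Implied haplotype frequency: {p:.3f}")
    print(f"Implied homozygote frequency: {p ** 2:.3f}")
```

Under this assumption, a 63% carrier frequency corresponds to a haplotype frequency of roughly 0.39, meaning that a sizeable minority of carriers would be expected to have two copies.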

The 3p21.31 locus contains many potential gene targets for severe COVID-19 risk that have plausible biology, albeit some are better characterized than others. The most recent evidence from multi-omic analyses¹²⁹ indicated LZTFL1 as the candidate for the association with severe disease. The authors showed that rs17713054, a lead variant from the early studies of respiratory failure due to COVID-19 (refs^26,28), is a gain-of-function enhancer motif variant that leads to increased expression of LZTFL1 and SLC6A20 (ref.¹²⁹). However, LZTFL1 is expressed in lung epithelial cells whereas SLC6A20 is not. In the context of COVID-19, the lung epithelium is of interest for understanding mechanisms of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, and these cells showed signs of activation of an immune response mechanism termed epithelial–mesenchymal transition (EMT)¹²⁹. The EMT response potentially acts as an acute pathway to hinder infection efficiency by downregulating known host entry receptors^2,130 in the respiratory tract and to eventually allow for repair of the affected tissue. Increased expression of LZTFL1 is known to downregulate the EMT pathway, potentially explaining the association of the enhancer variant with worse outcome and indicating the relevant cell type for the effect¹²⁹.

As mentioned in the main text, SLC6A20 also has plausible involvement in disease susceptibility owing to its functional interaction with the SARS-CoV-2 receptor, ACE2 (ref.⁵⁷). Additionally, the 3p21.31 locus harbours several important chemokine receptor genes: CCR9, CXCR6 and XCR1. In particular, CXCR6 recruits CD8-resident memory T cells in the respiratory tract to combat respiratory pathogens¹³¹. The involvement of CXCR6 (and also CCR9) has been supported by transcriptome-wide association analysis^28,132. In vitro genome-editing studies have indicated that SARS-CoV-2 infection susceptibility is not conferred by only a single causal gene at 3p21.31. Yao and colleagues¹³³ deleted part of the 3p21.31 locus using CRISPR and identified CCR9 and SLC6A20 as potential target genes. Kasela and colleagues¹³⁴ integrated genome-scale CRISPR loss-of-function screens and expression quantitative trait locus (eQTL) analyses and identified SLC6A20 and CXCR6 as putative causal genes.

The lack of pleiotropic effect — the lead variant at 3p21.31 has not been associated with any other traits from previous GWAS — further hampers the functional understanding of this locus. However, on a positive note, it also suggests that the underlying biological mechanism might be specific to COVID-19 disease, making it an interesting candidate for further drug target evaluation.

Although the biological function is still unclear, genetic epidemiological observations clearly show that carriers of the lead variant for severe COVID-19 in the 3p21.31 locus have moderately to substantially increased odds of progressing towards a severe form of COVID-19 once infected: 50% increased risk of hospitalization (odds ratio (OR) = 1.5, 95% confidence interval (CI) = 1.3–1.7; P = 9.1 × 10⁻¹⁰), 50% increased risk of intensive care unit admission once hospitalized (OR = 1.5, 95% CI = 1.3–1.8; P = 1.4 × 10⁻⁶) and 70% overall increased risk of death or severe respiratory failure (OR = 1.7, 95% CI = 1.5–2.1; P = 7.7 × 10⁻¹⁰)²⁹. Increased risk was consistently observed across several COVID-19-related complications (hepatic and kidney injury, cardiovascular complications and venous thromboembolism) as well as biomarkers tracking disease severity, but was not specific to any complication or biomarker. Thus, individuals carrying the risk allele are overall sicker but do not manifest clear clinical features that separate them from other patients with severe COVID-19.
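The relationship between an odds ratio, its confidence interval and the P value can be made explicit with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log odds scale, which may not be how ref.²⁹ derived its intervals; the function name `summarize_odds_ratio` is hypothetical. Because the published values are rounded, the recovered P value will only approximate the reported one.

```python
import math
from statistics import NormalDist


def summarize_odds_ratio(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the standard error, z score and two-sided P value.

    Assumption: the interval is a Wald interval, symmetric around
    log(odds_ratio) on the log scale.
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = 2 * normal.cdf(-abs(z))
    return se, z, p


if __name__ == "__main__":
    # Rounded hospitalization estimate quoted in the text above
    se, z, p = summarize_odds_ratio(1.5, 1.3, 1.7)
    print(f"SE(log OR)={se:.3f}, z={z:.2f}, P={p:.1e}")
```

An odds ratio approximates a risk ratio only when the outcome is rare in the group under study, so describing an OR of 1.5 as a 50% increase in risk is a simplification.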

Common variant: COVID-19 severity

GWAS of severity phenotypes (that is, hospitalization, admission to ICU or death due to COVID-19) have identified more loci than GWAS of infection susceptibility. The largest signal in the 3p21.31 locus was described by the Severe Covid-19 GWAS Group only 3 months after the declaration of the pandemic and posted as a preprint in June 2020 (ref.²⁶). Given the relevance of this locus to severe COVID-19 disease we provide more detailed insights in Box 1 and show the associations graphically in Fig. 2.

The next leap in the discovery of new severity-associated loci came from the combined effort of the GenOMICC/ISARIC²⁸ and COVID-19 HGI studies^38,58. The GenOMICC/ISARIC study²⁸ included just over 2,000 patients who were critically ill with COVID-19 from ICUs across the UK, and their strategy of enrichment for very severe cases resulted in improved power and discovery of eight new loci. The COVID-19 HGI³⁸ provided replication for the GenOMICC/ISARIC study, and independent results were released online. The main findings from these severity analyses directly indicate that both genes involved in immune response and genes involved in lung disease pathology are central to severe COVID-19 progression.

First, we highlight three instances in which genes modulating the immune response to viral infection are plausibly implicated. TYK2 has been extensively explored in the human genetic literature owing to its relevance as a potential therapeutic target for autoimmune diseases and cancer. Individuals with complete loss of TYK2 function present with immunodeficiencies^59,60, whereas individuals heterozygous or homozygous for low-frequency hypomorphic variants that cause lowered TYK2 signalling (via decreased phosphorylated STAT) have a more complex presentation. Although these individuals do not seem to be affected in health measures or mortality in a large cohort study and are in fact protected from common autoimmune diseases⁶¹, they are susceptible to tuberculosis infection owing to impaired immune signalling^62,63,64. By current understanding, TYK2 is involved in balancing the cytokine response and is therefore an interesting target for drug development. Of note, the missense variant (rs34536443:G>C or p.Pro1104Ala), previously associated with protection from certain autoimmune diseases, increases the risk for severe COVID-19.

The second locus points to IFNAR2 (IFNα and IFNβ receptor subunit 2/3), which has been replicated in multiple studies^28,38,56 and proposed as a druggable target through Mendelian randomization (MR) studies⁶⁵. However, we note the close proximity between IFNAR2, IL10RB and IFNAR1, and it is not yet fully established that IFNAR2 is the only relevant gene in this locus. Patients with severe COVID-19 show evidence of a dysregulated type I interferon response to the SARS-CoV-2 virus^66,67,68, and drugs inducing the interferon pathway in the early stages of infection have also been shown to be beneficial⁶⁹. This could imply that the timing of either stimulation or down-regulation of the interferon pathway during the course of infection could affect the outcome in patients⁶⁶.

The third locus overlaps the OAS gene cluster, which encodes proteins involved in viral clearance. Several lines of evidence point to OAS1 as the causal gene^4,70,71: genetically predicted higher levels of circulating OAS1 are protective against severe COVID-19 (ref.⁴), and the causal haplotype is associated with decreased nonsense-mediated decay of OAS1 transcripts, and thereby potentially faster initial responses to viral infections and viral clearance⁷¹. Through a detailed functional study, Wickenhagen and colleagues⁷² showed that SARS-CoV-2 was inhibited by the action of OAS1 interacting with several regions of the SARS-CoV-2 genome, with the most prominent sites mapping to the first 54 nucleotides of the 5′ untranslated region, which is present in all SARS-CoV-2 positive-sense viral RNAs. These findings are interesting in the light of COVID-19 treatment, as OAS1-activating drugs already exist. Additionally, a recent targeted fine-mapping study identified a candidate causal splice variant, leading to a more active OAS1 enzyme and downstream antiviral activity⁷³.

The other major insight gained from the human genetic findings comes from the overlap between genetic signals for COVID-19 severity and lung diseases. This overlap is consistent with the epidemiological evidence associating pre-existing lung conditions with COVID-19 severity^74,75 and respiratory failure being the major cause of death among hospitalized patients with COVID-19 (ref.⁶⁹). At least four loci associated with COVID-19 severity have been previously linked to interstitial lung disease, lung fibrosis, lung carcinomas and/or decreased lung function^28,38. Genes harboured within these published loci include dipeptidyl peptidase 9 (DPP9), Forkhead box protein P4 (FOXP4), surfactant protein D (SFTPD) and mucin 5B (MUC5B)⁵¹. The lead variant at the MUC5B locus (rs35705950-T) is associated with increased MUC5B expression in lung tissue⁷⁶, which has been associated with muco-ciliary dysfunction and increased bleomycin-induced fibrosis in mice⁷⁷. This specific variant is protective against severe COVID-19 but is the strongest known association for substantially increased risk of idiopathic pulmonary fibrosis (IPF)⁷⁶. This opposite direction of effect is intriguing given the concordant direction observed for two other genome-wide significant loci and the overall positive genetic correlation between IPF and COVID-19 (ref.⁷⁸). Nonetheless, this result is also consistent with the MUC5B promoter variant being associated with twofold improved survival among patients with IPF⁷⁹. For FOXP4, a promoter region signal is associated with increased COVID-19 severity^38,51 and is also associated with increased expression of FOXP4. This specific variant is infrequent in samples with European ancestry and much more common in East and South Asia and in admixed Hispanic–Latino samples of the Americas⁸⁰, underscoring the importance of taking a global approach for more comprehensive and equitable gene discovery. 
Importantly, this same association has been previously noted in lung cancer^81,82 and in interstitial lung diseases⁸³ — all in a concordant direction — suggesting another potential therapeutic target. For SFTPD, the missense variant identified by the HGI⁵¹ is consistent with emerging results pointing to the involvement of surfactant proteins in severe COVID-19 risk. Surfactant proteins are secreted by alveolar cells in the lung, and maintain healthy lung function and facilitate pathogen clearance⁸⁴. SFTPD is involved in the immune response pathway and the SFTPD missense variant has been linked to reduced lung function and severe COVID-19 (ref.⁸⁵).

Together with the other findings, these paint an overall picture in which variants in genes involved in upkeep of healthy lung tissue and maintenance of the immune system and its regulation upon viral exposure can affect the course of the disease in an individual.

The human leukocyte antigen (HLA) system orchestrates immune regulation, and the largest GWAS of common infections have implicated HLA in 13 of them⁸⁶. Thus, it was thought that this region would have a prominent role in explaining variability in COVID-19 severity and infection susceptibility, yet the region is far from being the strongest signal in GWAS. However, associations for HLA class II have now been detected by GenOMICC/ISARIC²⁸ and COVID-19 HGI⁵¹. Additionally, smaller targeted studies that were able to impute the HLA genotypes and thus gain better resolution of the region have also implicated HLA class I genotypes^87,88. What is still needed are definitive large-scale studies that properly account for the complexity in linkage disequilibrium (LD) and ancestry differences in the region. Therefore, the lack of HLA associations from some GWAS of COVID-19 severity might partially reflect limitations of the study designs rather than a genuine lack of biological association. The recent availability of multi-ancestry HLA imputation panels⁸⁹ and integration with imputation servers might facilitate this much-needed activity.

Overall, what has perhaps come as a surprise from GWAS of COVID-19 is how many loci point to plausible biology relative to other complex traits, especially considering the challenges in defining a reliable and consistent phenotype during an ongoing pandemic. Nonetheless, these results have mainly been used to confirm existing biological hypotheses and have not yet provided profoundly novel insights into COVID-19, thus highlighting the challenges in rapidly connecting variants to function.

Effect of age

The genetic architecture of complex disease is not fixed, and genetics tends to have a larger proportional contribution to disease burden in younger age groups⁹⁰. Given the extreme importance of age as a risk factor for severe COVID-19 (refs^26,91), age should be considered in genetic analyses. Some evidence is emerging for age-specific effects at candidate rare variant loci^7,44 and one common risk locus²⁹. Large meta-analyses with access to detailed individual-level data will be needed to better understand the relationship of age and severe disease, particularly for individuals with rare variants.

Effect of sex

Male sex is one of the most impactful epidemiological risk factors for hospitalization and severe respiratory syndrome due to COVID-19, but initially large-scale genetic studies did not report sex-specific effects for infection susceptibility or severe disease. However, some reports of sex-specific effects are starting to emerge for loci containing immune-related genes^34,92. Moreover, the rare variants in the chromosome X gene TLR7 affect males and are associated with severe COVID-19 outcomes⁹³. Overall, genetics is unlikely to explain much of the increased COVID-19 severity among men. The general lack of sex-specific factors is not totally surprising as the genetics of numerous, well-studied immune-mediated diseases that significantly differ in their prevalence between sexes have not demonstrated a significant contribution of sex-specific genetic factors to such differences.

Population genetics and ancestry differences

Epidemiological studies have shown that people from non-white ethnic backgrounds are more at risk of infection and of severe COVID-19 (refs^17,94,95), raising questions about whether human genetics can explain some of these differences. Generally, non-genetic factors are much more relevant than genetic factors in explaining health disparities. However, the scale and diversity of participants in the COVID-19 HGI provide an opportunity to determine whether any of this difference might be explained by genetic variants that are risk factors for COVID-19 having higher frequencies in certain ancestries, and/or genetic variants having similar frequencies, but different magnitude of effects, across ancestries or environments.

Heterogeneity of variant effects across populations has been compared in several studies. Shelton et al.³² showed no significant difference in effect across several genetically defined ancestry groups at the most prominent risk loci, the 3p21.31 and ABO loci. However, with increasing sample sizes and improved representation of non-European ancestry groups, the COVID-19 HGI has recently reported a significantly different effect between ancestry groups for the FOXP4 locus⁵¹. Apart from this locus, the authors suggest that the observed heterogeneity at the remaining loci is more likely to be due to differences in study inclusion criteria (for example, variable definition of COVID-19 severity owing to different thresholds for testing, hospitalization and patient recruitment). Additionally, a smaller study by Parikh et al.⁹⁶ used admixture mapping — a method of gene mapping that uses differential risk by ancestry to identify ancestry-specific effects — and identified two genomic regions associated within local ancestries, suggesting that some ancestry-specific effects might exist.
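
As an illustration only, a comparison of one variant's effect across ancestry groups can be sketched with a fixed-effect meta-analysis and Cochran's Q statistic. This is one common heterogeneity measure; the studies cited above may have used different tests. The function name and all effect sizes below are hypothetical.

```python
def cochran_q(betas, ses):
    """Return (pooled_beta, Q, I2) for per-group effect estimates.

    betas: per-ancestry-group effect sizes (e.g. log odds ratios)
    ses:   matching standard errors
    """
    weights = [1.0 / se**2 for se in ses]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q sums the weighted squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2 is the share of variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios for one variant in three ancestry groups.
betas = [0.25, 0.30, 0.60]
ses = [0.05, 0.08, 0.10]

pooled, q, i2 = cochran_q(betas, ses)
print(f"pooled beta = {pooled:.3f}, Q = {q:.2f}, I2 = {i2:.2f}")
```

Under the null hypothesis of a shared effect, Q is compared against a chi-square distribution with one fewer degree of freedom than the number of groups. As the text notes, a large Q can reflect differences in study inclusion criteria as well as genuine ancestry-specific effects.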

Whereas the magnitudes of effect at currently established loci seem to be consistent across ancestry groups, lead variants at several loci show substantial frequency differences across populations (see the example of the 3p21.31 locus in Box 1). Some of the differences can be explained by negative selection as in the case of TYK2 (ref.⁶⁴). However, for other loci such as the 3p21.31 locus and the OAS gene cluster in which variants originated from Neanderthal introgression^70,97, it is as yet unknown whether the introgression drove selection or whether (as for other loci) the allele frequency differences might simply be consistent with genetic drift. Overall, we do not observe any specific ancestry group with consistently higher or lower frequencies at established COVID-19-associated variants. However, in-depth analysis of this issue has not been conducted, and existing analysis reporting that signatures of adaptation might be linked to an ancient epidemic in East Asian populations did not use GWAS-associated loci⁹⁸. Furthermore, as we do not know the exact causal variants for COVID-19 severity and susceptibility, it is difficult to draw conclusions even from accurate comparisons of ancestry-specific effect sizes. Beyond answering some key population genetics questions, more samples from diverse ancestries are needed to build a more comprehensive map of the effects of host genetics and to improve the statistical refinement of functional underpinnings of the loci associated with COVID-19, by, for example, co-localization and fine-mapping.

Overall, current evidence does not suggest that human genetics has a major role in explaining differences in COVID-19 severity and infection susceptibility across different ancestry groups. Thus, as for most health disparities, the differences observed between ancestry groups are most likely due to differences in environmental and socio-economic factors that affect an individual’s chance of contracting COVID-19 and/or obtaining rapid and effective health-care interventions upon infection. Larger sample sizes in continental ancestry groups other than Europeans will allow further investigation of these questions.

Clinical and public health impact

Genetic instruments to identify causal risk factors

Genetics can be used to identify risk factors and biomarkers that correlate with COVID-19 and to support causal relationships with new or established risk factors^99,100,101. For example, large-scale genetic studies can identify shared genetic effects between COVID-19 and other traits. This is typically achieved using genetic correlations¹⁰⁰. The main advantage of genetic correlations compared with phenotypic correlations is that risk factors and COVID-19 phenotypes do not need to be measured on the same set of individuals. Genetic correlations for genetic liability to SARS-CoV-2 infection or more severe disease have recapitulated most of the established phenotypic (clinical) correlations with severe COVID-19 (for example, increased body mass index (BMI), smoking, diabetes, ischaemic stroke and educational attainment)^28,38. However, these results alone need to be interpreted with caution as they are subject to the same set of biases and confounders as standard epidemiological analyses, with the additional caveat that genetic studies are normally conducted on non-representative populations.

Genetic correlations can be combined with MR studies, which aim to identify causal associations between exposures and outcomes^101,102. This MR approach can reveal which risk factors might be causal for COVID-19 severity and which might be merely comorbid. For example, the HGI used MR to show that type 2 diabetes (T2D) was not a causal risk factor for severe COVID-19, but instead the association might be mediated by increased BMI. However, the most valuable application of MR studies in the context of COVID-19 is to evaluate the causal relationship with protein products that are targets of currently licensed drugs (drug repurposing) or drugs in clinical development. Specifically, if a putative drug target can be shown to have a causal effect on COVID-19 severity, then there can be more confidence that targeting that protein might be able to modify the disease course. An important consideration when homing in on potential drug targets, though, is their potential pleiotropic effects; a drug target with specific downstream effects may be more desirable than modifying the function of a target that is involved in multiple pathways or biological processes. We note here that although MR analyses can pinpoint interesting candidates for follow-up, various in silico analyses and in vitro and in vivo models have a crucial role in preclinical target identification.
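
A minimal sketch of the arithmetic behind a basic two-sample MR estimate follows, assuming per-variant summary statistics for the exposure and the outcome are available. It uses the inverse-variance weighted (IVW) combination of per-variant Wald ratios, which is one widely used estimator and is not claimed to be the method of any study cited here. The function names and all numbers are hypothetical, and the sketch ignores pleiotropy, weak instruments and linkage disequilibrium between instruments.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-variant causal estimates: outcome effect divided by exposure effect."""
    return [b_out / b_exp for b_exp, b_out in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    Each Wald ratio is weighted by beta_exposure**2 / se_outcome**2, which
    treats the variant-exposure effects as measured without error.
    """
    numerator = sum(
        b_exp * b_out / se**2
        for b_exp, b_out, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        b_exp**2 / se**2 for b_exp, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical instruments: effects on a circulating protein (exposure) and
# on the log odds of severe disease (outcome). Values are made up.
beta_exposure = [0.10, 0.20, 0.15]
beta_outcome = [-0.05, -0.10, -0.075]
se_outcome = [0.01, 0.02, 0.015]

estimate, std_err = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(f"IVW estimate: {estimate:.3f} (SE {std_err:.3f})")
```

In this toy input every variant gives the same Wald ratio, so the IVW estimate equals that ratio. With real data the per-variant ratios differ, and their spread is one signal of the pleiotropy concern raised above.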

MR studies on COVID-19 have now suggested several proteins as potential drug targets, some of which are already targeted by existing drugs. For example, Gaziano et al.⁶⁵ found the best potential for druggable COVID-19 targets to be IFNAR2 and ACE2, which are known players in immune response and SARS-CoV-2 entry, respectively. The GenOMICC/ISARIC study²⁸ also performed MR for an a priori list of candidate genes, which were targets of drugs that at the time had been proposed as potentially effective treatments for COVID-19. Their analysis for causal associations with the risk of developing severe COVID-19 prioritized IFNAR2 and TYK2, which were previously implicated by GWAS. Another GWAS-implicated gene, OAS1, has also been supported by a study from Zhou et al.⁴ who investigated the levels of hundreds of circulating proteins in individuals (non-infectious state) and identified a causal relationship between higher plasma OAS1 levels and reduced COVID-19 severity.

Perhaps the clearest example of where MR supports clinical findings is the IL-6 receptor (IL-6R). During the early pandemic, IL-6R inhibition was proposed as a potentially effective mechanism for treating severe COVID-19 (refs^103,104). Elevated levels of IL-6, which is a known immune-stimulating cytokine, have been regarded as a biomarker of severe COVID-19 in hospitalized patients who have elevated or dysregulated immune responses¹⁵. An MR analysis by Bovijn et al.¹⁰⁵ found a significant causal relationship between IL-6R genetic variants that resulted in reduced levels of the receptor and improved outcome in patients with COVID-19. Indeed, a recent meta-analysis of 27 randomized trials showed that administration of IL-6 antagonists, compared with usual care or placebo, was associated with lower 28-day all-cause mortality in patients hospitalized for COVID-19 (ref.¹⁰⁶), supporting the results of the MR analysis. Some debate exists on how similar the mechanisms of action of the naturally occurring variants and the molecular inhibitors are, as Garbers and Rose-John¹⁰⁷ have suggested that IL-6R inhibitors block both soluble and cell-bound IL-6R, thus eliminating the IL-6 signalling pathway, but functional genetic variants in the IL6R gene might instead affect the proportion of soluble to membrane-bound protein. Nevertheless, as the treatment has been shown to be beneficial, understanding the specific mechanisms of natural versus pharmacological modulation of the protein is likely to be of academic interest but will not affect the introduction of these drugs into clinical use in patients with COVID-19.

Polygenic scores

A polygenic score (PS; also known as polygenic risk score (PRS)) summarizes the measurable individual genetic risk for a chosen trait or disease based on the genotypes at several loci from GWAS. These scores are typically constructed either from variants only in loci that are significantly associated with the trait or by also including variants across loci that did not reach genome-wide statistical significance. At a population level, PS alone or in combination with other risk factors can be used to assign an estimate of risk to each individual^108,109. A few studies have now tried to calculate PSs for COVID-19, but these have so far been generally weakly powered, and most variation in the phenotype explained by PS is due to the inclusion of a few of the most significant signals, for example, the 3p21.31 locus^29,31,56.
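
In its simplest form, a PS is a weighted sum of effect-allele dosages, with GWAS effect sizes as weights. The sketch below illustrates that calculation under stated assumptions: the variant identifiers, weights and genotypes are hypothetical, and real pipelines typically add steps not shown here (for example, variant selection, LD-aware reweighting and ancestry adjustment).

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: dict mapping variant ID -> effect-allele dosage (0 to 2)
    weights: dict mapping variant ID -> per-allele effect size (e.g. log odds ratio)
    Variants missing from `dosages` are skipped, which is a simplification.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


def standardize(scores):
    """Convert raw scores to z-scores within a cohort (population SD).

    Assumes the scores are not all identical, otherwise the SD is zero.
    """
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return [(s - mean) / sd for s in scores]


# Hypothetical per-allele weights for three variants.
weights = {"variant_a": 0.5, "variant_b": 0.1, "variant_c": -0.2}

# Hypothetical genotype dosages for three individuals.
cohort = [
    {"variant_a": 2, "variant_b": 1, "variant_c": 0},
    {"variant_a": 0, "variant_b": 2, "variant_c": 1},
    {"variant_a": 1, "variant_b": 0, "variant_c": 2},
]

raw = [polygenic_score(person, weights) for person in cohort]
print([round(z, 2) for z in standardize(raw)])
```

Here a single large weight (`variant_a`) dominates the ranking of individuals, which mirrors the observation above that a few strong signals account for most of the variation explained by current COVID-19 PSs.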

A clinical application for PS of SARS-CoV-2 infection susceptibility or severity is unlikely in the short term. First, in a clinical setting, genetic information is not routinely collected at scale or available for consultation by clinicians. Second, although many risk prediction tools for COVID-19 have been developed^110,111,112, to our knowledge none has been used in clinical practice. Thus, it would be unlikely for a COVID-19 PS to be widely adopted. However, there might be some value for PS in identifying individuals who are at higher risk of developing severe COVID-19 symptoms amongst younger individuals without pre-existing risk factors. A study by Nakanishi et al.²⁹ showed that in COVID-19-positive individuals younger than 60 years, a single genetic risk factor (the 3p21.31 locus) can be as predictive of death and respiratory failure as some established comorbidities such as T2D. Nonetheless, more research is needed not only to evaluate more powerful PSs, but also to address inherent limitations such as the lack of PS transferability across ancestry groups.

Research applications of PS are nonetheless valuable. PS can be used to summarize our current knowledge on the genetic risk factors that underlie infection susceptibility and COVID-19 severity. For example, are individuals at higher genetic risk more likely to develop vaccine breakthrough infection, to experience more severe side effects or to develop post-COVID syndrome?

In conclusion, GWAS results can be used to construct PSs that are valuable for research purposes, but are unlikely to have a clinical value in the short term.

Conclusions and future perspectives

Genetic association studies have been exceptionally fast in delivering new genetic signals underlying COVID-19 severity and infection susceptibility. On a sobering note, these discoveries have had a limited impact on the management of the COVID-19 pandemic thus far, and it is our hope that the next phase of the pandemic will see more application of human genetics results and better functional insights. Here, we provide some perspective on the key opportunities ahead for the field, while taking for granted that increased sample size will fuel new discoveries.

Expanding COVID-19 phenotypes and post (long) COVID-19

As reviewed here, most genetic studies of COVID-19 to date have focused on pinpointing factors that make some individuals more susceptible to SARS-CoV-2 infection and explaining why others develop severe symptoms. However, with ever-expanding understanding of the disease and the data collected, future genetic studies may expand to investigating, at scale, particular symptoms associated with the infection or severe comorbid conditions such as multisystem inflammatory syndromes^113,114,115. Furthermore, some individuals who have contracted COVID-19 experience long-term symptoms that may result in a considerable health burden in the years to come^116,117. There is large variability in the symptoms experienced by those affected by post (long) COVID-19 (refs^116,117). Human genetics can be helpful in this context because some of the post-COVID-19 symptoms have directly or indirectly been studied by GWAS. For example, one might test the hypothesis that COVID-19 accelerates existing genetic predispositions to some of the symptoms. Together with observational epidemiological analysis, MR can be used as an additional pillar to triangulate evidence of causal relationship between COVID-19 and downstream consequences. Global networks such as the COVID-19 HGI can play a key part in such undertakings because they bring together studies with different designs, including biobank studies with longitudinal medical information pre- and post-infection and direct-to-consumer studies that can capture self-reported symptoms on a large number of individuals.

Interaction between host genomes and viral genomes

The interaction between host and viral genomes is surprisingly understudied, partially reflecting the lack of interaction between the corresponding scientific communities, but, most importantly, the lack of studies in which both types of information have been collected at scale^118,119. A recent report¹²⁰ showed that the protective effect of the sickle cell allele of host HBB against severe malaria is not detected in the presence of certain alleles in the parasite’s genome. These parasite alleles are particularly common in strains found in Africa, illustrating the importance of host–pathogen interaction analyses for understanding regional disease epidemiology and selective pressures in infectious disease. Variability in symptoms and resulting disease severity have also been observed across SARS-CoV-2 strains^121,122, but it is not clear whether the underlying host genetic factors are the same. Parikh et al.⁹⁶ have conducted an initial study combining viral and human genetic data, but they did not find significant results from the phylogenetic information constructed from the viral RNA. To overcome the lack of large samples, one might perform targeted studies focusing on genome-wide significant loci or PSs. Additionally, with recent temporal waves of disease dominated by delta and then omicron variants, the time and location of infection could potentially be used to infer a proxy for the likely variant.

Vaccination response and breakthrough infections

Rollout of vaccines brings challenges and opportunities to the study of the human genetic epidemiology of COVID-19. On one hand, the different strategies employed by countries can shape the epidemic differently in different parts of the world, inevitably changing the major demographic groups who become infected or severely affected by the disease, and can ultimately challenge the interpretation of genetic discoveries. On the other hand, widespread vaccination opens the possibility to study vaccination side effects and breakthrough infections. Bolze et al.³⁵ have reported that individuals who carry the HLA-A*03:01 allele were more likely to experience severe difficulties with daily routine after vaccination. For other more severe and rare side effects, it will be of paramount importance to leverage existing international collaboration to obtain robust and replicable results.

Data sharing

Although this pandemic has shown the importance of rapid data sharing, open methodological reporting and academic–commercial partnership science, the sharing of individual-level data is still far from being a reality. Widespread, yet safe, access to individual-level data can foster discoveries and methodological developments beyond what is currently possible with sharing of summary statistics. Yet despite repeated evidence showing that study participants endorse data sharing^123,124,125, legal and data protection challenges have hindered these efforts within and beyond the human genetics community¹²⁶. Consortia such as the COVID-19 HGI^38,51,58 have clearly demonstrated the impact of transparent science: despite the challenges of the pandemic, they set common goals early on and prioritized the sharing of resources and data, and the result was one of the largest genetic studies performed so far, with representation from almost every continent. These types of effort should be considered as a roadmap to future collaborative initiatives. Currently, with the exception of the UK Biobank and a small subset of the HGI initiative (EGAC00001002188), there is no large data set with human genetic and COVID-19 disease information that is accessible to the entire scientific community via established repositories. We hope the next phase of the pandemic will see a shift in the attitude towards sharing of individual-level data.

Outlook for COVID-19 host genetics

Continued investigations into host genetic factors that contribute to severe COVID-19 and susceptibility to SARS-CoV-2 viral infection will be essential to maximize the chances of finding new therapeutic avenues to treating the disease, whether it be through drug repurposing or the longer-term endeavour of new drug development. These findings should be integrated with multi-omics results to provide clearer biological insights. As for any other complex disease, genetic risk prediction is likely to add value to clinical risk prediction in a hospital setting for identification of patients who are more likely to develop further severe symptoms, and thus continued efforts on the identification of risk factors and the development of predictive biomarkers are warranted. Host genetics is not the sole key to cracking the code to successful and effective treatment of COVID-19, but with continuation of open science and partnerships between academic, industry, health-care providers and policy-makers, we will hopefully see large leaps towards that goal in the near future.

References

Hu, B., Guo, H., Zhou, P. & Shi, Z.-L. Characteristics of SARS-CoV-2 and COVID-19. Nat. Rev. Microbiol. 19, 141–154 (2021).
Hoffmann, M. et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181, 271–280.e8 (2020).
Grant, R. A. et al. Circuits between infected macrophages and T cells in SARS-CoV-2 pneumonia. Nature 590, 635–641 (2021).
Zhou, S. et al. A Neanderthal OAS1 isoform protects individuals of European ancestry against COVID-19 susceptibility and severity. Nat. Med. 27, 659–667 (2021).
Asano, T. et al. X-linked recessive TLR7 deficiency in ~1% of men under 60 years old with life-threatening COVID-19. Sci. Immunol. 6, eabl4348 (2021).
Kasuga, Y., Zhu, B., Jang, K.-J. & Yoo, J.-S. Innate immune sensing of coronavirus and viral evasion strategies. Exp. Mol. Med. 53, 723–736 (2021).
Fallerini, C. et al. Association of Toll-like receptor 7 variants with life-threatening COVID-19 disease in males: findings from a nested case-control study. eLife 10, e67569 (2021).
Thorne, L. G. et al. SARS-CoV-2 sensing by RIG-I and MDA5 links epithelial infection to macrophage inflammation. EMBO J. 40, e107826 (2021).
Yamada, T. et al. RIG-I triggers a signaling-abortive anti-SARS-CoV-2 defense in human lung cells. Nat. Immunol. 22, 820–828 (2021).
Schneider, W. M., Chevillotte, M. D. & Rice, C. M. Interferon-stimulated genes: a complex web of host defenses. Annu. Rev. Immunol. 32, 513–545 (2014).
Schultze, J. L. & Aschenbrenner, A. C. COVID-19 and the human innate immune system. Cell 184, 1671–1692 (2021).
Osuchowski, M. F. et al. The COVID-19 puzzle: deciphering pathophysiology and phenotypes of a new disease entity. Lancet Respir. Med. 9, 622–642 (2021).
Rendeiro, A. F. et al. The spatial landscape of lung pathology during COVID-19 progression. Nature 593, 564–569 (2021).
Merad, M. & Martin, J. C. Pathological inflammation in patients with COVID-19: a key role for monocytes and macrophages. Nat. Rev. Immunol. 20, 355–362 (2020).
Huang, C. et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395, 497–506 (2020).
Zhou, F. et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 395, 1054–1062 (2020).
Williamson, E. J. et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature 584, 430–436 (2020).
Bastard, P. et al. Autoantibodies neutralizing type I IFNs are present in ~4% of uninfected individuals over 70 years old and account for ~20% of COVID-19 deaths. Sci. Immunol. 6, eabl4340 (2021).
Yousefzadegan, S. & Rezaei, N. Case report: death due to COVID-19 in three brothers. Am. J. Trop. Med. Hyg. 102, 1203–1204 (2020).
Samson, M. et al. Resistance to HIV-1 infection in caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor gene. Nature 382, 722–725 (1996).
Liu, R. et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86, 367–377 (1996).
Allison, A. C. Protection afforded by sickle-cell trait against subtertian malarial infection. Br. Med. J. 1, 290–294 (1954).
Aidoo, M. et al. Protective effects of the sickle cell gene against malaria morbidity and mortality. Lancet 359, 1311–1312 (2002).
Williams, T. N. et al. Sickle cell trait and the risk of Plasmodium falciparum malaria and other childhood diseases. J. Infect. Dis. 192, 178–186 (2005).
Kwok, A. J., Mentzer, A. & Knight, J. C. Host genetics and infectious disease: new tools, insights and translational opportunities. Nat. Rev. Genet. 22, 137–153 (2021).
Severe Covid-19 GWAS Group. Genomewide association study of severe Covid-19 with respiratory failure. N. Engl. J. Med. 383, 1522–1534 (2020). The first GWAS of COVID-19 severity describing the ABO signal and the strongest signal for COVID-19 severity on chromosome 3.
Casanova, J.-L. & Su, H. C., COVID Human Genetic Effort. A global effort to define the human genetics of protective immunity to SARS-CoV-2 Infection. Cell 181, 1194–1199 (2020).
Pairo-Castineira, E. et al. Genetic mechanisms of critical illness in COVID-19. Nature 591, 92–98 (2021). A large GWAS of COVID-19 focusing on individuals with a critical illness.
Nakanishi, T. et al. Age-dependent impact of the major common genetic risk factor for COVID-19 on severity and mortality. Preprint at medRxiv https://doi.org/10.1101/2021.03.07.21252875 (2021).
Mc Intyre, K. et al. Lifelines COVID-19 cohort: investigating COVID-19 infection and its health and societal impacts in a Dutch population-based cohort. BMJ Open 11, e044474 (2021).
Kosmicki, J. A. et al. Pan-ancestry exome-wide association analyses of COVID-19 outcomes in 586,157 individuals. Am. J. Hum. Genet. 108, 1350–1355 (2021).
Shelton, J. F. et al. Trans-ancestry analysis reveals genetic and nongenetic associations with COVID-19 susceptibility and severity. Nat. Genet. 53, 801–808 (2021). GWAS linking the UGT2A1–UGT2A2 locus with loss of taste and smell in individuals with COVID-19.
Knight, S. C. et al. COVID-19 susceptibility and severity risks in a survey of over 500,000 individuals. Preprint at medRxiv https://doi.org/10.1101/2020.10.08.20209593 (2021).
Roberts, G. H. L. et al. AncestryDNA COVID-19 host genetic study identifies three novel loci. Preprint at medRxiv https://doi.org/10.1101/2020.10.06.20205864 (2020).
Bolze, A. et al. HLA-A*03:01 is associated with increased risk of fever, chills, and stronger side effects from Pfizer-BioNTech COVID-19 vaccination. HGG Adv. 3, 100084 (2022). Implicates HLA in the severity of side-effects experienced after vaccination.
Shelton, J. F. et al. The UGT2A1/UGT2A2 locus is associated with COVID-19-related loss of smell or taste. Nat. Genet. 54, 121–124 (2022).
Roberts, G. H. L. et al. Novel COVID-19 phenotype definitions reveal phenotypically distinct patterns of genetic association and protective effects. Preprint at bioRxiv https://doi.org/10.1101/2021.01.24.21250324 (2021).
COVID-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature 600, 472–477 (2021). The largest GWAS of COVID-19 infection susceptibility and severity. The consortium regularly releases results online (https://www.covid19hg.org/) and describes them via publication; see also reference 51, COVID-19 Host Genetics Initiative & Ganna, A. (2021).
van Blokland, I. V. et al. Using symptom-based case predictions to identify host genetic factors that contribute to COVID-19 susceptibility. Preprint at bioRxiv https://doi.org/10.1101/2020.08.21.20177246 (2020).
Griffith, G. J. et al. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat. Commun. 11, 5749 (2020). This study describes how collider bias challenges the interpretation of many COVID-19 observational studies.
Casanova, J.-L. & Abel, L. Lethal infectious diseases as inborn errors of immunity: toward a synthesis of the germ and genetic theories. Annu. Rev. Pathol. 16, 23–50 (2021).
van der Made, C. I. et al. Presence of genetic variants among young men with severe COVID-19. JAMA 324, 663–673 (2020). The first report of rare deleterious mutations in TLR7 being associated with severe COVID-19.
Mantovani, S. et al. Rare variants in Toll-like receptor 7 results in functional impairment and downregulation of cytokine-mediated signaling in COVID-19 patients. Genes. Immun. 23, 51–56 (2022).
Zhang, Q. et al. Inborn errors of type I IFN immunity in patients with life-threatening COVID-19. Science 370, eabd4570 (2020).
Kousathanas, A. et al. Whole genome sequencing reveals host factors underlying critical Covid-19. Nature https://doi.org/10.1038/s41586-022-04576-6 (2022).
Povysil, G. et al. Rare loss-of-function variants in type I IFN immunity genes are not associated with severe COVID-19. J. Clin. Invest. 131, e147834 (2021).
Zhang, Q. et al. Association of rare predicted loss-of-function variants of influenza-related type I IFN genes with critical COVID-19 pneumonia. J. Clin. Invest. 131, e152474 (2021).
Povysil, G. et al. Association of rare predicted loss-of-function variants of influenza-related type I IFN genes with critical COVID-19 pneumonia. Reply. J. Clin. Invest. 131, e152475 (2021).
Bomba, L., Walter, K. & Soranzo, N. The impact of rare and low-frequency genetic variants in common disease. Genome Biol. 18, 77 (2017).
Uffelmann, E. et al. Genome-wide association studies. Nat. Rev. Methods Prim. 1, 1–21 (2021).
COVID-19 Host Genetics Initiative, Ganna, A. Mapping the human genetic architecture of COVID-19: an update. Preprint at bioRxiv https://doi.org/10.1101/2021.11.08.21265944 (2021).
Liu, N. et al. The impact of ABO blood group on COVID-19 infection risk and mortality: a systematic review and meta-analysis. Blood Rev. 48, 100785 (2021).
Gérard, C., Maggipinto, G. & Minon, J.-M. COVID-19 and ABO blood group: another viewpoint. Br. J. Haematol. 190, e93–e94 (2020).
Anisul, M. et al. A proteome-wide genetic investigation identifies several SARS-CoV-2-exploited host targets of clinical relevance. eLife 10, e69719 (2021).
Anstee, D. J. The relationship between blood groups and disease. Blood 115, 4635–4643 (2010).
Horowitz, J. E. et al. Genome-wide analysis provides genetic evidence that ACE2 influences COVID-19 risk and yields risk scores associated with severe disease. Nat. Genet. https://doi.org/10.1038/s41588-021-01006-7 (2022).
Vuille-dit-Bille, R. N. et al. Human intestine luminal ACE2 and amino acid transporter expression increased by ACE-inhibitors. Amino Acids 47, 693–705 (2015).
COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020).
Minegishi, Y. et al. Human tyrosine kinase 2 deficiency reveals its requisite roles in multiple cytokine signals involved in innate and acquired immunity. Immunity 25, 745–755 (2006).
Kreins, A. Y. et al. Human TYK2 deficiency: mycobacterial and viral infections without hyper-IgE syndrome. J. Exp. Med. 212, 1641–1662 (2015).
Dendrou, C. A. et al. Resolving TYK2 locus genotype-to-phenotype differences in autoimmunity. Sci. Transl. Med. 8, 363ra149 (2016).
Boisson-Dupuis, S. et al. Tuberculosis and impaired IL-23-dependent IFN-γ immunity in humans homozygous for a common TYK2 missense variant. Sci. Immunol. 3, eaau8714 (2018).
Kerner, G. et al. Homozygosity for TYK2 P1104A underlies tuberculosis in about 1% of patients in a cohort of European ancestry. Proc. Natl Acad. Sci. USA 116, 10430–10434 (2019).
Kerner, G. et al. Human ancient DNA analyses reveal the high burden of tuberculosis in Europeans over the last 2,000 years. Am. J. Hum. Genet. 108, 517–524 (2021).
Gaziano, L. et al. Actionable druggable genome-wide Mendelian randomization identifies repurposing opportunities for COVID-19. Nat. Med. 27, 668–676 (2021).
Lee, J. S. & Shin, E.-C. The type I interferon response in COVID-19: implications for treatment. Nat. Rev. Immunol. 20, 585–586 (2020).
Acharya, D., Liu, G. & Gack, M. U. Dysregulation of type I interferon responses in COVID-19. Nat. Rev. Immunol. 20, 397–398 (2020).
Ziegler, C. G. K. et al. Impaired local intrinsic immunity to SARS-CoV-2 infection in severe COVID-19. Preprint at bioRxiv https://doi.org/10.1101/2021.02.20.431155 (2021).
Wang, N. et al. Retrospective multicenter cohort study shows early interferon therapy is associated with favorable clinical responses in COVID-19 Patients. Cell Host Microbe 28, 455–464.e2 (2020).
Zeberg, H. & Pääbo, S. A genomic region associated with protection against severe COVID-19 is inherited from Neandertals. Proc. Natl Acad. Sci. USA 118, e2026309118 (2021).
Banday, A. R. et al. Genetic regulation of OAS1 nonsense-mediated decay underlies association with risk of severe COVID-19. Preprint at medRxiv https://doi.org/10.1101/2021.07.09.21260221 (2021).
Wickenhagen, A. et al. A prenylated dsRNA sensor protects against severe COVID-19. Science 374, eabj3624 (2021). This study links a prenylated OAS1 haplotype, which is common among humans and also present in horseshoe bats, with COVID severity.
Huffman, J. E. et al. Multi-ancestry fine mapping implicates OAS1 splicing in risk of severe COVID-19. Nat. Genet. 54, 125–127 (2022). This paper identifies the causal variant for the OAS1 locus associated with COVID-19 severity.
Grasselli, G. et al. Risk factors associated with mortality among patients with COVID-19 in Intensive Care Units in Lombardy, Italy. JAMA Intern. Med. 180, 1345–1355 (2020).
Guan, W.-J. et al. Comorbidity and its impact on 1590 patients with COVID-19 in China: a nationwide analysis. Eur. Respir. J. 55, 2000547 (2020).
Seibold, M. A. et al. A common MUC5B promoter polymorphism and pulmonary fibrosis. N. Engl. J. Med. 364, 1503–1512 (2011).
Hancock, L. A. et al. Muc5b overexpression causes mucociliary dysfunction and enhances lung fibrosis in mice. Nat. Commun. 9, 5363 (2018).
Fadista, J. et al. Shared genetic etiology between idiopathic pulmonary fibrosis and COVID-19 severity. EBioMedicine 65, 103277 (2021).
Peljto, A. L. et al. Association between the MUC5B promoter polymorphism and survival in patients with idiopathic pulmonary fibrosis. JAMA 309, 2232–2239 (2013).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Wang, Z. et al. Meta-analysis of genome-wide association studies identifies multiple lung cancer susceptibility loci in never-smoking Asian women. Hum. Mol. Genet. 25, 620–629 (2016).
Dai, J. et al. Identification of risk loci and a polygenic risk score for lung cancer: a large-scale prospective cohort study in Chinese populations. Lancet Respir. Med. 7, 881–891 (2019).
Manichaikul, A. et al. Genome-wide association study of subclinical interstitial lung disease in MESA. Respir. Res. 18, 97 (2017).
Wright, J. R. Immunoregulatory functions of surfactant proteins. Nat. Rev. Immunol. 5, 58–68 (2005).
Hsieh, M.-H. et al. Human surfactant protein D binds spike protein and acts as an entry inhibitor of SARS-CoV-2 pseudotyped viral particles. Front. Immunol. 12, 641360 (2021).
Tian, C. et al. Genome-wide association and HLA region fine-mapping studies identify susceptibility loci for multiple common infections. Nat. Commun. 8, 599 (2017).
Shkurnikov, M. et al. Association of HLA class I genotypes with severity of coronavirus disease-19. Front. Immunol. 12, 641900 (2021).
Douillard, V. et al. Current HLA investigations on SARS-CoV-2 and perspectives. Front. Genet. 12, 774922 (2021).
Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ethnic fine-mapping in HIV host response. Nat. Genet. 53, 1504–1516 (2021).
Feng, Y.-C. A. et al. Findings and insights from the genetic investigation of age of first reported occurrence for complex disorders in the UK Biobank and FinnGen. Preprint at bioRxiv https://doi.org/10.1101/2020.11.20.20234302 (2020).
Docherty, A. B. et al. Features of 20 133 UK patients in hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: prospective observational cohort study. BMJ 369, m1985 (2020).
Carracedo, A., Spanish COalition to Unlock Research on host GEnetics on COVID-19 (SCOURGE). A genome-wide association study of COVID-19 related hospitalization in Spain reveals genetic disparities among sexes. Preprint at medRxiv https://doi.org/10.1101/2021.11.24.21266741 (2021).
Zhang, Q., Bastard, P., Cobat, A. & Casanova, J.-L., COVID Human Genetic Effort. Human genetic and immunological determinants of critical COVID-19 pneumonia. Nature 603, 587–598 (2022).
Price-Haywood, E. G., Burton, J., Fort, D. & Seoane, L. Hospitalization and mortality among black patients and white patients with Covid-19. N. Engl. J. Med. 382, 2534–2543 (2020).
Millett, G. A. et al. Assessing differential impacts of COVID-19 on black communities. Ann. Epidemiol. 47, 37–44 (2020).
Parikh, V. N. et al. Deconvoluting complex correlates of COVID19 severity with local ancestry inference and viral phylodynamics: Results of a multiomic pandemic tracking strategy. Preprint at bioRxiv https://doi.org/10.1101/2021.08.04.21261547 (2021).
Zeberg, H. & Pääbo, S. The major genetic risk factor for severe COVID-19 is inherited from Neanderthals. Nature 587, 610–612 (2020). Description of the haplotype structure of the strongest common signal for COVID-19 risk and how this is linked with Neanderthal introgression.
Souilmi, Y. et al. An ancient viral epidemic involving host coronavirus interacting genes more than 20,000 years ago in East Asia. Curr. Biol. 31, 3504–3514.e9 (2021).
Burgess, S., Foley, C. N. & Zuber, V. Inferring causal relationships between risk factors and outcomes from genome-wide association study data. Annu. Rev. Genomics Hum. Genet. 19, 303–327 (2018).
van Rheenen, W., Peyrot, W. J., Schork, A. J., Lee, S. H. & Wray, N. R. Genetic correlations of polygenic disease traits: from theory to practice. Nat. Rev. Genet. 20, 567–581 (2019).
Pingault, J.-B. et al. Using genetic data to strengthen causal inference in observational research. Nat. Rev. Genet. 19, 566–580 (2018).
Smith, G. D. & Ebrahim, S. Mendelian Randomization: Genetic Variants as Instruments for Strengthening Causal Inference in Observational Studies (National Academies Press, 2008).
Crisafulli, S., Isgrò, V., La Corte, L., Atzeni, F. & Trifirò, G. Potential role of anti-interleukin (IL)-6 drugs in the treatment of COVID-19: rationale, clinical evidence and risks. BioDrugs 34, 415–422 (2020).
Jones, S. A. & Hunter, C. A. Is IL-6 a key cytokine target for therapy in COVID-19? Nat. Rev. Immunol. 21, 337–339 (2021).
Bovijn, J., Lindgren, C. M. & Holmes, M. V. Genetic variants mimicking therapeutic inhibition of IL-6 receptor signaling and risk of COVID-19. Lancet Rheumatol. 2, e658–e659 (2020). This study exemplifies how MR can be used to inform drug repurposing in the context of COVID-19.
WHO Rapid Evidence Appraisal for COVID-19 Therapies (REACT) Working Group. et al. Association between administration of IL-6 antagonists and mortality among patients hospitalized for COVID-19: a meta-analysis. JAMA 326, 499–518 (2021).
Garbers, C. & Rose-John, S. Genetic IL-6R variants and therapeutic inhibition of IL-6 receptor signalling in COVID-19. Lancet Rheumatol. 3, e96–e97 (2021).
Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).
Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018).
Wynants, L. et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ 369, m1328 (2020).
Dite, G. S., Murphy, N. M. & Allman, R. Development and validation of a clinical and genetic model for predicting risk of severe COVID-19. Epidemiol. Infect. 149, e162 (2021).
Dite, G. S., Murphy, N. M. & Allman, R. An integrated clinical and genetic model for predicting risk of severe COVID-19: A population-based case–control study. PLoS ONE 16, e0247205 (2021).
Galeotti, C. & Bayry, J. Autoimmune and inflammatory diseases following COVID-19. Nat. Rev. Rheumatol. 16, 413–414 (2020).
Chou, J. et al. Mechanisms underlying genetic susceptibility to multisystem inflammatory syndrome in children (MIS-C). J. Allergy Clin. Immunol. 148, 732–738.e1 (2021).
Verdoni, L. et al. An outbreak of severe Kawasaki-like disease at the Italian epicentre of the SARS-CoV-2 epidemic: an observational cohort study. Lancet 395, 1771–1778 (2020).
Menges, D. et al. Burden of post-COVID-19 syndrome and implications for healthcare service planning: A population-based cohort study. PLoS ONE 16, e0254523 (2021).
Ayoubkhani, D., Pawelek, P. & Bosworth, M. Prevalence of ongoing symptoms following coronavirus (COVID-19) infection in the UK. Office for National Statistics https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/prevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk/5august2021 (2021).
Callaway, E. The coronavirus is mutating — does it matter? Nature 585, 174–177 (2020).
Jones, J. E., Le Sage, V. & Lakdawala, S. S. Viral and host heterogeneity and their effects on the viral life cycle. Nat. Rev. Microbiol. 19, 272–282 (2021).
Band, G. et al. Malaria protection due to sickle haemoglobin depends on parasite genotype. Nature 602, 106–111 (2022).
Twohig, K. A. et al. Hospital admission and emergency care attendance risk for SARS-CoV-2 delta (B.1.617.2) compared with alpha (B.1.1.7) variants of concern: a cohort study. Lancet Infect. Dis. 22, 35–42 (2022).
Bager, P. et al. Risk of hospitalisation associated with infection with SARS-CoV-2 lineage B.1.1.7 in Denmark: an observational cohort study. Lancet Infect. Dis. 21, 1507–1511 (2021).
Mello, M. M., Lieou, V. & Goodman, S. N. Clinical trial participants’ views of the risks and benefits of data sharing. N. Engl. J. Med. 378, 2202–2211 (2018).
Richter, G. et al. Patient views on research use of clinical data without consent: Legal, but also acceptable? Eur. J. Hum. Genet. 27, 841–847 (2019).
Haga, S. B. & O’Daniel, J. Public perspectives regarding data-sharing practices in genomics research. Public Health Genomics 14, 319–324 (2011).
Bentzen, H. B. et al. Remove obstacles to sharing health data with researchers outside of the European Union. Nat. Med. 27, 1329–1333 (2021).
Pietzner, M. et al. ELF5 is a respiratory epithelial cell-specific risk gene for severe COVID-19. Preprint at bioRxiv https://doi.org/10.1101/2022.01.17.22269283 (2022).
Boughton, A. P. et al. LocusZoom.js: interactive and embeddable visualization of genetic association study results. Bioinformatics 37, 3017–3018 (2021).
Downes, D. J. et al. Identification of LZTFL1 as a candidate effector gene at a COVID-19 risk locus. Nat. Genet. 53, 1606–1615 (2021). In silico functional analysis identifies LZTFL1 as the candidate gene beyond the strongest common locus for COVID-19 severity.
Stewart, C. A. et al. Lung cancer models reveal severe acute respiratory syndrome coronavirus 2–induced epithelial-to-mesenchymal transition contributes to coronavirus disease 2019 pathophysiology. J. Thorac. Oncol. 16, 1821–1839 (2021).
Wein, A. N. et al. CXCR6 regulates localization of tissue-resident memory CD8 T cells to the airways. J. Exp. Med. 216, 2748–2762 (2019).
Dai, Y. et al. Association of CXCR6 with COVID-19 severity: delineating the host genetic factors in transcriptomic regulation. Hum. Genet. 140, 1313–1328 (2021).
Yao, Y. et al. Genome and epigenome editing identify CCR9 and SLC6A20 as target genes at the 3p21.31 locus associated with severe COVID-19. Signal. Transduct. Target. Ther. 6, 85 (2021).
Kasela, S. et al. Integrative approach identifies SLC6A20 and CXCR6 as putative causal genes for the COVID-19 GWAS signal in the 3p21.31 locus. Preprint at medRxiv https://doi.org/10.1101/2021.04.09.21255184 (2021).

Acknowledgements

The authors thank M. Kanai for providing Fig. 2 and the accompanying figure legend. The authors also thank G. Butler-Laporte, A. Renieri and the COVID-19 Host Genetics Initiative for personal communication regarding TLR7 variants in sequencing studies, and G. Butler-Laporte, B. Richards and the Biobanque Quebec COVID19, C. Fallerini, A. Renieri and the GEN-COVID study, and J. Jung, M. S. Althagafi, S. Mangul and the Saudi Human Genome Program for discussions regarding effect of age in individuals with rare variants. A.G. was supported by the Academy of Finland (grant nos 323116, 340539, 340541) and by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant no. 945733). A.G. has also received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 101016775.

Author information

Authors and Affiliations

Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
Mari E. K. Niemi, Mark J. Daly & Andrea Ganna
Broad Institute, Cambridge, MA, USA
Mark J. Daly & Andrea Ganna
Analytical and Translational Genetics Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Mark J. Daly & Andrea Ganna

Authors

Mari E. K. Niemi
Mark J. Daly
Andrea Ganna

Contributions

M.E.K.N. and A.G. researched data for and wrote the article. All authors substantially contributed to the discussion of content and reviewing/editing the manuscript before submission.

Corresponding author

Correspondence to Andrea Ganna.

Ethics declarations

Competing interests

M.E.K.N. is an employee of Novartis. M.J.D. and A.G. declare no competing interests.

Peer review

Peer review information

Nature Reviews Genetics thanks Yukinori Okada and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Expression quantitative trait locus (eQTL): A variant that affects the expression of a gene proximally (cis-regulation) or distally (trans-regulation).
Hypomorphic: Describes an allele that confers reduced signalling or expression of the gene product.
Mendelian randomization (MR): A statistical method that can use genetic variants as instruments to study the causal relationship between risk factors and a disease outcome.
Haplotype: The combination of alleles on a chromosome.
Lead variant: Typically the variant in a locus with the strongest statistical association with the trait in a genome-wide association study (GWAS). The lead variant is not necessarily the causal variant: the causal variant reflects the underlying biological cause, whereas in a test for association, factors such as allele frequency and linkage disequilibrium (LD) patterns can lead to another variant having a stronger association test statistic. Identifying the causal variants requires downstream fine-mapping of GWAS results and functional follow-up.
Linkage disequilibrium (LD): When two alleles from different loci segregate together non-randomly in a population. Physical proximity, recombination rate and evolutionary time shape the patterns of LD across a chromosome.
Pleiotropic: Pertains to pleiotropy, which is when a genetic variant influences more than one phenotype. The effect (trait risk increasing or decreasing) across the phenotypes can be in the same direction or opposing directions.
Summary statistics: Results summarizing the findings from the study population without disclosing individual-level data. Typically for a genome-wide association study (GWAS), these results will include all the variants tested, the estimated effect size and direction of effect, and the outcome of a test for statistical significance (for example, P value). Summary statistics can be meta-analysed with other studies, or used for other types of analysis, such as causal inference testing or calculation of polygenic scores (PSs) in a new study population.
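As an illustration of how summary statistics from several studies can be meta-analysed, the following is a minimal sketch of a fixed-effect, inverse-variance-weighted meta-analysis for a single variant. It is not the pipeline of any study cited here: the function name `ivw_meta` and the example numbers are hypothetical, and the inputs are assumed to be per-study effect estimates (for example, log odds ratios) and standard errors aligned to the same effect allele.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas: per-study effect estimates (assumed aligned to the same effect allele)
    ses:   per-study standard errors of those estimates
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical summary statistics for one variant from two studies.
beta, se, z, p = ivw_meta([0.2, 0.4], [0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

A fixed-effect model assumes that all studies estimate the same underlying effect; where heterogeneity between cohorts or ancestries is expected, a random-effects model or per-ancestry analysis may be more appropriate.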

About this article

Cite this article

Niemi, M.E.K., Daly, M.J. & Ganna, A. The human genetic epidemiology of COVID-19. Nat Rev Genet 23, 533–546 (2022). https://doi.org/10.1038/s41576-022-00478-5

Accepted: 18 March 2022
Published: 02 May 2022
Issue Date: September 2022
DOI: https://doi.org/10.1038/s41576-022-00478-5
