Introduction

Many psychiatric disorders are characterized by strong sex differences, including differences in prevalence, age of onset, symptom severity, and response to medication. For example, males are 3–4 times more likely to develop autism spectrum disorder (ASD) [1, 2], and typically have an earlier age of onset and a worse course of treatment for schizophrenia (SCZ) [3]. Females are 2–3 times more likely to develop major depressive disorder (MDD) [4] and exhibit greater symptom severity, greater functional impairment, more atypical depressive symptoms and higher rates of comorbid anxiety [5]. Understanding the basis of sex differences in these disorders can provide important insights into their etiology and offer an opportunity to deliver sex-specific treatments and care.

At least four models have been proposed to explain the sex bias of psychiatric diseases [6,7,8]: specific susceptibility genes that reside on the X or Y chromosome [9], differential genetic liability thresholds between the sexes [7], major influences of hormonal levels in the sexes [10], and gene–sex interactions [11]. A recent study that systematically evaluated the four models proposed that genetic–environmental interaction contributes strongly to sex bias in psychiatric disorders [12]. However, the molecular mechanisms that link genetic–environmental factors to sex-biased phenotypes are unknown.

Epigenetics is the product of genetic and environmental influences [13]; epigenetic modifications of DNA are therefore attractive candidates for explaining sex differences. DNA methylation, the best-studied type of DNA modification, has been reported to play important roles in sexually differential characteristics of the human brain [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]. For example, McCarthy et al. [26] conducted a meta-analysis on multiple tissues including brain and found sex-specific methylated genes related to immune response, RNA splicing, and DNA repair. Xu et al. [21] reported sex-specific methylated genes that participate in ribosome structure and function, RNA binding, and protein translation in adult postmortem prefrontal cortex. Spiers et al. [21] analyzed sex-differential methylation in fetal brain and found a highly significant correlation with results from Xu et al., indicating that most sex differences in the brain methylome occur early in fetal development and are stable throughout life. However, these studies focused only on DNA methylation and did not study the regulatory networks associated with this epigenetic modification. It is unknown whether or how these regulatory components, which comprise upstream genetic regulators [30] and a cascade of downstream gene expression and associated protein networks, could influence the sex bias of psychiatric disorders.

The purpose of this study is to describe the landscape of sex-differential DNA methylation, explore its regulatory networks, and evaluate their potential involvement in psychiatric disorders. Our hypotheses are: (1) sex differences exist in both DNA methylation and its regulatory networks; (2) psychiatric disorder-related genes have different methylation levels or different methylation regulation between males and females. We compiled data from 1408 postmortem brain samples from three collections and investigated sex-associated individual CpG loci (differentially methylated positions, DMPs) and genomic regions (differentially methylated regions, DMRs). We then investigated the genetic, transcriptomic and proteomic regulatory networks related to these DMPs and DMRs. Further, we explored their contribution to the sex bias of psychiatric disorders. We found 2080 genes with sex-differential methylation that have been previously associated with psychiatric disorders. These genes are enriched in synapse-related and signaling pathways.

Materials and methods

To systematically explore sex-differential DNA methylation profiles and related regulatory networks in human brain, we obtained data from 1408 human postmortem brain samples from three collections: the Religious Orders Study and the Rush Memory and Aging Project (ROSMAP) [30], Jaffe et al. [19], and Horvath et al. [31] (Fig. 1). The ROSMAP dataset was generated from dorsolateral prefrontal cortex (DLPFC) of 698 nonpsychiatric controls (227 males, 471 females). For the collection of Jaffe et al., we used the DLPFC data from 450 controls without any known history of psychiatric disorders (158 females, 292 males) across the lifespan. For the collection of Horvath et al., 260 control samples (130 females, 130 males) from multiple brain regions were collected, including caudate nucleus (n = 12), cingulate gyrus (n = 12), cerebellum (n = 32), frontal cortex (n = 41), hippocampus (n = 25), midbrain (n = 18), motor cortex (n = 12), occipital cortex (n = 33), parietal lobe (n = 23), sensory cortex (n = 12), temporal cortex (n = 29), and visual cortex (n = 11) [31].

Fig. 1

Overview of the study design

DNA methylation data

DNA methylation was characterized using Illumina HumanMethylation450 BeadChips, which interrogate more than 485,000 methylation sites, in all three collections. Raw data (idat format) were provided by both ROSMAP and Jaffe et al. (GSE74193), while the β value matrix was provided by Horvath et al. (GSE64509). We used the ROSMAP data as the discovery dataset for sex-differential DNA methylation profiling since it had the largest sample size. GSE74193 and GSE64509 were used as the replication datasets. We used data from all the brain regions in GSE64509 to replicate the results from the discovery dataset.

Gene-expression data

Gene-expression data were obtained from ROSMAP samples using RNA-sequencing of DLPFC from 540 individuals (a subset of the DNA methylation samples). Gene-expression data were normalized as fragments per kilobase of transcript per million (FPKM) values. Detailed descriptions of data acquisition, RNA-seq protocols, and the processing pipeline have been published previously [32].

MeQTL data

The meQTL data were obtained from Jaffe et al. and ROSMAP. In the ROSMAP study, Ng et al. performed meQTL mapping between single-nucleotide polymorphisms (SNPs) and methylation in 5 kb windows among 463 individuals. In total, 9,939,236 SNP-methylation pairs were tested, covering 2,358,873 SNPs and 412,152 CpG sites, resulting in 693,696 significant meQTL pairs (383,920 SNPs with 56,973 CpG sites) using a Bonferroni-corrected p value threshold (adj. p < 0.05, two-tailed) (the detailed procedure of meQTL mapping has been described previously [32]). Jaffe et al. performed meQTL mapping in 20 kb windows among 258 individuals. In total, 7,426,085 SNPs and 477,636 CpG sites were analyzed, resulting in 4,107,214 significant meQTLs at a false-discovery rate (FDR) < 1%. In this study, we combined these two meQTL datasets and used only the reproducible meQTL pairs that were statistically significant in both datasets.

Protein–protein interaction data

The protein–protein interaction (PPI) data used to build the downstream regulatory network were derived from the Pathway Commons resource [33] following the procedure described by West et al. [34]. The PPI network consists of 8434 genes (annotated to NCBI Entrez identifiers) and 303,600 interactions.

Quality control and preprocessing

We used the R package ChAMP (version 1.2.1) [35] to process the raw idat format methylation data. The function champ.load was used to remove probes meeting any of the following criteria: (1) probes with a detection p value above 0.01 in one or more samples; (2) probes with a bead count <3 in at least 5% of samples; (3) probes with SNPs as identified in Zhou et al. [36]; and (4) probes that align to multiple locations as identified in Zhou et al. [36]. β values of 0 were replaced with 1.00e−6, and missing β values were imputed using a k-nearest neighbor algorithm with the impute.knn function in the impute package [37]. Samples with more than 1% of probes filtered were removed. We next used beta mixture quantile dilation (BMIQ) in the function champ.norm to adjust the β values of type II probes into a statistical distribution characteristic of type I probes, which has previously been shown to best minimize the variability between replicates [38]. After BMIQ normalization, we further restricted the analysis to the high-quality probes defined by Naeem et al. [39]. Probes were removed when they overlapped: (1) variants in the 1000 Genomes database, (2) small insertions and deletions, (3) repetitive DNA, or (4) regions with reduced genomic complexity that may affect probe hybridization.
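The first two probe-level criteria and the zero-β replacement can be sketched in a few lines of Python. This fragment is only an illustration of the stated thresholds; it is not the ChAMP implementation, and the function names, the dictionary layout, and the probe IDs are assumptions made for the example.

```python
# Illustrative sketch only: this is NOT the ChAMP code path.
# Function names, data layout, and probe IDs are hypothetical.

def filter_probes(det_p, bead_count, p_cutoff=0.01, min_beads=3, max_low_frac=0.05):
    """Return probe IDs passing the detection-p and bead-count criteria.

    det_p[probe]      -> list of detection p values, one per sample
    bead_count[probe] -> list of bead counts, one per sample
    """
    kept = []
    for probe, pvals in det_p.items():
        # (1) drop probes with a detection p above the cutoff in any sample
        if any(p > p_cutoff for p in pvals):
            continue
        # (2) drop probes with bead count < min_beads in at least 5% of samples
        beads = bead_count[probe]
        n_low = sum(1 for b in beads if b < min_beads)
        if n_low / len(beads) >= max_low_frac:
            continue
        kept.append(probe)
    return kept


def floor_zero_betas(betas, floor=1e-6):
    """Replace beta values of exactly 0 with a small positive constant."""
    return [floor if b == 0 else b for b in betas]


# Toy input with made-up probes and 20 samples each.
det_p = {
    "cg_hyp_1": [0.001] * 20,
    "cg_hyp_2": [0.001] * 19 + [0.02],   # fails criterion (1)
    "cg_hyp_3": [0.001] * 20,
}
bead_count = {
    "cg_hyp_1": [5] * 20,
    "cg_hyp_2": [5] * 20,
    "cg_hyp_3": [5] * 18 + [2, 2],       # 10% of samples below 3 beads
}
passing = filter_probes(det_p, bead_count)
```

In this toy input only the first probe survives both filters. The SNP and multi-mapping criteria (3) and (4) depend on external annotation and are not sketched here.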

Because variable cell-type composition affects DNA methylation, we estimated the cell-type composition of the brain tissue using a reference-based method, RefbaseEWAS [40]. We downloaded DNA methylation reference data from 28 control brains that had been processed by fluorescence-activated cell sorting to extract different cell types [41]. We calculated cell-type proportions and used the values as covariates in further analysis. We applied singular value decomposition (SVD) [42] to identify unknown covariates. The ComBat function was used to correct batch effects and position effects [43,44,45]. Other confounders such as age and postmortem interval were controlled using a linear regression model. Confounder removal was confirmed by surrogate variable analysis [44].

Quality control for gene-expression data involved selecting genes with FPKM > 0.1 in at least ten samples, which removed lowly expressed genes. Potential confounders such as batch effects, age, and cell-type composition were removed by SVD. We used the log2-transformed FPKM values for further association analysis.

Sex-differentially methylated positions and regions

After removal of all confounders, statistical analysis was implemented to identify sex-differentially methylated positions (DMPs) and regions (DMRs). Since M values (log2 ratio of the intensities of methylated probe versus unmethylated probe) are more statistically valid for the differential analysis of methylation levels [46], we calculated the M value from the β value and used the M value to calculate the differential methylation signal between males and females using limma [47]. After correction for multiple testing, we defined the features with FDR < 0.05 as DMPs (Fig. 1a). To detect sex-differential DNA methylation regions, we used the DMR-finding algorithm DMRcate [48], which clustered groups of significant probes (FDR < 0.05) within 1 kb into a DMR, and excluded DMRs containing fewer than three CpG sites.
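For readers unfamiliar with the two scales, the conversion from β to M values can be written as M = log2(β / (1 − β)). The short Python sketch below illustrates this relationship; the function names and the clamping constant eps are assumptions for illustration (the clamp simply keeps the logarithm finite at β = 0 or 1) and are not taken from the analysis pipeline.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a beta value in [0, 1] to an M value, M = log2(beta / (1 - beta)).

    eps is a hypothetical clamp that avoids log2(0) and division by zero.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


m_half = beta_to_m(0.5)      # an evenly methylated site maps to M = 0
m_high = beta_to_m(0.8)      # beta / (1 - beta) = 4, so M is about 2
```

Because the M scale is unbounded and roughly symmetric around 0, it is better suited to linear modelling than the bounded β scale, which is the motivation given in the text.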

Sex-differential regulatory network

To comprehensively understand the DNA methylation regulatory network, we integrated upstream genetic regulators, downstream gene expression, and protein–protein interaction (PPI) networks with DMPs and DMRs. The meQTL data from both ROSMAP and GSE74193, gene expression from ROSMAP, and PPI data from Pathway Commons were used in this regulatory network (Fig. 1b). We searched for potential upstream regulators of DMPs using reproducible meQTLs from ROSMAP and GSE74193, expressed as SNP–DMP pairs. Then we tested the association of DMPs with gene expression by calculating the Spearman correlation between the methylation level of each DMP (β value) and the expression levels of nearby genes (within 10 kb). This calculation was based on methylation and expression data from 468 brain samples (ROSMAP methylation and expression profiling). FDR was used for multiple testing correction. The associated DMP–gene pairs were defined using an absolute correlation coefficient >0.3 and FDR < 0.05. Then, using the DMPs as the index, we connected the SNP–DMP and DMP–gene pairs into SNP–DMP–gene groups.
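The linking step can be pictured as a set intersection followed by a join on the CpG identifier. The Python sketch below is a minimal illustration under that reading; the function names are hypothetical, and all identifiers prefixed with HYP are made up for the example. Only the rs10143703, cg04842215, CBLN3 triple is taken from the results reported in this paper.

```python
def reproducible_pairs(meqtl_a, meqtl_b):
    """Keep only SNP-CpG pairs that are significant in both meQTL datasets."""
    return meqtl_a & meqtl_b


def build_groups(snp_cpg_pairs, dmp_gene_pairs, dmps):
    """Join SNP-DMP pairs with DMP-gene pairs, using the DMP as the index."""
    genes_by_dmp = {}
    for cpg, gene in dmp_gene_pairs:
        genes_by_dmp.setdefault(cpg, set()).add(gene)
    groups = set()
    for snp, cpg in snp_cpg_pairs:
        if cpg not in dmps:          # meQTL target is not sex-differential
            continue
        for gene in genes_by_dmp.get(cpg, ()):
            groups.add((snp, cpg, gene))
    return sorted(groups)


# Toy input: one real triple from the text plus hypothetical entries.
meqtl_rosmap = {("rs10143703", "cg04842215"), ("rsHYP1", "cgHYP1")}
meqtl_jaffe = {("rs10143703", "cg04842215"), ("rsHYP2", "cgHYP1")}
dmp_gene = {("cg04842215", "CBLN3"), ("cgHYP1", "GENE_HYP")}
dmps = {"cg04842215", "cgHYP1"}

shared = reproducible_pairs(meqtl_rosmap, meqtl_jaffe)
groups = build_groups(shared, dmp_gene, dmps)
```

Here the hypothetical CpG drops out because neither of its SNP pairs is shared between the two meQTL sets, leaving a single SNP–DMP–gene group.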

Protein–protein interaction subnetwork related to sex-differential DNA methylation

We used a functional supervised algorithm, functional epigenetic modules (FEM) [49], to identify subnetworks containing genes exhibiting sex-related differential DNA methylation in promoter regions. Using probe-level analysis with the champ.EpiMod function, the most differentially methylated probe was assigned to each gene, and the PPI subnetworks inferred to be differentially methylated modules were extracted.

Overrepresentation of psychiatric disorder-related signals in sex-differential loci

To explore whether genes associated with psychiatric disorders show sex-differential methylation or regulation, we tested for enrichment between sex-differentially methylated genes and SCZ-, ASD-, and MDD-associated genes/loci. We completed a series of comparisons at the SNP, CpG, gene, and protein levels (Fig. 1c).

For SNP-level comparisons, we investigated whether genome-wide significant SNPs associated with SCZ, ASD, and MDD were enriched in SNPs that regulate DMPs (SNP–DMP pairs from meQTLs). For CpG-level comparisons, we tested whether CpG sites associated with disease in epigenome-wide association studies (EWAS) were enriched in DMPs. For gene-level comparisons, we determined whether genes associated with SCZ, ASD, and MDD behave in a sex-differential manner. The sex-differential genes comprised DMR genes, genes whose expression is associated with DMPs, sex-differentially expressed genes, and genes in the sex-related PPI network. The disorder-related genes came from genetic association, differential expression and co-expression studies. Due to data availability limitations, we studied SCZ, ASD, and MDD separately in SNP and gene analysis, SCZ and ASD in network analysis, and SCZ only in methylation site analysis. Fisher’s exact test was used for the enrichment tests. We defined significant enrichment as FDR < 0.05 and odds ratio (OR) > 1.
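The enrichment test can be illustrated with a small, self-contained Python sketch of Fisher's exact test on a 2×2 table. Several points here are assumptions rather than details from the analysis: the function name, the layout of the table, the use of a one-sided (enrichment) p value, and the toy counts.

```python
from math import comb

def fisher_enrichment(a, b, c, d):
    """One-sided Fisher's exact test for enrichment on the 2x2 table

                     in target set   not in target set
        in query set       a                b
        not in query       c                d

    Returns (sample odds ratio, P[overlap >= a] under the hypergeometric null).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    odds = float("inf") if b * c == 0 else (a * d) / (b * c)
    return odds, tail / total


# Toy counts (hypothetical): 3 of 4 query genes fall in a 4-gene target set
# drawn from a universe of 8 genes.
odds_ratio, p_value = fisher_enrichment(3, 1, 1, 3)
```

For genome-scale counts an optimized library routine would normally be preferred over exact binomial coefficients; the sketch is only meant to make the tested quantity explicit.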

Psychiatric disorder-related signals

The psychiatric risk gene sets or variants were collected from publications and databases (Table S1). For SNP analysis, we used the latest GWAS results of SCZ [50], ASD [51], and MDD [52]; for CpG analysis, we collected EWAS of SCZ [19, 53]; for gene analysis, we collected genes from multiple resources, which were classified into 36 categories. The gene identifiers were converted to Ensembl Gene IDs in Gencode (GRCh38.p12) using BioMart (https://useast.ensembl.org/index.html).

  1. For ASD gene sets, using studies on genetics, differential expression, and co-expression, we examined (1) genes with rare, de novo, loss of function or missense single-nucleotide variants from the NP de novo database [54]; (2) FMRP (Fragile X mental retardation protein) binding targets [55]; (3) candidate genes from the gene reference resource for ASD research database, AutDB [56]; (4) differential expression genes from a recent meta-analysis [57] and the PsychENCODE project [58]; (5) two ASD-associated co-expression modules in postmortem cortex from subjects diagnosed with ASD [59], three ASD-associated co-expression modules from a subsequent RNA-seq study by Gupta et al. [60], and six ASD-associated co-expression modules reported by Parikshak et al. [61].

  2. For SCZ gene sets, we examined (1) genes affected by copy number variants (CNVs) [62]; (2) genes identified by linkage and association study [63,64,65]; (3) genes with de novo variants from NP de novo database [54]; (4) genes identified by convergent functional genomics (CFG) [66]; (5) genes identified by Sherlock integrative analysis [67, 68]; (6) genes identified by Pascal gene-based test [67]; (7) genes expressed differentially in SCZ [57, 58]; and (8) two SCZ-associated co-expression modules [69].

  3. For MDD gene sets, we examined only genes expressed differentially in MDD [57].

Prioritize the sex-differential psychiatric genes

To prioritize psychiatric candidate genes that are also related to sex bias, we integrated the multiple layers of sex-related genes with the multiple sources of psychiatric-related genes (Fig. 1d). The sex-related layers comprised four types: DMR genes, DMP-correlated genes, sex-differentially expressed genes, and genes in the sex-differential network. We identified genes as sex-related psychiatric genes (SRPGs) by counting the recurrence of a gene in each category. We developed a generalized score to rank disease-related genes, calculated by multiplying the number of times a gene occurred in the sex-related gene categories by the number of times the same gene occurred in the psychiatric disease categories.
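The generalized score is a product of two counts, which the following Python sketch makes explicit. The function name, the category labels, and the gene names are hypothetical placeholders chosen for illustration; ties are broken alphabetically here, which is also an assumption.

```python
def srpg_scores(sex_sets, disease_sets):
    """Score = (# sex-related categories containing the gene)
             x (# psychiatric categories containing the gene).

    Only genes present in at least one category of each kind are returned,
    sorted by decreasing score and then by name.
    """
    sex_genes = set().union(*sex_sets.values())
    disease_genes = set().union(*disease_sets.values())
    scores = {}
    for gene in sex_genes & disease_genes:
        n_sex = sum(gene in members for members in sex_sets.values())
        n_dis = sum(gene in members for members in disease_sets.values())
        scores[gene] = n_sex * n_dis
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy categories with placeholder gene names.
sex_sets = {
    "DMR": {"GENE_A", "GENE_B"},
    "DMP_correlated": {"GENE_A"},
    "sex_DE": set(),
    "PPI_network": {"GENE_C"},
}
disease_sets = {
    "ASD_de_novo": {"GENE_A", "GENE_C"},
    "SCZ_DE": {"GENE_A", "GENE_B"},
    "MDD_DE": {"GENE_D"},
}
ranking = srpg_scores(sex_sets, disease_sets)
```

In the toy data, GENE_A appears in two sex-related and two disease categories and therefore scores 4, while GENE_D has no sex-related support and is excluded.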

Functional enrichment

The R package missMethyl [70], which can adjust for the different number of probes per gene (also called selection bias), was used to identify the functionally enriched pathways for DMPs and DMRs. We used WebGestalt [71] and WebGestalt-KEGG pathway [71] for functional enrichment tests of psychiatric disorder-related genes and SRPGs, respectively. The minimum number of Entrez gene IDs in a category was set to 5, and the maximum to 2000. Genome-expressed genes were used as the reference. The Benjamini–Hochberg procedure was used to correct for multiple testing. We defined the significance threshold as an adjusted p value <0.05.
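The Benjamini–Hochberg adjustment used throughout can be summarized in a few lines. The Python sketch below is a generic implementation of the step-up procedure written for illustration; the function name and the example p values are assumptions and do not come from the enrichment tools named above.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four pathways.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = bh_adjust(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With these four made-up p values, all adjusted values stay below 0.05, so every pathway would pass the stated threshold.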

Results

Sex-differential DNA methylated positions and regions

We identified 20,450 DMPs significantly associated with sex (FDR < 0.05) in DLPFC from 166,022 CpG sites (Fig. 2, Fig. S1, Table S2). For convenience of classification, we named DMPs with higher methylation in females than in males as hypermethylated, and hypomethylated otherwise. Of those 20,450 hits, 75.39% of DMPs were mapped to autosomes, which contained 8693 hypomethylated DMPs (56.39% of the 15,417 DMPs on autosomes). A further 26.50% of DMPs mapped to the X chromosome, which contained 1530 hypomethylated DMPs (28.23% of the 5419 DMPs on the X chromosome).

Fig. 2

Significance and difference of sex-differential DNA methylated positions. a Chromosome density plot of sex-differential DNA methylated positions, colored by the −log p value in 1 MB window size; b distribution of the effect size of DMPs (variation between male and female average methylation levels). The violin plots show two DMP examples

The DMPs were well replicated in the two independent replication datasets. In the prefrontal cortex replication dataset (GSE74193), 86.8% of autosomal DMPs were replicated (FDR < 0.05) and 92.8% of X chromosome DMPs were replicated; all of them were consistent in direction with the discovery dataset. In the replication dataset of multiple brain regions (GSE64509), 72.8% of autosomal DMPs were replicated (FDR < 0.05), and 98.6% of those had the same direction as in the discovery dataset, while 95.9% of X chromosome DMPs were replicated, all with the same direction as in the discovery dataset.

There were 2428 sex-differential DMRs mapped to 2513 genes (Table S3), comprising 1085 genes with only hypermethylated DMPs (DMR_hyper), 1351 genes with only hypomethylated DMPs (DMR_hypo), and 77 genes with both hypermethylated and hypomethylated DMPs (DMR_both). The DMR genes were strongly enriched for gene sets related to neuronal function or potentially related to psychiatric diseases, such as axon guidance (Benjamini–Hochberg adjusted p value (adj.p) = 2.04e−07), MAPK (adj.p = 2.27e−05), and calcium signaling (adj.p = 4.95e−05) (Table 1).

Table 1 DMRs mapped genes-enriched KEGG pathways (Top 10)

Regulatory networks related to sex-differential DNA methylation

To comprehensively understand the regulatory network of DNA methylation, we considered upstream genetic regulators, downstream gene expression and the PPIs which may be influenced by methylation difference. The genetic regulators were defined by meQTL and the downstream gene-expression analysis was based on the correlation between methylation and gene expression.

For upstream regulation, we used meQTLs to study the relationship between genetic variants (SNPs) and DMPs. We started by overlapping the meQTL data of ROSMAP and GSE74193 to obtain a list of meQTLs with good reproducibility, which included 434,312 meQTL pairs (253,471 SNPs and 45,049 CpGs) that were significant (FDR < 0.05) in both datasets. From the reproducible meQTLs, we found 22,782 sex-related meQTLs (SNP–DMP pairs) that included 2644 DMPs (12.9% of the 20,450 DMPs) associated with 18,349 SNPs (Table S4). These results indicate that 12.9% of DMPs were regulated by genetic variants.

For target gene expression, we performed correlation analysis between methylation and gene expression in the DLPFC data from ROSMAP. The correlation test of the 20,450 DMPs with the expression of nearby genes (within 10 kb) showed that 1363 DMPs had a significant correlation with 627 genes (FDR < 0.05), forming 1525 DMP–gene pairs (Table S5). These results show that 6.7% of DMPs may influence gene expression.

We further used the DMP as a linker between SNPs and genes, and built 3161 SNP–DMP–gene groups, containing 2054 SNPs, 276 DMPs, and 200 genes (Table S6). These SNP–DMP–gene groups connected the genetic variants to gene expression through sex-differential DNA methylation. For example, rs10143703 can regulate cg04842215, and methylation of cg04842215 correlated with expression of CBLN3 (Fig. 3a).

Fig. 3

Sex-differential regulatory network. a Example of SNP–DMP–Gene groups. There are 12 SNP–DMP–Gene groups in this region on Chromosome 14: 24,895,387–24,912,111, involving two SNPs, five DMPs and three genes. The diagram shows their locations, while the cartoon diagram shows their relationships. The gray lines represent meQTLs with FDR < 0.05. The blue lines represent negative correlations and the red lines represent positive correlations. b Example of sex-differential PPI subnetworks. Every node represents a gene. The color of a node represents the differential methylation level in the corresponding promoter (Yellow: hypermethylated in females; Blue: hypomethylated in females). The edges were built based on the protein–protein interactions in Pathway Commons. The width of an edge reflects the estimated effect size. Stars represent the candidate genes (Green: ASD candidate genes, Red: SCZ candidate genes, Purple: both ASD and SCZ candidate genes)

We further extended the regulatory network of DNA methylation by adding PPIs. The expression of many genes may not be influenced by sex directly, but they interact with differentially expressed genes to execute their specific functions. To retrieve these related genes, we obtained 19 PPI subnetworks (Fig. 3b, Fig. S2) that were related to our sex-differentially methylated DNA. For example, promoters of FOXO4, FTL, BRF2, GREB3L3, and TBCB in these subnetworks exhibited hypermethylation in females (Fig. 3b). In contrast, GADD45A, AKT2, TRO, and RNF220 exhibited hypomethylation in females. Many genes in these subnetworks did not show sex differences, such as CEBPG, CREB3L1, HECW1, TSPYL5, PLEKHO1, USP7, ARNT2, NPAS4, CAT, FOXO3, FOXG1, RBL. Through interaction with genes that show sex-biased methylation, these genes that do not themselves show sex differences may function in a sex-differential way.

Overrepresentation of psychiatric disease signals in sex-differential loci

To learn whether psychiatric disorder-related genes show sex-differential methylation or regulation, we tested for enrichment between the signals related to sex-differential methylation networks and genetic signals associated with psychiatric disorders. Focusing on SCZ, ASD, and MDD, we found significant enrichment at the SNP, methylation site, gene, and network levels.

At the SNP level, we compared the SNPs that regulate DMPs through meQTLs with the GWAS SNPs associated with SCZ, ASD and MDD. We extracted 9138 SNPs associated with SCZ at p < 5.00e−08 from PGC [72] and found 63 SNPs that regulated DMPs. These SNPs were more likely to regulate DMPs (63 of 9138 compared with the background 18,349 of 8,379,106, odds ratio for enrichment (OR) = 3.15, p = 8.47e−15). For ASD, 93 SNPs were extracted at p < 5.00e−08 from PGC [51], but none of them regulated DMPs. For MDD, we extracted 912 SNPs associated with MDD at p < 5.00e−08. We did not observe enrichment of MDD-associated SNPs among those that regulate DMPs. However, one SNP, rs61990288, which was associated with MDD, was also a meQTL SNP that regulated a DMP.
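As a rough arithmetic check, the SCZ counts quoted above can be turned into an effect size in two ways. How the contingency table was constructed is not stated in the text, so the layout in the Python sketch below (in particular, treating the GWAS SNPs as a subset of the background) is an assumption, and the variable names are illustrative only.

```python
# Counts quoted in the text for SCZ.
gwas_snps = 9138          # genome-wide significant SCZ SNPs
gwas_meqtl = 63           # of those, SNPs that regulate a DMP
all_snps = 8_379_106      # background SNPs
all_meqtl = 18_349        # background SNPs that regulate a DMP

# (a) Ratio of the two proportions exactly as quoted.
ratio_of_proportions = (gwas_meqtl / gwas_snps) / (all_meqtl / all_snps)

# (b) Cross-product odds ratio, ASSUMING the GWAS SNPs are a subset of the
#     background so that the four cells are disjoint.
a = gwas_meqtl                      # GWAS SNP, regulates a DMP
b = gwas_snps - gwas_meqtl          # GWAS SNP, does not regulate a DMP
c = all_meqtl - gwas_meqtl          # non-GWAS SNP, regulates a DMP
d = all_snps - gwas_snps - c        # non-GWAS SNP, does not regulate a DMP
cross_product_or = (a * d) / (b * c)
```

Under these assumptions the first quantity rounds to 3.15 and the second to 3.17, so both constructions indicate roughly a threefold enrichment.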

At the CpG level, we compared our sex-related DMPs with the EWAS results of Jaffe et al. [19] (n = 750 samples), who tested SCZ brains. Using an EWAS p value less than 5.00e−05 as the cut-off, we extracted 1059 CpG loci associated with SCZ. Among these 1059 CpG sites, 21 were associated with SCZ and also showed sex-differential methylation. We did not find enrichment of sex-differential DMPs (OR = 0.16, p = 1.68e−35). We also used SCZ EWAS results from Hannon et al. [53], who quantified DNA methylation from blood samples. The DMPs were not enriched for CpGs associated with SCZ (OR = 0.01, enrichment p = 4.39e−05), but 81 CpGs were associated with SCZ and showed sex-differential methylation.

To determine whether genes associated with ASD, SCZ, and MDD show sex-related differential methylation, we collected disease candidate genes that covered genetics, differential expression, and co-expression studies (Fig. 4a, Table S7). For ASD-related gene analysis, we observed significant enrichment of DMR genes with ASD-related risk genes with loss of function de novo variants (OR = 1.77, p = 5.44e−3), the FMRP gene set (OR = 1.71, p = 3.00e−9) and candidate genes from AutDB (OR = 2.86, p = 2.84e−10), but genes with missense de novo mutations were not enriched in DMR genes (OR = 1.34, p = 5.37e−2). DMR genes were also enriched in ASD-related differentially expressed gene sets and co-expression modules. For SCZ-related gene analysis, we observed significant enrichment of DMR genes with missense de novo mutation genes, loss of function de novo mutation genes, differentially expressed SCZ genes, and SCZ-associated co-expression genes. However, in contrast with ASD, the DMR genes were not enriched in genes identified by linkage [63,64,65], Sherlock [67, 68], Pascal [67], and CFG [66] in SCZ. For MDD, we did not find enrichment of DMR genes with differentially expressed or co-expressed genes.

Fig. 4

A comprehensive overrepresentation of psychiatric candidate gene sets in sex-biased genes. a Overrepresentation of psychiatric candidate gene sets in DMR genes, DMP-correlated expressed genes, differentially expressed genes, and PPI network genes, clustered by the enrichment value. b Overrepresentation of psychiatric candidate gene sets in DMR genes and subsets of DMR genes. The x-axis shows 34 gene sets divided based on the psychiatric disorder and labeled by type; the y-axis shows the DMR genes, DMP-correlated expressed genes and differentially expressed genes. The color of the box shows the odds ratio for enrichment (red for enrichment, blue for depletion). “*” indicates enrichment is statistically significant (p < 0.05), “**” indicates p < 0.001

To take the direction of the DMR into account, we tested for enrichment of disease-related genes in the DMR_hyper genes, DMR_hypo genes and DMR_both genes (Fig. 4b, Table S7). We found that upregulated differential expression gene sets and upregulated co-expression gene sets in ASD were enriched in DMR_hyper genes, whereas downregulated differential expression gene sets and downregulated co-expression gene sets in ASD were enriched in DMR_hypo genes. In contrast, among the MDD gene sets, we found that downregulated differential expression genes were enriched in DMR_hyper genes.

We also tested the enrichment of DMP-correlated expression genes and sex-differentially expressed genes (Table S7, Fig. 4a) against genes associated with ASD, SCZ, and MDD. By clustering analysis, we found that DMP-correlated genes had an enrichment pattern similar to that of DMR genes, whereas the sex-differentially expressed genes did not.

While some sex-differentially methylated genes are also risk genes, other risk genes may interact with sex-differential genes through protein networks. Differential methylation can impact the functions of both types of genes at the system level. The ASD- and SCZ-related genes were mapped to the PPI subnetworks that exhibit sex-related differential DNA methylation. The sex-differential PPI subnetworks connected the sex-differential genes to psychiatric disorder candidate genes. For example, in the sex-differential network with FOXO4 (itself a sex-differential gene) as the hub, FOXO4 interacts with 29 other genes, 20 of which were ASD-related (Fig. 3b). Even though these genes were not themselves sex-differentially methylated, their functions may be affected by the sex-dependent patterns of the network.

Prioritize the psychiatric risk genes that involve sex bias

Since enrichment of sex-related genes was observed among psychiatric disorder-associated genes, we attempted to identify specific risk genes that are under sex-dependent regulation. We defined SRPGs as genes that were associated with sex at least once and associated with at least one of the psychiatric disorders (SCZ, MDD, or ASD) (Table S10). For example, the complexin/synaphin gene CPLX1 was a DMR gene, and its expression level correlated with a DMP. CPLX1 was related to ASD and SCZ in multiple studies involving genetic variants [55] and co-expression changes in postmortem brain of ASD patients [57, 59, 60]. Therefore, CPLX1 was an SRPG.

Of the 13,055 studied genes, we identified 2080 SRPGs (1498 ASD-related, 1349 SCZ-related, and 51 MDD-related). These genes were a subgroup of psychiatric disorder genes that were enriched in synapse and signaling pathways (Table 2, Table S11). Of the 1498 SRPGs related to ASD, 98 genes were associated with sex-differential features and ASD-associated features in several analyses. The top ten ranked SRPGs for ASD were CPLX1, HEBP2, SYP, CD99L2, ZC3HAV1, SAT1, HECW1, TRO, CD40, STS, and NRXN3. Among the 1349 SCZ-related SRPGs, 55 genes were supported by multiple lines of disease risk and differential methylation data. The genes ANOS1, MAGI2, CHRDL1, GNG12, MSL3, SMC1A, ITM2A, PLS3, CDK16, ZC3HAV1, and UBTF were ranked in the top ten. Eight genes (AR, WWC3, NOS1, PAX8, GRB7, SYTL1, CLIC6, BEGAIN) of the MDD-related SRPGs were supported by multiple data. Functional enrichment tests showed that the SRPGs with more than two associations with psychiatric disease and sexual differences (n = 653) were enriched in synapse-related pathways such as dopaminergic synapse (adj.p = 2.3e−4), glutamatergic synapse (adj.p = 2.9e−4), GABAergic synapse (adj.p = 2.9e−2), and serotonergic synapse (adj.p = 3.8e−2). These SRPGs were also enriched in signaling pathways such as the cAMP (adj.p = 5.3e−3), calcium (adj.p = 2.8e−2), MAPK (adj.p = 2.9e−2), and FoxO (adj.p = 3.8e−2) pathways (Table 2).

Table 2 Prioritized genes enriched KEGG pathway

Discussion

We identified sex-differential DNA methylation and regulatory networks in one of the largest studies of postmortem human brain tissue to date. Thousands of sex-differential DMPs and DMRs were identified and replicated. Regulatory networks that connect the DMPs with SNPs, gene expression, and PPI were built up. Enrichment of psychiatric disease-associated genes in DMPs, DMRs, and networks was detected.

To assess the consistency between our findings and prior results on sex-differential DNA methylation, we compared DMPs in the current analysis with five relevant publications (Table 3). These studies differed from ours either in DNA methylation analysis platform (27K in McCarthy et al. [26]), tissue types (cord blood in Yousefi et al. [20] and whole blood in Singmann et al. [22]), or subjects’ age range (fetal brain in Spiers et al.). The sample size in the current study was much larger than in previous studies. Our results replicated from 10.9 to 45.3% of the probes that passed QC. In total, 68.4% of our DMP results are novel findings. These novel findings were based on our strict criteria, which retained only the high-quality probes [39] and controlled for potential artifacts such as batch effects, position effects, and cell-type composition.

Table 3 Comparison of DMPs in autosome with other published studies

Our data show that sex-differential genes are enriched in pathways known to be important in neurons, including axon guidance, MAPK signaling, and calcium signaling. These pathways have been previously suggested as being involved in psychiatric risk. For example, axon guidance pathways strongly influence human speech and language, and deficits in language and communication are hallmarks of ASD [73]. The MAPK signaling pathway is reported to determine depression-like behavior and anxiety [74], which may contribute to the different prevalence of MDD between males and females. Calcium signaling pathways regulate many neural functions, including the generation of brain rhythms, information processing and changes in synaptic plasticity [75]. Dysregulation of calcium signaling pathways has been implicated in the development of psychiatric diseases such as SCZ [75]. The discovery that sex-differential genes are enriched in these important pathways may help us to better understand the sex-related mechanisms underlying psychiatric disorders.

Our study curated a regulatory system related to sex-differential DNA methylation, which supports our first hypothesis that sex differences exist in both DNA methylation and its regulatory network. For each of the DMPs, putative upstream genetic regulators and downstream target genes were identified by connecting DMPs with meQTL, genes, and PPI. This yields a more complete picture of the biological system that either contributes to or is affected by sex differences, and of its potential involvement in disease risk.

We conducted a comprehensive comparison between the sex-differential methylation regulation system and psychiatric disorder risk factors, providing support for our second hypothesis that psychiatric disorder-related genes have different methylation levels between males and females. We took advantage of numerous types of data, including GWAS, rare variant studies, EWAS, and differential expression and co-expression studies, to capture different aspects of genetic, environmental, or genetic–environmental interaction effects and to provide new insights into disease etiology. We found that the methylation regulation systems that differ between males and females were enriched in these different types of psychiatric risk factors.

We found that common variants regulating DMPs were enriched in SCZ GWAS signals, which expands the multiple liability model. The multiple liability model assumes that the same genetic variants have the same effect in males and females. Our study demonstrated that the same genetic variants have different influences on DNA methylation in males and females. For example, SCZ GWAS signals such as rs4702 and rs12332385 can regulate sex-related DMPs through meQTL. Therefore, rather than only counting the accumulation of risk alleles, the multiple liability model should include both the number of risk alleles and their effect sizes to explain the sex bias of the disorders. Besides the common variants from GWAS, we observed that sex-differential DMR genes were enriched in genes carrying de novo mutations related to SCZ and ASD, which provides evidence that the rare-variant genes contributing to SCZ and ASD show a sex-differential methylation pattern in DLPFC. We identified EWAS signals showing sex-differential methylation, suggesting that the baseline methylation level of these SCZ EWAS signals differs between males and females. Among the downstream genes, we found DMP-correlated genes enriched in candidate genes of ASD, SCZ, and MDD. For example, genes differentially expressed in MDD were significantly enriched in sex-related DMR_hyper genes. Four CpG sites (cg22466678, cg15296664, and cg08802841 in intergenic regions, cg20722088 in the 3′UTR) of the DUSP6 gene show hypomethylation in females. DUSP6 has been reported to be a female-specific hub gene that influences stress susceptibility in females [76].
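The enrichment statements above all reduce to asking whether two gene sets overlap more than chance would predict. The sketch below illustrates one common way to test this, an upper-tail hypergeometric test. It is an assumption made for illustration: this section does not state which enrichment statistic was used, and the function names and toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_a_size, set_b_size, overlap):
    """Upper-tail hypergeometric p-value: P(X >= overlap).

    X is the overlap between a fixed set A and a set of size
    set_b_size drawn at random from the gene universe.
    """
    max_overlap = min(set_a_size, set_b_size)
    total = comb(universe_size, set_b_size)
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrichment(universe, dmr_genes, risk_genes):
    """Return (overlap count, p-value), restricting both sets to the universe."""
    universe = set(universe)
    a = set(dmr_genes) & universe
    b = set(risk_genes) & universe
    k = len(a & b)
    return k, hypergeom_enrichment_p(len(universe), len(a), len(b), k)


# Toy data only (hypothetical gene labels, not results from this study).
toy_universe = [f"g{i}" for i in range(20)]
toy_dmr = ["g0", "g1", "g2", "g3", "g4"]
toy_risk = ["g0", "g1", "g2", "g3", "g10"]
toy_overlap, toy_p = enrichment(toy_universe, toy_dmr, toy_risk)
print(toy_overlap, toy_p)
```

The choice of gene universe (for example, all genes covered by probes that passed QC) strongly affects the p-value, so it should be fixed before any comparison is made.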

One of the most interesting findings in our study comes from the comparison of sex-differential methylation and expression results with ASD risk genes. We observed enrichment of sex-differential methylation (DMR genes and DMP-correlated genes) in both ASD-risk genes and ASD-related pathways, suggesting that the ASD-risk genes may contribute to the sex difference of the disease through DNA methylation but not gene expression. This result extends the findings of Werling et al. [6], who reported that an ASD-related pathway, but not the ASD-risk genes, was enriched in sex-differentially expressed genes. Consistent with that report, ASD-related differentially expressed genes and co-expressed genes, but not the ASD-risk genes carrying genetic variants, were enriched in sex-differentially expressed genes in the current study. The methylation signal, in contrast, covered both ASD-risk genes and ASD-related pathways. One possible explanation is that DNA methylation, as the upstream regulator, is more sensitive than gene expression [32]. We found that the DMP-correlated genes are more likely to be differentially expressed, which provides indirect support for this explanation. Therefore, comprehensive analyses that combine methylation and gene expression are crucial to reach a better understanding of complex diseases and their sex differences.

We found that genes with loss-of-function de novo mutations in ASD are enriched for DMR genes, whereas genes with missense de novo mutations are not. Although both loss-of-function and missense de novo mutations are associated with ASD, these differing enrichment results prompt us to consider genic intolerance. Genic intolerance is a quantitative assessment of how well genes tolerate functional genetic variation on a genome-wide scale [77]. Genes with de novo mutations in ASD are generally intolerant genes with important functions. It is possible that methylation levels are strictly regulated in such intolerant genes, which results in smaller variation across individuals and allows the sex difference to reach statistical significance. In the genic intolerance analysis, we compared the residual variation intolerance score (RVIS) of DMR genes and non-DMR genes and found that the DMR genes have a significantly lower mean RVIS (mean DMR = −0.13, mean non-DMR = 0.01, p = 1.754e−10), which indicates that the DMR genes tend to be intolerant genes with important functions (Supplementary method).
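A comparison of mean RVIS between two gene groups can be illustrated with a simple two-sided permutation test. This is only a sketch under stated assumptions: the test actually used is described in the Supplementary method and is not reproduced here, the function name is hypothetical, and the scores below are invented toy values rather than data from this study.

```python
import random
from statistics import mean


def permutation_test_mean_diff(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns (observed mean difference a - b, permutation p-value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented toy RVIS-like scores (lower = more intolerant); not study data.
toy_dmr_scores = [-0.5, -0.4, -0.3, -0.6, -0.2, -0.45, -0.35, -0.55]
toy_non_dmr_scores = [0.1, 0.2, 0.0, 0.3, 0.15, 0.05, 0.25, 0.12]
toy_diff, toy_p = permutation_test_mean_diff(
    toy_dmr_scores, toy_non_dmr_scores, n_permutations=2000
)
print(toy_diff, toy_p)
```

A negative difference means the first group has the lower mean score, which under the RVIS convention corresponds to greater intolerance.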

Most notably, we found that upregulated genes in ASD were enriched in hypermethylated DMR genes in females. Hypermethylation may result in lower gene expression, which means that, compared with males, females have a lower expression level of the genes that are upregulated in ASD. In other words, the relative change in gene expression required for females to reach an ASD diagnosis is larger than for males, which can explain the different prevalence between males and females. In contrast, the downregulated genes in MDD were enriched in hypermethylated genes in females, which means that the relative change required to reach an MDD diagnosis is smaller for females than for males. These results provide compelling evidence for the multifactorial model, which hypothesizes that sex-specific genetic and environmental factors in the sex with lower incidence shift its total liability distribution away from the diagnostic threshold.
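The liability-threshold reasoning above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that liability is normally distributed with the same variance in both sexes and that diagnosis corresponds to exceeding a fixed threshold; the means, the threshold, and the function names are hypothetical and are not estimates from this study.

```python
from statistics import NormalDist


def prevalence(mean_liability, threshold, sd=1.0):
    """Fraction of a group whose liability exceeds the diagnostic threshold."""
    return 1.0 - NormalDist(mu=mean_liability, sigma=sd).cdf(threshold)


def change_needed(mean_liability, threshold):
    """Liability change a person at the group mean needs to reach the threshold."""
    return threshold - mean_liability


# Hypothetical values: group B's mean liability sits farther from the threshold.
THRESHOLD = 2.0
group_a_mean = 0.0
group_b_mean = -0.3

for label, mu in (("group A", group_a_mean), ("group B", group_b_mean)):
    print(label, change_needed(mu, THRESHOLD), prevalence(mu, THRESHOLD))
```

Under these assumptions, the group whose distribution is shifted away from the threshold needs a larger change to reach diagnosis and shows a lower prevalence, which is the qualitative pattern the multifactorial model predicts for the sex with lower incidence.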

We prioritized psychiatric genes related to sex bias and highlighted important pathways that were sex-differential and related to psychiatric disorders, including important psychiatric disease candidate genes such as NRXN1, NRXN2, NRXN3, PDE4A, and SHANK2. Our study suggests that synapse-related pathways and several signaling pathways differ by sex and may be disrupted in psychiatric disorders. For example, dopaminergic, glutamatergic, and GABAergic synapses, all suspected of being involved in psychiatric disorders, differ between males and females. Studies targeting these genes and pathways should take sex into account in design and analysis, and may reveal the biology that drives sex-related features of the disorders.

The current study has several limitations. DNA methylation exhibits spatiotemporal patterns that cannot be fully captured. Our analyses utilized gene methylation and expression data from the human adult prefrontal cortex. Other brain regions known to be robustly sex-differential, such as the hypothalamic nuclei, were not represented in our data sets. The current study used bulk tissues, not specific cell types, and expression of genes related to psychiatric disorders may vary among brain cell types. Cell-type-specific studies based on single-cell or deconvoluted data should be explored in the future. The majority of our discovery samples were from an older, post-reproductive population. This age range does not coincide with the typical age of onset for the major psychiatric disorders. Samples from children, adolescents, and young adults need to be explored in the future.