Introduction

Bipolar disorder (BD) is a severe chronic mood disorder characterized by episodes of mania, hypomania, and alternating or intertwining episodes of depression, with a worldwide prevalence of ~1% [1,2,3]. Acute bipolar mania can be a medical emergency, often leading to psychiatric hospitalization to protect individuals from hyperactive and impulsive activity, and sometimes involving the intervention of law enforcement agencies responding to dangerous behavior [1,2,3].

Pharmacotherapy is one of the main treatments for acute bipolar mania [1,2,3]. Recent guidelines recommend various second-generation antipsychotics (SGAs), lithium, and valproate as first-line monotherapy for adults with acute mania [4,5,6]. The acute mania sections of these guidelines developed their evidence-based recommendations by citing two important network meta-analyses [7, 8]. However, clinical trials of some newer drugs have been conducted for individuals with acute mania since the publication of these meta-analyses. Moreover, these network meta-analyses did not evaluate the following important outcomes: clinical remission, efficacy for psychotic symptoms, and the risk of individual adverse events. Therefore, we conducted a systematic review and network meta-analysis for 21 outcomes related to the efficacy, acceptability, tolerability, and safety of 23 drugs in the treatment of adults with acute bipolar mania.

Materials and methods

This study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines [9] (Supplementary Table S1) and was registered on the Open Science Framework (https://osf.io/tcd9a/). At least two authors double-checked the literature search, data transfer accuracy, and calculations.

Search strategy and inclusion criteria

Detailed information about the search strategy is shown in Supplementary Fig. S1. The inclusion criteria for studies were as follows: (1) published and unpublished randomized controlled trials (RCTs) of oral monotherapy lasting for ≥10 days, (2) studies of adults with acute bipolar mania, and (3) double- and single-blind studies. The exclusion criteria were as follows: (1) open-label studies, (2) studies in which selection bias was evaluated as high risk according to the Cochrane risk of bias (ROB) criteria [10], (3) studies including children/adolescents with mania, (4) studies that included individuals with a dual diagnosis of BD and other disorders, (5) studies that allowed antipsychotics as a rescue medication during a trial, and (6) studies that terminated early without efficacy analysis. We searched PubMed, the Cochrane Library, and Embase databases for studies published before March 14, 2021.

Data synthesis, outcome measures, and data extraction

The primary outcomes for efficacy and acceptability were response to treatment and all-cause discontinuation, respectively. The secondary outcomes were improvement of mania symptoms and discontinuation due to inefficacy. Other outcomes included clinical remission, improvement of psychotic symptoms, discontinuation due to adverse events, discontinuation due to withdrawal of consent, depression, and individual adverse events. We targeted outcome assessments at 3 or 4 weeks. For studies without 3- or 4-week data, we used data from the time point closest to 3 weeks within the range of 10 days to 12 weeks. All flexible-dose studies were included because they allow investigators to titrate to the optimum dose for each individual. Fixed-dose studies that used the dose recommended for mania treatment according to recent treatment guidelines were also included [5]. For drugs for which the recommended dose was not stated, we included fixed-dose studies that employed clinically used doses [11]. As the therapeutic dose of nonpsychotropic drugs for mania (e.g., tamoxifen) was unknown, all treatment arms of these drugs were included. For studies involving two or more treatment arms of the same drug with different doses, data from the treatment arms were pooled for analysis, provided that they were administered within a therapeutic dose range [5, 11].
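The time-point rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_assessment_day`, the preference for 3-week over 4-week data when both are reported, and the tie-break toward the earlier assessment are hypothetical choices that the text does not specify.

```python
from typing import Iterable, Optional

TARGET_DAYS = 21               # 3 weeks
MIN_DAYS, MAX_DAYS = 10, 84    # 10 days to 12 weeks


def pick_assessment_day(available_days: Iterable[int]) -> Optional[int]:
    """Return the assessment day to extract, or None if no day is eligible."""
    eligible = [d for d in available_days if MIN_DAYS <= d <= MAX_DAYS]
    if not eligible:
        return None
    # Use 3- or 4-week data when reported
    # (assumption: 3 weeks is taken before 4 weeks if both exist).
    for preferred in (21, 28):
        if preferred in eligible:
            return preferred
    # Otherwise take the point closest to 3 weeks
    # (assumption: ties go to the earlier assessment).
    return min(eligible, key=lambda d: (abs(d - TARGET_DAYS), d))


if __name__ == "__main__":
    print(pick_assessment_day([7, 14, 42]))   # only day 14 and day 42 are eligible
    print(pick_assessment_day([14, 28, 56]))  # 4-week data are available
```

A study reporting only 1-week data would return `None` here, which mirrors the lower bound of 10 days in the inclusion window.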

The extracted data were analyzed on the basis of intention-to-treat or modified intention-to-treat principles. If required data were missing in the studies, we searched for the data in published systematic review articles. We also attempted to contact the original investigators to obtain unpublished data. While double-blind studies were included to avoid performance and detection bias for subjective outcomes, single-blind studies were included for objective outcomes [12].

Meta-analysis methods

Both pairwise and frequentist network meta-analyses were performed using the random-effects model [13, 14]. The risk ratio (RR) for dichotomous variables or the standardized mean difference (SMD) for continuous variables was calculated, with 95% confidence intervals (95% CI). Network heterogeneity was assessed using τ² statistics. For pairwise meta-analyses, heterogeneity was assessed using I² statistics. Statistical evaluation of incoherence was performed using the design-by-treatment test (globally) [15] and the Separate Direct from Indirect Evidence (SIDE) test (locally) [16]. To rank the treatments for each outcome, we used P-scores (Supplementary Table S1) [17]. The assumption of transitivity was evaluated by extracting potential effect modifiers (e.g., sample size, duration of study, and mean age; Supplementary Table S2) and comparing their distribution across comparisons in the network. We classified an overall ROB for every RCT based on the individual ROB items (Supplementary Fig. S2) [18]. A meta-regression analysis was performed to determine whether potentially confounding factors (e.g., publication year, mean age, number of total individuals, male individuals [%], and individuals with psychotic features [%]) were associated with the extent of the effect on primary outcomes for efficacy and acceptability. A sensitivity analysis was performed for primary outcomes, in which only half the weight was given to studies (1) with a placebo arm, (2) supported by industry sponsors, (3) without a high-quality design, (4) without 3–4-week data, (5) including individuals with rapid-cycling, (6) including individuals with mixed state/episode, (7) with a low-dose arm, and (8) that did not use the common definition of response to treatment (≥50% improvement in the mania rating scale score; this analysis was performed for the primary efficacy outcome only; Supplementary Appendix S1) [19]. Moreover, we performed additional network meta-analyses for the response to treatment along the time course lasting for 7–10 days, 3 weeks, 4–6 weeks, and 8–12 weeks to examine when the antimanic effects of these drugs appeared. Funnel plots were created to explore potential publication bias. Finally, the results were incorporated into the Confidence in Network Meta-Analysis (CINeMA) application, an adaptation of the Grading of Recommendations Assessment, Development, and Evaluation approach, to assess the credibility of the findings of each of the network meta-analyses [20,21,22].
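As an illustration of the pairwise part of the approach described above, the following Python sketch pools log risk ratios under a random-effects model and reports τ² and I². It is a minimal sketch, not the analysis code of this study: the DerSimonian-Laird estimator is assumed here because the text does not name the τ² estimator, the function names are hypothetical, the example counts are invented, and the network meta-analysis itself (which combines direct and indirect evidence) is not reproduced.

```python
import math
from typing import Dict, List, Tuple


def log_rr(events_t: int, n_t: int, events_c: int, n_c: int) -> Tuple[float, float]:
    """Log risk ratio and its variance for one two-arm study (no zero cells)."""
    y = math.log((events_t / n_t) / (events_c / n_c))
    v = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return y, v


def pool_random_effects(effects: List[float], variances: List[float]) -> Dict[str, float]:
    """DerSimonian-Laird random-effects pooling (assumed estimator)."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Invented counts: (responders on drug, n on drug, responders on placebo, n on placebo)
    studies = [(30, 100, 20, 100), (45, 150, 30, 150), (25, 80, 15, 80)]
    ys, vs = zip(*(log_rr(*s) for s in studies))
    print(pool_random_effects(list(ys), list(vs)))
```

Working on the log scale keeps the confidence interval symmetric around the pooled log RR, which is why the reported intervals for RRs are asymmetric on the natural scale.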

Results

Study characteristics

A flowchart of the literature search and a detailed explanation of the process are shown in Supplementary Fig. S1. Of the 13489 articles initially identified, 3572 were duplicates, 9835 were excluded after reviewing the titles and abstracts, and 10 were excluded after reviewing the full texts. In total, 72 articles on eligible studies were selected, and 2 articles were detected from previous review articles. Two articles each included data from two RCTs [23, 24], and one article included data from four RCTs [25]. Of 79 eligible RCTs, 5 single-blind RCTs did not report available data for performing a meta-analysis regarding objective outcomes [26,27,28,29,30]. Two double-blind RCTs did not report any available data for performing a meta-analysis [31, 32]. Finally, 72 double-blinded RCTs (n = 16442, males = 50.93%, mean age = 39.55 years, mean study duration = 3.96 ± 2.39 weeks) were included with the following treatment arms (number of studies (N)/individuals (n)): aripiprazole (9/1205), asenapine (3/620), brexpiprazole (2/321), carbamazepine (6/305), cariprazine (3/612), chlorpromazine (1/10), endoxifen (2/55), eslicarbazepine (2/148), haloperidol (10/1023), lamotrigine (3/173), licarbazepine (1/324), lithium (20/965), olanzapine (14/1565), oxcarbazepine (1/30), paliperidone (2/542), quetiapine (5/630), risperidone (7/676), tamoxifen (2/43), topiramate (4/659), valnoctamide (1/71), valproate (14/981), verapamil (1/17), ziprasidone (3/458), and a placebo (48/5009). The study characteristics are summarized in Supplementary Table S2. In addition, 56 studies were industry sponsored; 14 included individuals with rapid-cycling, and 26 excluded these individuals; 38 included individuals with mixed state/episode, and 11 excluded these individuals; and 35 included individuals with psychosis, and 4 excluded these individuals. While 21 studies were evaluated as low overall ROB, other studies were evaluated as moderate overall ROB (Supplementary Fig. S2).
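The study-flow counts reported above can be checked with simple arithmetic. The short Python sketch below uses only numbers stated in this section; the variable names are illustrative.

```python
identified = 13489
duplicates = 3572
excluded_title_abstract = 9835
excluded_full_text = 10

# Articles remaining after screening the database search results.
articles_from_search = (
    identified - duplicates - excluded_title_abstract - excluded_full_text
)
articles_from_reviews = 2
articles_total = articles_from_search + articles_from_reviews

# Two articles reported two RCTs each (+1 each); one article reported four RCTs (+3).
eligible_rcts = articles_total + 2 * (2 - 1) + 1 * (4 - 1)

single_blind_without_data = 5
double_blind_without_data = 2
included_rcts = eligible_rcts - single_blind_without_data - double_blind_without_data

# Share of included RCTs not rated as low overall risk of bias.
low_rob = 21
moderate_rob_pct = round(100 * (included_rcts - low_rob) / included_rcts, 1)

print(articles_from_search, eligible_rcts, included_rcts, moderate_rob_pct)
```

The counts reconcile: 72 articles from the search plus 2 from previous reviews yield 79 eligible RCTs, of which 72 were included, and 51 of these 72 (70.8%) were rated as moderate overall ROB.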

Network meta-analysis results

The network meta-analysis results are shown in Supplementary Appendixes S1–S21.

Response to treatment

Aripiprazole, asenapine, carbamazepine, cariprazine, haloperidol, lithium, olanzapine, paliperidone, quetiapine, risperidone, tamoxifen, valproate, and ziprasidone showed a better response to treatment than the placebo (N = 56, n = 14503; Fig. 1, Table 1); the RR (95% CI) ranged from 7.461 (1.876, 29.678) for tamoxifen to 1.281 (1.049, 1.563) for asenapine. Aripiprazole, cariprazine, and quetiapine outperformed eslicarbazepine, licarbazepine, and topiramate; asenapine, lamotrigine, paliperidone, and ziprasidone outperformed topiramate; carbamazepine outperformed asenapine, endoxifen, eslicarbazepine, lamotrigine, licarbazepine, and topiramate; haloperidol, olanzapine, and risperidone outperformed asenapine, eslicarbazepine, licarbazepine, and topiramate; lithium and valproate outperformed eslicarbazepine and topiramate; and tamoxifen outperformed all active drugs other than carbamazepine and verapamil. Global heterogeneity was low, and the network did not show significant global inconsistency. There was statistical agreement between direct and indirect estimates, except for three comparisons: aripiprazole vs. placebo (aripiprazole outperformed the placebo in both direct and indirect comparisons), paliperidone vs. quetiapine (quetiapine outperformed paliperidone in the indirect comparison but not in the direct comparison), and ziprasidone vs. placebo (ziprasidone outperformed the placebo in the direct comparison but not in the indirect comparison). No comparisons included at least 10 studies.
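To make the precision behind the two ends of the reported RR range concrete, the following Python sketch back-calculates the standard error of the log RR and the corresponding z statistic from a published RR and its 95% CI. It assumes that the intervals were computed as normal-approximation intervals on the log scale, which is the usual convention but is not stated explicitly in the text; the function name is hypothetical.

```python
import math
from typing import Tuple

Z_CRIT = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_and_z_from_ci(rr: float, low: float, high: float) -> Tuple[float, float]:
    """Recover SE(log RR) and the z statistic from an RR and its 95% CI."""
    se = (math.log(high) - math.log(low)) / (2 * Z_CRIT)
    z = math.log(rr) / se
    return se, z


if __name__ == "__main__":
    # Values as reported for response to treatment versus placebo.
    for drug, rr, low, high in [
        ("tamoxifen", 7.461, 1.876, 29.678),
        ("asenapine", 1.281, 1.049, 1.563),
    ]:
        se, z = se_and_z_from_ci(rr, low, high)
        print(f"{drug}: SE(log RR) = {se:.3f}, z = {z:.2f}")
```

Under this assumption, the standard error for tamoxifen is roughly seven times that for asenapine, so the largest point estimate in the network is also one of the least precise.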

Fig. 1: Response to treatment.

Drugs were compared with the placebo. Colors indicate the presence or absence of a significant difference: blue, the drug was superior to the placebo; black, the drug was similar to the placebo. 95% CI 95% confidence interval, ARI aripiprazole, ASE asenapine, CARB carbamazepine, CARI cariprazine, END endoxifen, ESL eslicarbazepine, HAL haloperidol, LAM lamotrigine, LIC licarbazepine, LIT lithium, OLA olanzapine, OXC oxcarbazepine, PAL paliperidone, QUE quetiapine, RIS risperidone, RR risk ratio, TAM tamoxifen, TOP topiramate, VALP valproate, VER verapamil, ZIP ziprasidone.

Table 1 Head-to-head comparisons for response to treatment (left lower half) and all-cause discontinuation (right upper half).

Meta-regression analyses showed that older studies had a higher RR for the response to treatment (Supplementary Appendix S1). Studies including more male individuals had a higher RR for the outcome (Supplementary Appendix S1). The between-study variance of these meta-regression analyses was lower than that of the primary analysis (Supplementary Appendix S1). Five sensitivity analyses (focusing on studies without a placebo arm, nonindustry-sponsored studies, studies with high-quality design, studies not including individuals with rapid-cycling, and studies not including individuals with mixed state/episode) reduced the between-study variance compared with the primary analysis (Supplementary Appendix S1). However, when compared with the placebo, the effect size for each drug on the outcome of the primary analysis was similar to that of the adjusted analyses (Supplementary Appendix S1).

Response-to-treatment data at two or more observation points were available for seven drugs (Supplementary Appendix S1). Compared with the placebo, the effect size of most drugs other than lamotrigine seemed to increase over time. However, the number of studies included in the meta-analysis at all observation points other than 3 weeks was small.

All-cause discontinuation

Compared with the placebo, aripiprazole, olanzapine, quetiapine, and risperidone had lower all-cause discontinuation (RR [95% CI] ranged from 0.647 [0.552–0.758] for olanzapine to 0.840 [0.719–0.980] for aripiprazole; N = 70, n = 16324; Fig. 2, Table 1), whereas topiramate had higher all-cause discontinuation (1.335 [1.032–1.728]). Aripiprazole, carbamazepine, haloperidol, valproate, and ziprasidone outperformed topiramate and valnoctamide; olanzapine outperformed aripiprazole, asenapine, brexpiprazole, cariprazine, haloperidol, lamotrigine, licarbazepine, lithium, topiramate, valnoctamide, valproate, verapamil, and ziprasidone; paliperidone and risperidone outperformed topiramate, valnoctamide, and verapamil; and quetiapine outperformed lithium, topiramate, valnoctamide, and verapamil. Although global heterogeneity was low, we detected significant global inconsistency. There was statistical agreement between direct and indirect estimates, with the exception of the following three comparisons: aripiprazole vs. haloperidol (aripiprazole outperformed haloperidol in the direct comparison but not in the indirect comparison), aripiprazole vs. placebo (aripiprazole outperformed the placebo in the indirect comparison but not in the direct comparison), and haloperidol vs. quetiapine (quetiapine outperformed haloperidol in the indirect comparison but not in the direct comparison).

Fig. 2: All-cause discontinuation.

Drugs were compared with the placebo. Colors indicate the presence or absence of a significant difference: blue, the drug was superior to the placebo; black, the drug was similar to the placebo; red, the drug was inferior to the placebo. 95% CI 95% confidence interval, ARI aripiprazole, ASE asenapine, BRE brexpiprazole, CARB carbamazepine, CARI cariprazine, CHL chlorpromazine, END endoxifen, ESL eslicarbazepine, HAL haloperidol, LAM lamotrigine, LIC licarbazepine, LIT lithium, OLA olanzapine, PAL paliperidone, QUE quetiapine, RIS risperidone, RR risk ratio, TAM tamoxifen, TOP topiramate, VALN valnoctamide, VALP valproate, VER verapamil, ZIP ziprasidone.

Meta-regression analyses showed that studies involving more individuals with psychotic features had a lower RR for all-cause discontinuation (Supplementary Appendix S2). The between-study variance of the meta-regression analysis decreased compared with the unadjusted analysis (Supplementary Appendix S2). In the adjusted analysis, cariprazine, haloperidol, lithium, paliperidone, tamoxifen, valproate, and ziprasidone had lower all-cause discontinuation than the placebo (Supplementary Appendix S2). For other drugs, the results of the adjusted analyses were similar to those of the unadjusted analysis (Supplementary Appendix S2).

Three sensitivity analyses (focusing on studies without a placebo arm, nonindustry-sponsored studies, and studies with high-quality design) reduced the between-study variance compared with the primary analysis (Supplementary Appendix S2). These sensitivity analyses showed that chlorpromazine and endoxifen had higher, and valproate had lower, all-cause discontinuation than the placebo (Supplementary Appendix S2). For the other drugs, the results of the adjusted analyses were similar to those of the unadjusted analysis (Supplementary Appendix S2).

Mania rating scale scores

Compared with the placebo, aripiprazole, asenapine, carbamazepine, cariprazine, haloperidol, lithium, olanzapine, paliperidone, quetiapine, risperidone, tamoxifen, valproate, and ziprasidone showed better improvement of the mania rating scale scores (N = 61, n = 15466; Fig. 3); the SMD (95% CI) ranged from −1.806 (−2.454, −1.159) for tamoxifen to −0.216 (−0.371, −0.061) for valproate.

Fig. 3: Mania rating scale scores.

Drugs were compared with the placebo. Colors indicate the presence or absence of a significant difference: blue, the drug was superior to the placebo; black, the drug was similar to the placebo. 95% CI 95% confidence interval, ARI aripiprazole, ASE asenapine, BRE brexpiprazole, CARB carbamazepine, CARI cariprazine, ESL eslicarbazepine, HAL haloperidol, LAM lamotrigine, LIC licarbazepine, LIT lithium, OLA olanzapine, OXC oxcarbazepine, PAL paliperidone, QUE quetiapine, RIS risperidone, SMD standardized mean difference, TAM tamoxifen, TOP topiramate, VALP valproate, VER verapamil, ZIP ziprasidone.

Discontinuation due to inefficacy

Compared with the placebo, aripiprazole, asenapine, carbamazepine, cariprazine, haloperidol, lithium, olanzapine, paliperidone, quetiapine, risperidone, valproate, and ziprasidone had lower discontinuation due to inefficacy, with the RR (95% CI) ranging from 0.349 (0.216–0.564) for paliperidone to 0.716 (0.534–0.961) for lithium (N = 50, n = 14284; Fig. 4).

Fig. 4: Discontinuation due to inefficacy.

Drugs were compared with the placebo. Colors indicate the presence or absence of a significant difference: blue, the drug was superior to the placebo; black, the drug was similar to the placebo. 95% CI 95% confidence interval, ARI aripiprazole, ASE asenapine, BRE brexpiprazole, CARB carbamazepine, CARI cariprazine, HAL haloperidol, LAM lamotrigine, LIT lithium, OLA olanzapine, PAL paliperidone, QUE quetiapine, RIS risperidone, RR risk ratio, TAM tamoxifen, TOP topiramate, VALN valnoctamide, VALP valproate, VER verapamil, ZIP ziprasidone.

Clinical remission and psychotic symptoms

Aripiprazole, asenapine, cariprazine, haloperidol, lithium, olanzapine, paliperidone, quetiapine, risperidone, and tamoxifen outperformed the placebo for clinical remission (N = 31, n = 9320); the RR (95% CI) ranged from 8.441 (1.116, 63.841) for tamoxifen to 1.259 (1.007, 1.576) for lithium.

Compared with the placebo, aripiprazole, cariprazine, haloperidol, olanzapine, quetiapine, risperidone, tamoxifen, and ziprasidone showed better improvement of psychotic symptoms (N = 30, n = 7029); the SMD (95% CI) ranged from −1.640 (−2.335, −0.945) for tamoxifen to −0.266 (−0.490, −0.041) for aripiprazole.

Tolerability and safety outcomes

Compared with the placebo, asenapine (RR [95% CI] = 1.896 [1.117–3.218]), haloperidol (1.867 [1.255–2.776]), and lithium (1.791 [1.093–2.936]) had higher discontinuation due to adverse events (N = 52, n = 14629), while olanzapine had lower discontinuation due to withdrawal of consent (0.643 [0.466–0.889], N = 42, n = 11968).

No drug differed significantly from the placebo in the incidence of depression (N = 19, n = 5740).

In addition, compared with the placebo, olanzapine (RR [95% CI] = 0.881 [0.800–0.971]) and quetiapine (0.767 [0.665–0.885]) were associated with a lower frequency of anxiolytic use (N = 28, n = 8082); aripiprazole, cariprazine, haloperidol, paliperidone, risperidone, and ziprasidone were associated with a higher frequency of anticholinergic use (RR [95% CI] ranged from 2.374 [1.384–4.072] for paliperidone to 6.299 [4.159–9.541] for haloperidol, N = 20, n = 6256); aripiprazole, brexpiprazole, cariprazine, haloperidol, paliperidone, risperidone, and ziprasidone were associated with a higher incidence of akathisia (RR [95% CI] ranged from 2.586 [1.188–5.631] for paliperidone to 5.579 [3.959–7.862] for haloperidol, N = 25, n = 8711); aripiprazole, asenapine, cariprazine, haloperidol, lithium, olanzapine, risperidone, and ziprasidone were associated with a higher incidence of extrapyramidal symptoms (RR [95% CI] ranged from 1.817 [1.012–3.261] for olanzapine to 5.337 [3.997–7.126] for haloperidol, N = 31, n = 9265); aripiprazole, asenapine, carbamazepine, cariprazine, haloperidol, lithium, olanzapine, paliperidone, quetiapine, risperidone, valproate, and ziprasidone were associated with a higher incidence of somnolence (RR [95% CI] ranged from 1.609 [1.055–2.453] for lithium to 5.158 [1.515–17.561] for cariprazine, N = 37, n = 10395); asenapine, carbamazepine, haloperidol, olanzapine, quetiapine, valproate, and ziprasidone were associated with a higher incidence of dizziness (RR [95% CI] ranged from 2.037 [1.334–3.110] for valproate to 3.552 [2.369–5.323] for carbamazepine, N = 33, n = 8775); carbamazepine (RR [95% CI] = 4.079 [1.109–15.010]), olanzapine (3.758 [2.147–6.577]), and quetiapine (3.630 [2.243–5.876]) were associated with a higher incidence of dry mouth (N = 16, n = 3967); aripiprazole, cariprazine, olanzapine, and quetiapine were associated with a higher incidence of constipation (RR [95% CI] ranged from 1.735 [1.152–2.613] for aripiprazole to 2.866 [1.537–5.345] for quetiapine, N = 27, n = 6670); and asenapine, olanzapine, paliperidone, quetiapine, valproate, and ziprasidone were associated with a higher incidence of weight gain (RR [95% CI] ranged from 2.928 [1.259–6.807] for ziprasidone to 8.180 [4.419–15.142] for olanzapine, N = 31, n = 8704).

Compared with the placebo, quetiapine was associated with a lower incidence of nausea (RR [95% CI] = 0.313 [0.130–0.758]), whereas aripiprazole, carbamazepine, cariprazine, lithium, risperidone, and valproate were associated with a higher incidence of nausea (N = 29, n = 7915); the RR (95% CI) ranged from 1.558 (1.164–2.085) for aripiprazole to 4.664 (1.320–16.479) for risperidone.

There were no significant differences in the incidence of headache (N = 37, n = 10330) and diarrhea (N = 20, n = 4981) between each drug and the placebo.

Heterogeneity, inconsistency, and network meta-analysis results graded using the CINeMA application

Global heterogeneity was low or low–moderate for most outcomes, moderate–high for discontinuation due to adverse events and diarrhea, and high for depression (Supplementary Appendixes S1–S21). There was considerable local heterogeneity for most of the outcomes in specific comparisons. We detected significant global inconsistency for all-cause discontinuation (as mentioned before), psychotic symptoms, discontinuation due to adverse events, and depression. The SIDE test for local inconsistency showed some hotspots: haloperidol vs. lithium, haloperidol vs. placebo, lithium vs. risperidone, risperidone vs. valnoctamide, and valnoctamide vs. placebo for psychotic symptoms; valproate vs. placebo for discontinuation due to adverse events; and olanzapine vs. placebo for depression. The proportion of comparisons with evidence of inconsistency was small for all outcomes (0.00%–20.83%). However, the within-study bias of most of the comparisons was evaluated as “Some concerns.” Moreover, because funnel plots with fewer than 10 studies are not meaningful [10], all comparisons for publication bias were evaluated as “Suspected.” Furthermore, if a comparison had only indirect evidence, it was downgraded one level. Consequently, the confidence in the evidence was generally evaluated as low or very low.

Discussion

This systematic review and network meta-analysis was conducted to compare the efficacy, acceptability, tolerability, and safety of pharmacological interventions for adults with acute bipolar mania. We included only double-blind RCTs and extended a recent study by including brexpiprazole, endoxifen, and eslicarbazepine, and by investigating many more adverse events [7]. Supplementary Fig. S3 shows two-dimensional graphs of the primary efficacy and acceptability outcomes. The agents that outperformed the placebo in the primary and secondary outcomes were aripiprazole, olanzapine, quetiapine, and risperidone. These SGAs also outperformed the placebo in terms of clinical remission and improvement of psychotic symptoms. Therefore, they appear to have a better balance of efficacy and acceptability in the treatment of acute mania than other drugs. As treatments prescribed for an acute mood episode are usually continued into maintenance treatment, clinicians and individuals with mania should consider the efficacy and safety in the maintenance phase when selecting a treatment for the acute phase [4]. A recent network meta-analysis of BD in the maintenance phase showed that these SGAs prevent the recurrence/relapse of any mood episode [33]. However, because these agents are associated with several adverse events, clinicians must monitor the health of individuals with BD who receive them.

Tamoxifen ranked best for all efficacy outcomes except for discontinuation due to inefficacy. However, our results were based on two small studies (<40 individuals in each treatment arm) [34, 35]. Tamoxifen, which is approved to treat breast cancer, has serious and specific side effects, including uterine malignancy, thromboembolic events, and embryo-fetal toxicity [11]. Therefore, large-scale trials are needed to examine the efficacy and safety of tamoxifen in individuals for whom existing treatments are ineffective.

Haloperidol and carbamazepine ranked high in most of the efficacy outcomes. However, we found that haloperidol was not well-tolerated and had a high risk of akathisia and extrapyramidal symptoms. Carbamazepine carries a risk of cutaneous adverse reactions, such as Stevens–Johnson syndrome and toxic epidermal necrolysis [11], for which the presence of a specific allele is a strong risk factor in some races/ethnicities [36].

Although our network meta-analysis confirmed that lithium and valproate were effective for mania symptoms, they ranked lower in efficacy outcomes compared with most antipsychotics. While these drugs were gradually increased according to the individual’s condition, antipsychotics were used at relatively high doses from the start of treatment (Supplementary Table S2). Moreover, we found that lithium was not well-tolerated. These factors might affect the efficacy outcomes of lithium and valproate. Meanwhile, a recent network meta-analysis demonstrated that lithium and valproate prevented the recurrence/relapse of any mood episode [33]. A Finnish nationwide cohort study also showed that these drugs prevented hospitalization [37]. Moreover, a meta-review reported that lithium had anti-suicidal effects for individuals with mood disorders [38]. Taken together, these findings suggest that lithium and valproate are still key drugs for BD treatment, although clinicians must pay close attention to the side effects of these drugs [33, 39, 40].

Although most antipsychotics improved psychotic symptoms, carbamazepine, lithium, and valproate did not. Thus, these antipsychotics should be reserved for individuals with psychotic features. Although the network showed significant global inconsistency, when performing a sensitivity analysis using only the Positive and Negative Syndrome Scale [41] data, significant global inconsistency disappeared (Supplementary Appendix S6). Furthermore, compared with the placebo, the effect size for each drug on the outcome of the primary analysis was similar to that of the adjusted analysis.

Our study had some limitations. First, 70.8% of the studies included in our meta-analysis were evaluated as moderate overall ROB. Although global heterogeneity was low for the primary efficacy and acceptability outcomes, the sensitivity analyses focusing on studies with high-quality design reduced the between-study variance for both outcomes compared with the primary analysis. Second, mixed episodes are not clustered with hypomania and mania in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) [42]. In the DSM-5, the term “mixed episode” has been changed to “mixed features.” Third, the range of the study duration included in our meta-analysis was 1–12 weeks. Thus, the long-term efficacy and safety of drugs still need to be verified. Fourth, we did not examine whether the magnitude of the placebo-response influenced our results. A recent meta-regression analysis including RCTs of antipsychotics and mood stabilizers compared with placebo demonstrated that the effect size for efficacy was influenced by the magnitude of both the drug- and placebo-response [43]. Further studies will need to explore whether there is an interaction between the drug-response and the placebo-response regarding the effect size. It will also be important to identify modifiers of the drug-response and explore how they interplay with modifiers of the placebo-response in the formation of effect sizes. Finally, we did not cover important clinical issues that might inform treatment decision-making in routine clinical practice (e.g., combination with nonpharmacological treatments and cost-effectiveness).

In conclusion, the aforementioned antipsychotics, carbamazepine, lithium, tamoxifen, and valproate were found to have efficacy for acute bipolar mania. However, only aripiprazole, olanzapine, quetiapine, and risperidone had better acceptability than the placebo. Because these agents carry the risk of several adverse events, clinicians must monitor the health of individuals with BD who receive them.