Finding discrepancies between the predictions of fundamental theories and experimental observations is the main driver of progress in physics: it is the route to more advanced theories (‘new physics’) that resolve the discrepancies. In that sense, quantum electrodynamics (QED) is currently seen as the most advanced fundamental theory, serving as the blueprint for any other quantum field theory. Progress is expected to come from ever more precise testing through comparison of theoretical predictions and experimental data. A good test compares values that can be both computed and measured with high accuracy. Some QED predictions excel in that respect, such as those for the transition frequencies of atomic hydrogen¹ and the gyromagnetic ratio of the electron².

Most theories, including QED, depend on parameters that have to be adjusted to the experimental data. For a meaningful test, the number of measurements must therefore exceed the number of parameters, so that one can obtain several independent values for each parameter. The test is passed if these values all agree within their respective uncertainties.

Precision-spectroscopy determinations and computations of transition frequencies of atomic hydrogen provide the best test for QED. The Committee on Data for Science and Technology (CODATA) regularly surveys experimental results and derives the most likely values of associated parameters. The QED expression for the hydrogen energy levels effectively comes with two parameters: the Rydberg constant R and the proton radius rp. Other parameters, such as the fine-structure constant and the electron-to-proton mass ratio, appear as well, but can be better determined from other experiments. (Note that the Rydberg constant may be expressed as a combination of other physical constants, but these are known with lower accuracy.)

Until 2010, the 15 distinct measurements of transition frequencies in atomic hydrogen as used by CODATA gave 13 value pairs for R and rp. The values were consistent and hence QED passed the test. Averaging the values for the proton radius gave rp = 0.8764(89) fm (pictured; the ‘large radius’ labelled ‘H world data’ in the figure)³. This situation changed, however, when the frequency of a particular transition (the 2s–2p transition) in muonic hydrogen was measured. Muonic hydrogen is just like regular hydrogen but with the electron replaced by its big brother, the muon⁴. With this replacement the proton-radius term in the theoretical description, and thus the sensitivity to this parameter, is seven orders of magnitude larger than for regular hydrogen. The result was a much more precise but also significantly smaller value of rp = 0.84087(39) fm (the ‘small radius’ labelled ‘Muonic hydrogen’)⁵. This meant that the QED test failed, and CODATA could not include this value in its average. The discrepancy between the small and large charge radii amounts to four combined standard deviations (4σ). This problem was dubbed the proton radius puzzle.
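The two numbers quoted above can be checked with a few lines of arithmetic. The following is only a rough sketch: it assumes that the two radius uncertainties are uncorrelated and can be added in quadrature, and it assumes the standard leading-order result, not spelled out in the text, that the finite-size shift of a level scales with the cube of the reduced mass of the orbiting particle. The mass ratios are rounded values and the function names are purely illustrative.

```python
import math

def discrepancy_in_sigma(a, sigma_a, b, sigma_b):
    """Difference of two values in units of their combined standard
    uncertainty (assumes uncorrelated uncertainties, added in quadrature)."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)

def reduced_mass(m_orbiting, m_nucleus):
    """Reduced mass of a two-body system, in the units of the inputs."""
    return m_orbiting * m_nucleus / (m_orbiting + m_nucleus)

# Proton radii quoted in the text, in fm; 0.8764(89) means 0.8764 +/- 0.0089.
R_P_HYDROGEN, SIGMA_HYDROGEN = 0.8764, 0.0089
R_P_MUONIC, SIGMA_MUONIC = 0.84087, 0.00039

n_sigma = discrepancy_in_sigma(
    R_P_HYDROGEN, SIGMA_HYDROGEN, R_P_MUONIC, SIGMA_MUONIC
)

# Rounded mass ratios in units of the electron mass (not given in the text).
MUON_MASS = 206.768
PROTON_MASS = 1836.153
ELECTRON_MASS = 1.0

# Assumption: the proton-radius term scales as (reduced mass)**3.
sensitivity_ratio = (
    reduced_mass(MUON_MASS, PROTON_MASS)
    / reduced_mass(ELECTRON_MASS, PROTON_MASS)
) ** 3

print(f"discrepancy: {n_sigma:.1f} sigma")
print(f"sensitivity gain: about 10^{math.log10(sensitivity_ratio):.1f}")
```

Under these assumptions the difference comes out at about four combined standard deviations, dominated almost entirely by the hydrogen uncertainty, and the sensitivity gain is several million, that is, close to seven orders of magnitude.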

In addition to the hydrogen data, the CODATA team uses data for the proton charge radius obtained from electron–proton scattering (point labelled ‘CODATA’). This increases the discrepancy to 5.6σ and has triggered intense discussions in the community about whether this should be seen as a hint of new physics. It should be mentioned, though, that electron–proton scattering experiments are notoriously difficult to evaluate and that values for rp from different groups disagree.

The cleaner way to test QED is to compare only quantities that should obey the same physics, namely various transitions in regular and muonic hydrogen. After publication of the muonic hydrogen results, our group remeasured one of the broader hydrogen lines with better accuracy. Our motivation was that the discrepancy with the muonic value only shows up when all available hydrogen data is averaged. Our latest result for the 2s–4p transition frequency is as accurate as the previous ‘world data’¹ and supports the ‘muonic’ proton radius (point labelled ‘MPQ 2s–4p’ in the figure). As a result, the new hydrogen ‘world data’ is no longer consistent with QED, and the mystery deepens.

It should be mentioned that one finds the same discrepancy for R (upper scale in the figure), so the issue could just as well have been termed the Rydberg puzzle. But misnomers are widespread in physics because names for phenomena are typically given before they are fully understood. In this context it is interesting to note that one obtains a similar discrepancy for the deuteron radius when comparing regular and muonic deuterium⁶. This may mean that the deuteron got smaller along with the proton inside it, but there are several other possible explanations, ranging from as yet undiscovered experimental uncertainties to computational errors in applying QED. Nobody really knows at this time.