Publication bias occurs when published studies differ systematically from all studies conducted on a topic. It arises when studies with statistically significant results, or with positive results in a specific direction, are more likely to be published than studies with non-significant or negative results. Reviewers should make all reasonable efforts to include all, or nearly all, relevant studies in their systematic review, regardless of whether the reports are published or unpublished. Publication bias can have a detrimental effect on the validity of systematic reviews (Deeks et al 2008). Generally, the best way to minimise the impact of publication bias on a systematic review is to include trial registry records and unpublished studies or grey literature (Lau et al 2006; Sterne et al 2011).

Funnel plots are a method of investigating the located studies in a meta-analysis for publication bias. They are scatter plots in which the effect estimate of each study is plotted against a measure of study size or precision (e.g. the standard error) (Deeks et al 2008). If publication bias is not present, the largest studies should lie closest to the ‘true’ value, with the smaller studies spread on either side, creating the shape of a funnel. If publication bias has affected the studies available (and there are no other confounding factors), the ‘funnel’ should be incomplete, with an area missing (Deeks et al 2008). Funnel plots suffer from numerous issues, including low power, numerous alternative explanations for an asymmetrical distribution of studies, and inaccurate researcher interpretation of plots (Lau et al 2006; Sterne et al 2011). However, they remain a useful and popular way of investigating publication bias (Deeks et al 2008).
Potential reasons for funnel plot asymmetry other than publication bias include: poor methodological quality leading to exaggerated effects in smaller studies (which can result from poor methodological design, inadequate analysis, or fraud); true heterogeneity; artefactual causes (in some situations sampling variation can produce an association between the effect estimate and the measure of size or precision); and chance (Sterne et al 2011). Visual inspection of funnel plots introduces great uncertainty and subjectivity. In a survey using simulated plots, researchers identified publication bias with only 53% accuracy (Lau et al 2006). Even by a very liberal standard, at least ten studies are needed to justify a funnel plot (Lau et al 2006).
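
As a minimal sketch of how a funnel plot is constructed, the Python below computes plotting coordinates from per-study effect estimates and standard errors. It is illustrative only. The function names `funnel_points` and `outside_funnel` are hypothetical, the inverse-variance (fixed-effect) pooled estimate is used as a stand-in for the ‘true’ value, and the pseudo 95% limits (pooled estimate plus or minus 1.96 standard errors) are a common plotting convention assumed here, not something specified by the sources cited above.

```python
def funnel_points(effects, std_errors, z=1.96):
    """Return the pooled estimate and one row per study.

    Each row is (effect, std_error, lower_limit, upper_limit), where the
    limits are pseudo 95% limits around the pooled estimate at that
    study's standard error. Assumption: fixed-effect pooling.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    rows = [(e, se, pooled - z * se, pooled + z * se)
            for e, se in zip(effects, std_errors)]
    return pooled, rows


def outside_funnel(rows):
    """Indices of studies whose effect falls outside the pseudo limits."""
    return [i for i, (e, _se, lo, hi) in enumerate(rows) if not lo <= e <= hi]


# Hypothetical data: log odds ratios and their standard errors.
effects = [0.10, 0.12, 0.30, 0.55, 0.80]
std_errors = [0.05, 0.08, 0.20, 0.30, 0.40]
pooled, rows = funnel_points(effects, std_errors)
```

To draw the plot, each study's effect would go on the horizontal axis and its standard error on an inverted vertical axis, so that the largest studies sit at the top. In this made-up data set the small studies all lie to the right of the pooled estimate, which is the kind of one-sided gap the text describes.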

Statistical tests for funnel plot asymmetry (also known as tests for publication bias) investigate the association between the effect size estimate and a measure of study size or precision. The most popular are the Egger test, the Begg test, and the Harbord test. These tests were developed on the following assumptions: large studies are more likely to be published regardless of statistical significance; small studies are at the greatest risk of being lost; in small studies only large effects are likely to be statistically significant, so published small studies often show larger effect sizes than larger studies; small and unfavorable effects are more likely to be missing; and small studies with large effect sizes are likely to be published (Jin et al 2015). The null hypothesis for these tests is that the plot is symmetric, that is, that there is no publication bias. A P-value that is not statistically significant does not exclude bias, because these tests are known to have low power.

A statistical test for funnel plot asymmetry investigates whether the association between effect estimate and measure of study size or precision is larger than what can be expected to have occurred by chance (Sterne et al 2011). These tests are known to have low power and consequently a finding of no evidence of asymmetry does not serve to exclude bias (Sterne et al 2011).

Begg’s test was proposed by Begg and Mazumdar in 1994. It is used for dichotomous outcomes with intervention effects measured as odds ratios. It is an adjusted rank correlation test that explores the correlation between the effect estimates and their sampling variances (Jin et al 2015). It is a very popular test; however, it has low power, and some statisticians do not recommend its use. It is “fairly powerful” for a meta-analysis of 75 studies and has “moderate power” for a meta-analysis of 25 studies (Begg and Mazumdar 1994). The test is considered to have an “appropriate” type I error rate (Jin et al 2015).
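
The rank correlation idea can be sketched in a few lines of Python. This is a simplified illustration and not a reference implementation. The function name `begg_rank_test` is hypothetical, the effects are assumed to be on a scale such as the log odds ratio, and the sketch uses the plain normal approximation to Kendall's tau with no correction for ties and no continuity correction. Each effect is first standardized around the inverse-variance pooled estimate, and the standardized deviates are then rank-correlated with the sampling variances.

```python
import math
from statistics import NormalDist


def begg_rank_test(effects, variances):
    """Return (tau, z, two_sided_p) for an adjusted rank correlation test.

    Assumptions: at least two studies, no tied values, normal
    approximation for the distribution of Kendall's tau.
    """
    n = len(effects)
    inv_sum = sum(1.0 / v for v in variances)
    pooled = sum(e / v for e, v in zip(effects, variances)) / inv_sum
    # Variance of (effect_i - pooled) is v_i - 1 / sum(1 / v_j).
    deviates = [(e - pooled) / math.sqrt(v - 1.0 / inv_sum)
                for e, v in zip(effects, variances)]
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (deviates[i] - deviates[j]) * (variances[i] - variances[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    tau = (concordant - discordant) / (n * (n - 1) / 2)
    z = (concordant - discordant) / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return tau, z, p_value


# Hypothetical data in which less precise studies report larger effects.
tau, z, p_value = begg_rank_test([0.1, 0.2, 0.4, 0.8],
                                 [0.01, 0.04, 0.09, 0.16])
```

A positive tau here means that larger standardized effects go with larger variances, that is, with smaller studies. With only four hypothetical studies the result is for illustration only, since the text notes that the test has low power even for much larger meta-analyses.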

Egger’s test was proposed by Egger et al in 1997. It is used for continuous outcomes with intervention effects measured as mean differences. It is a “regression test”, that is, it uses a linear regression approach (Jin et al 2015): the standard normal deviate (the estimated effect size divided by its estimated standard error) is regressed against the estimate’s precision (the inverse of the standard error). It is a very popular test and the most cited statistical test for publication bias. The test is considered to have an “inappropriate” type I error rate when heterogeneity is present and the number of included studies is large (Jin et al 2015).
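
The regression described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a validated implementation. The function name `egger_regression` is hypothetical, ordinary (unweighted) least squares is used, and the sketch stops at the intercept and its standard error rather than computing a P-value. In this formulation the slope estimates the underlying effect and the intercept measures asymmetry, so the test asks whether the intercept differs from zero.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect / SE on 1 / SE by ordinary least squares.

    Returns (intercept, slope, se_intercept, degrees_of_freedom).
    Assumption: at least three studies with distinct standard errors.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")
    y = [e / se for e, se in zip(effects, std_errors)]  # standard normal deviate
    x = [1.0 / se for se in std_errors]                 # precision
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = rss / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept, n - 2


# Hypothetical mean differences and their standard errors.
intercept, slope, se_intercept, df = egger_regression(
    [0.62, 0.68, 0.85, 0.88, 1.10], [0.10, 0.20, 0.30, 0.40, 0.50])
```

The intercept divided by its standard error can then be compared with a t distribution on the returned degrees of freedom to obtain a P-value. That step is left out here because the Python standard library has no t distribution.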

The Harbord test was proposed by Harbord et al in 2006. It is used for dichotomous outcomes with intervention effects measured as odds ratios. The test uses “a weighted regression model” (Jin et al 2015). The test is considered to have an “inappropriate” type I error rate when heterogeneity is present. It has been contended that the Harbord test has a better error rate than Egger’s test in balanced trials with little or no heterogeneity (Jin et al 2015).
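
A rough sketch of a Harbord-style regression is given below. It rests on assumptions that go beyond the text above. Each study is assumed to supply a 2×2 table (a, b, c, d) giving events and non-events in the intervention group, then events and non-events in the control group. The efficient score Z and its variance V for the log odds ratio are computed from that table, and Z/√V is regressed on √V by ordinary least squares. This unweighted form is algebraically equivalent to a weighted regression of Z/V on 1/√V with weights V. The function names `score_and_variance` and `harbord_regression` are hypothetical.

```python
import math


def score_and_variance(a, b, c, d):
    """Efficient score Z and its variance V for the log odds ratio.

    Assumption: no row or column total of the 2x2 table is zero.
    """
    n = a + b + c + d
    z = a - (a + b) * (a + c) / n
    v = (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return z, v


def harbord_regression(tables):
    """Regress Z / sqrt(V) on sqrt(V); the intercept measures asymmetry.

    Returns (intercept, slope, se_intercept, degrees_of_freedom).
    """
    n = len(tables)
    if n < 3:
        raise ValueError("at least three studies are needed")
    x, y = [], []
    for a, b, c, d in tables:
        z, v = score_and_variance(a, b, c, d)
        x.append(math.sqrt(v))
        y.append(z / math.sqrt(v))
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept, n - 2


# Hypothetical 2x2 tables: (events, non-events) for intervention, then control.
tables = [(15, 85, 10, 90), (8, 42, 5, 45), (30, 170, 25, 175), (4, 16, 2, 18)]
intercept, slope, se_intercept, df = harbord_regression(tables)
```

As with the Egger regression, the intercept divided by its standard error would be referred to a t distribution on the returned degrees of freedom. Working from Z and V rather than from the log odds ratio and its standard error is the feature that distinguishes this approach for dichotomous outcomes.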