To Ne or not to Ne? that is the question

This blog post is written with guest blogger Bruno Fady, so the pronoun used is “we”. Welcome Bruno!

A few days ago, on 13 June 2023, we attended a very interesting webinar about the estimation of effective population size (Ne) and its use as a genetic indicator for (forest population) conservation. The webinar is part of the EUFORGEN webinar series.

The webinar consisted of a trio of very insightful talks given by one of us (Bruno Fady, INRAE URFM, Avignon, France), by Juan José Robledo-Arnuncio (INIA-CSIC, Spain), and by Sean Hoban (Morton Arboretum, IL, USA). We should perhaps mention that Robin S. Waples (University of Washington, Seattle, Washington, USA), one of the largest single contributors to the literature on effective population size, also attended and took part in the very dense debate that followed the talks. Overall, it was a great piece of science!

Here, we’d like to highlight some key points of the discussion and to develop it a little further with some provocative thoughts.

The first point we’d like to stress: notwithstanding the complications in estimating it, effective population size is a key demographic parameter in the field of population genetics. It has a direct relationship with genetic drift and we should do all we can to obtain estimates of it, to characterise the evolutionary potential of species and populations.

In nature, reproductive success is very often unequal. Fierce struggles are sparked by the drive to gain (exclusive) access to reproduction (Picture credits: Heather Smithers, from Wikimedia Commons)

That said, we could not help noticing that the three speakers, and all those who asked questions and made comments, spent a good deal of time explaining how biased Ne estimates can be, and how easily bias arises. An estimate is actually far more likely to be biased than unbiased. There are truckloads of reasons for this; indeed, the only case in which genetically derived estimators are unbiased seems to be an ideal Wright-Fisher population (a finite, constant-sized, panmictic, isolated population without selection or overlapping generations). In all other cases, Ne estimators will likely end up estimating something (sub-population size, neighbourhood size, inbreeding, you name it), but not Ne itself… and there is no way to tell exactly what is being estimated. This is particularly true when one tries to estimate contemporary Ne (as opposed to historical Ne), the one that matters most for conservation, the one liable to change dramatically in case of population collapse or breakdown of the population's mating system.

Alas, it seems almost impossible to get a decent estimate of Ne from genetic data, in the case of large, continuous, structured populations, such as the ones that we very often deal with in forestry.

(We’re currently running some simulations to check the effect of limited sample size and demographic change on genetically based estimates of contemporary Ne. From preliminary results, we can say that it is hopeless. More on this when the simulations are complete. On the theoretical side, you can see this paper about how Ne is affected by meta-population dynamics.)

So, the relevant question to ask is: why should we try to estimate Ne from genotypic data? Should we give up attempts at estimating Ne in this way for practical purposes, such as delivering an indicator of adaptation risk?

Let us take a step back and look at what Ne is and why we try to estimate it that way.

Ne is not per se a genetic parameter. Estimated locally (contemporary Ne), it describes how (and how many) fertile individuals contribute to reproduction: do they all contribute equally? Or is their contribution uneven? How many actually reproduce? These are demographic, or population-dynamic, matters. You can handle them (and measure them!) without the slightest knowledge of genetics.

In a population with unequal contributions to the next generation, effective population size can be very small relative to the total number of potential parents (and relative to the number of actual parents)

Of course, Ne matters to geneticists because it has enormous consequences for true genetic parameters: levels of inbreeding and of genetic diversity in particular. In a finite population, inbreeding increases, and genetic diversity decreases, at rates inversely proportional to Ne: the smaller Ne, the faster the change. So, of course, it is super relevant for geneticists. But here is the twist: because Ne drives population-genetic quantities, we have become accustomed to the very convenient strategy of estimating Ne based on estimates of those quantities (linkage disequilibrium, genetic diversity, and so on). The high-throughput sequencing bonanza has worsened our addiction to genetically derived Ne estimates.
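To make the link between Ne and those genetic quantities concrete, here is a minimal sketch based on the classical results for an ideal population: inbreeding is expected to increase by 1/(2Ne) per generation, and expected heterozygosity to shrink by a factor (1 − 1/(2Ne)) per generation. The function names and the example values are ours, chosen only for illustration.

```python
def inbreeding_increment(ne: float) -> float:
    """Expected per-generation increase in inbreeding, 1 / (2 * Ne)."""
    if ne <= 0:
        raise ValueError("ne must be positive")
    return 1.0 / (2.0 * ne)


def expected_heterozygosity(h0: float, ne: float, generations: int) -> float:
    """Expected heterozygosity after some generations of drift alone.

    Assumes an ideal population of constant effective size `ne`
    (no mutation, migration or selection): H_t = H_0 * (1 - 1/(2 Ne))**t.
    """
    return h0 * (1.0 - inbreeding_increment(ne)) ** generations


if __name__ == "__main__":
    # Hypothetical values: starting heterozygosity 0.5, ten generations.
    for ne in (10, 100, 1000):
        h10 = expected_heterozygosity(0.5, ne, 10)
        print(f"Ne={ne}: delta F={inbreeding_increment(ne):.4f}, H after 10 gen={h10:.4f}")
```

The smaller Ne is, the larger the per-generation step, which is why a tenfold drop in Ne means a tenfold faster loss of diversity.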

It does look like the chain of reasoning ends up being somewhat tautological: Ne is an important driver of genetic parameters; we know how to estimate genetic parameters; we use genetic parameters to derive Ne; Ne itself becomes a genetic parameter, and we restrict ourselves to estimating it from other genetic parameters.

Yet all those Ne estimation methods come with all those biases, to the point that they make estimates useless, and in spite of our beliefs and desires, effective population size is still not a population genetic quantity.

This is not, though, a reason for despair. Working with trees comes with disadvantages, surely (like all those biases that make Ne estimations useless), but also with some assets.

For example, seed dispersal is mostly local for most species, and getting a crude estimate of the contribution of individual trees to reproduction by counting young and recruited seedlings in quadrats should not be too complicated (there are more sophisticated methods to assess single-tree fecundity, of course, like the SEMM-based methods developed by Klein et al. (2013), and direct observations of cone and fruit production through aerial surveys, like the ones we develop in the FORGENIUS project). Individual contributions to reproduction are the very building blocks of effective population size, i.e. the number of equally contributing reproducing individuals. Thus, counting the individuals that reproduce, even without accounting for their relative contribution, will likely be less biased than any genetic marker-based, indirect estimate.
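As a rough illustration of what such counts can give, here is a minimal sketch that turns per-tree offspring counts into three numbers: potential parents, actual parents, and an "effective number of parents" computed as the inverse of the sum of squared proportional contributions. This is a simple proxy for how evenly reproduction is shared, not a full Ne estimator, and the counts below are invented for the example.

```python
from typing import Sequence


def effective_number_of_parents(offspring_counts: Sequence[int]) -> float:
    """Inverse of the sum of squared proportional contributions.

    Equals the number of parents when all contribute equally, and
    drops towards 1 as reproduction concentrates in a few trees.
    """
    if any(k < 0 for k in offspring_counts):
        raise ValueError("offspring counts must be non-negative")
    total = sum(offspring_counts)
    if total == 0:
        raise ValueError("no offspring recorded")
    # (sum k)^2 / sum k^2 is the same as 1 / sum((k / total)^2)
    return total**2 / sum(k * k for k in offspring_counts)


if __name__ == "__main__":
    # Hypothetical seedling counts attributed to ten adult trees.
    counts = [40, 30, 10, 5, 5, 5, 5, 0, 0, 0]
    n_potential = len(counts)
    n_actual = sum(1 for k in counts if k > 0)
    n_effective = effective_number_of_parents(counts)
    print(f"potential parents: {n_potential}")
    print(f"actual parents:    {n_actual}")
    print(f"effective parents: {n_effective:.2f}")
```

In this made-up stand, ten potential parents and seven actual parents boil down to fewer than four effective ones, because two trees account for most of the seedlings.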

So, back to basics: if we want to estimate a demographic parameter, like Ne, let us estimate it directly! No need to worry about the bias of indirect methods, which rest heavily on assumptions that almost never hold. Let us go out there and count reproducing trees. It will be sweet to discuss those numbers in the evening, once back at the Forest Genetics Campsite.
