# 2014 Ebola Outbreak: Worldwide Air-Transportation, Relative Import Risk and Most Probable Spreading Routes

## An interactive network analysis

#### August 4th, 2014

Dirk Brockmann1,2,3,*, Lars Schaade1, Luzie Verbeek1

1Robert Koch-Institute, Berlin, Germany

2Institute for Theoretical Biology, Berlin, Germany

3Northwestern Institute on Complex Systems, Northwestern University, Evanston, USA

### Ebola

Ebola hemorrhagic fever is a severe viral disease with a case fatality rate of up to 90%. Humans can contract the disease from infected animals such as flying foxes, gorillas and chimpanzees. Human-to-human transmission occurs through direct contact with symptomatic patients and their body fluids, and can give rise to an outbreak. There have been several outbreaks in sub-Saharan Africa in recent years. The current epidemic in West Africa probably goes back to an index patient in Guinea in December 2013. Since then the disease has spread to other countries. This outbreak is considered an "extraordinary event" and a public health risk to other countries. One reason for this is the high mobility of populations, including cross-border travel. In July 2014 the disease reached Lagos, the largest city in Nigeria, when an infected patient arrived by airplane and his infection with Ebola hemorrhagic fever was not recognized.

### Challenges

Given a regional outbreak, one of the key challenges of computational, quantitative epidemiology is the assessment and estimation of the risks to other regions in the world that are induced by global mobility and transportation. Computational, dynamical or statistical models attempt to estimate import probabilities, likelihoods and related numbers that quantify risk. The interactive network analysis we provide here (see image on the right) displays computationally estimated relative import risks. Before you use the tool, please make sure you understand what these quantities mean and how to interpret them.

#### What is relative import risk?

Assuming that X is an airport in the outbreak region and Y an arbitrary location anywhere in the world, we identify the relative risk with the conditional probability

P(Y|X).

In words: given that an infected individual boards a plane at X, P(Y|X) is the probability that the individual exits the system at location Y. This conditional probability may involve multiple flight legs and multiple possible routes that can be taken from X to Y.

#### What is absolute import risk?

The actual (absolute) import risk is the combined probability

P(Y,X) = P(Y|X) × P(X)

This is the conditional probability P(Y|X) multiplied by the probability P(X) that an infected individual boards a plane at X. Because P(X) is usually very small, the actual import probability is much smaller than P(Y|X). For example, if P(X) = 0.1%, i.e. an infected individual boards a plane at X in only 1 out of 1000 cases, then the absolute import risk is 1000 times smaller than the relative import risk.

### How to interpret and use relative and absolute risk?

Using the computational methods briefly described below, we estimate the conditional probabilities P(Y|X) for each possible destination airport in the worldwide air-transportation network (WAN), given that an Ebola-infected patient enters airport X in the outbreak region. The relative import risk is useful for comparisons without knowing P(X). For example, say X is a location in the outbreak region, A and B are locations or sets of locations anywhere in the network, and we have

P(A|X) = 10%

P(B|X) = 1%

The import probability to A is thus 10 times larger than to B. This says nothing about the absolute import probabilities P(A,X) and P(B,X) because these require knowledge of P(X), the probability that an infected individual actually boards a plane at X. P(X) is very difficult to assess as it depends on a number of parameters that we do not know, and some things that are impossible to measure.
One can, however, get a rough estimate of the order of magnitude of P(X) in the context of the Ebola outbreak. Say we have a country with a population N of approx. 10 million (like Guinea, for instance). Further assume we have between I = 10 and I = 100 infected, asymptomatic individuals that can potentially board a plane. Let's assume that approx. 1000 individuals leave the country every day on average. The probability that at least one infected person is among these is approx.:

P(X) ~ 1 - (1 - I/N)^1000 ~ 1000 × I/N ~ 0.1 - 1 %

This number is based on the assumption that infected individuals are distributed uniformly in the local population, which they are not. We can expect that 0.1-1% / day is an overestimation of P(X). Nevertheless, if we take this as an upper bound for P(X), the actual daily import probabilities for A and B would be

P(A,X) = 0.01-0.1% / day

P(B,X) = 0.001-0.01% / day

Relative probabilities are mostly determined by properties of global mobility and can be estimated based on data on the worldwide air-transportation network. Absolute probabilities are determined by factors at the outbreak site and are subject to strong variability.
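The order-of-magnitude estimate above can be reproduced in a few lines of Python. This is a minimal sketch under the stated assumptions only: a population of 10 million, 10 to 100 infected individuals, and roughly 1000 daily travellers drawn uniformly at random from the population, so that the exponent is the number of daily travellers. All function and variable names are chosen here purely for illustration.

```python
def prob_at_least_one_infected(infected, population, travellers):
    """Probability that at least one of `travellers` uniformly drawn people is infected."""
    prevalence = infected / population
    return 1.0 - (1.0 - prevalence) ** travellers


def absolute_import_risk(relative_risk, p_x):
    """Joint probability P(Y,X) = P(Y|X) * P(X)."""
    return relative_risk * p_x


N = 10_000_000                       # assumed population of the country
M = 1_000                            # assumed number of travellers leaving per day
relative = {"A": 0.10, "B": 0.01}    # relative import risks P(A|X), P(B|X) from the text

estimates = {}
for infected in (10, 100):           # lower and upper guess for I
    p_x = prob_at_least_one_infected(infected, N, M)
    row = {"P(X)": p_x}
    for dest, rel in relative.items():
        row[dest] = absolute_import_risk(rel, p_x)
    estimates[infected] = row

for infected, row in estimates.items():
    print(infected, {key: f"{value:.4%}" for key, value in row.items()})
```

Whatever value is assumed for P(X), the ratio between the absolute risks for A and B stays at 10, which is why the relative risks alone are sufficient for comparing destinations.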

#### Why is relative risk a useful quantity?

The great advantage of relative risk is that this quantity is predominantly shaped by properties of the air-transportation network and is independent of local factors. We can use this quantity to study the impact of potential containment strategies, for example the shutdown of specific airline connections to the outbreak region. For instance, the relative import risk from Freetown, Sierra Leone (FNA) to London Heathrow (LHR) is 3.04%, and Heathrow's distribution propensity (explained below; essentially the degree to which LHR distributes infected individuals further) is 28%. If the connection FNA-LHR is shut down, the relative risk of arrival at LHR is reduced by a factor of about 10 to 0.3%, and the distribution propensity is reduced by a factor of about 100 to 0.2%.

### A model for the global spread of infectious diseases during early phases of the outbreak - Stochastic Estimates of Relative Risks

For a number of diseases (e.g. SARS, pandemic influenza and others), sophisticated computer simulation frameworks exist that combine comprehensive datasets on global transportation with mathematical models for the local dynamics of a specific disease. These high-level computational approaches can make surprisingly reliable predictions concerning the time course and global dynamics across a wide range of length scales and have become an extremely powerful tool for understanding global disease dynamics.

However, particularly during the early phase of an outbreak, when specifics of the situation are still unclear, unknown or poorly understood, it is difficult to perform detailed computer simulations because they require fixing a multitude of parameters or, if parameters are unknown, systematic parameter scans. These are demanding in both computational resources and time. Furthermore, models that are useful in one context may not be useful in another because substantial differences may exist between, e.g., disease-specific mechanisms or parameters.

In addition to developing more fine-tuned models, each suitable for a specific context, it is thus also important to develop techniques that focus on features that different diseases share. Such techniques extract information that is valid (within limits) irrespective of disease specifics, can be used in practice during the early phases of an outbreak to inform about spreading aspects, and may guide decision processes and help establish an intuition about global risks.

In a recent study (D. Brockmann & D. Helbing, Science (2013)) we showed that valuable information about potential global spreading patterns can be obtained using a geometric approach to the problem and leveraging methods from complex network theory. In that paper, we showed that contagion phenomena in complex, strongly heterogeneous transportation networks are dominated by the most probable pathways a disease can take through the network. This allowed the introduction of a novel type of effective distance that is a reliable predictor of epidemic arrival times.

To estimate risk for the 2014 Ebola outbreak we used insights from this study and devised a stochastic model for the probabilities of ensembles of paths that an infected individual entering a source node in the network (e.g. one of the airports in the outbreak region) can take. The details of this model are currently being prepared for publication. In a nutshell, for every path from X to Y, potentially via a sequence of intermediate locations, we compute two types of quantities: the probabilities for every step along the path, and the probability that the current location is the destination. Both types of quantities are computed from traffic flux across the worldwide air-transportation network, a network of approx. 4000 airports and approx. 25000 direct links. Every simulated individual eventually reaches a destination somewhere in the network. For two locations X and Y we integrate over all possible paths that contribute to the individual exiting the system at Y.
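Since the details of the model are not yet published, the following is only a toy illustration of the general idea and not the actual implementation. It assumes that step probabilities are proportional to the traffic flux on each outbound link and that every airport has a fixed exit probability. The five-airport network, the flux values, the exit probabilities and all names are hypothetical and invented for this example. The sum over all paths is carried out by repeatedly propagating the probability mass that is still in transit.

```python
def relative_import_risk(flux, exit_prob, source, tol=1e-12, max_steps=10_000):
    """Return {airport: probability that a traveller boarding at `source` exits there}."""
    # Step probabilities: share of each airport's outbound traffic per direct link.
    step = {}
    for a, out in flux.items():
        total = sum(out.values())
        step[a] = {b: f / total for b, f in out.items()}

    exited = {a: 0.0 for a in flux}
    # The traveller boards at `source`, so the first leg is always flown.
    in_transit = dict(step[source])
    for _ in range(max_steps):
        if sum(in_transit.values()) < tol:
            break
        onward = {}
        for a, mass in in_transit.items():
            exited[a] += mass * exit_prob[a]          # leaves the system at a
            staying = mass * (1.0 - exit_prob[a])     # continues on another leg
            for b, p in step[a].items():
                onward[b] = onward.get(b, 0.0) + staying * p
        in_transit = onward
    return exited


# Hypothetical toy network: passengers per day on each direct link.
flux = {
    "X": {"H1": 300, "H2": 100},
    "H1": {"X": 300, "Y": 600, "Z": 100},
    "H2": {"X": 100, "Y": 100, "Z": 300},
    "Y": {"H1": 600, "H2": 100},
    "Z": {"H1": 100, "H2": 300},
}
# Hypothetical exit probabilities: hubs mostly forward passengers, others mostly absorb.
exit_prob = {"X": 0.9, "H1": 0.2, "H2": 0.2, "Y": 0.9, "Z": 0.9}

risk = relative_import_risk(flux, exit_prob, "X")

# Mimic a link shutdown by removing the direct connection between X and H1.
flux_cut = {
    a: {b: f for b, f in out.items() if {a, b} != {"X", "H1"}}
    for a, out in flux.items()
}
risk_cut = relative_import_risk(flux_cut, exit_prob, "X")

print({a: round(p, 4) for a, p in risk.items()})
print({a: round(p, 4) for a, p in risk_cut.items()})
```

In this toy setting the relative risks over all airports sum to one, because every traveller eventually exits somewhere, and removing the X-H1 link sharply lowers the relative risk of arrival at H1. This is the same kind of comparison as the link shutdown example discussed earlier, although the numbers here carry no real-world meaning.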

In addition to the overall relative import risks, this method provides topological information on the most likely spreading pathways and on the roles different airports play in the global distribution of risk. For instance, from the perspective of a chosen location in the outbreak region, the most likely spread to other locations is equivalent to a tree structure in which different airports have different numbers of branches. If an airport has many branches, it plays a more important role in further disseminating the disease and thus imposes a larger risk. A clear example of this can be seen by comparing the perspective of the potential Ebola spread from Sierra Leone on one hand, and Guinea on the other. Whereas in the former case UK airports play a dominant role in terms of distribution propensity, in the latter case Paris CDG does.
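The tree of most probable pathways can be illustrated with a shortest-path computation. The sketch below assumes link lengths of the form d = 1 - ln(P), in the spirit of the effective distance from the 2013 study, where P is the step probability of a direct link, and uses Dijkstra's algorithm to build the tree. The step probabilities are hypothetical, and counting branches is only a crude stand-in for an airport's role in the tree; the distribution propensity reported by the tool may be defined differently.

```python
import heapq
import math


def most_probable_path_tree(step, source):
    """Dijkstra on link lengths 1 - ln(P); returns (distances, parent map)."""
    dist = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, a = heapq.heappop(heap)
        if a in done:
            continue
        done.add(a)
        for b, p in step.get(a, {}).items():
            candidate = d + 1.0 - math.log(p)   # assumed effective length of link a -> b
            if b not in dist or candidate < dist[b]:
                dist[b] = candidate
                parent[b] = a
                heapq.heappush(heap, (candidate, b))
    return dist, parent


def branch_counts(parent):
    """Number of direct branches (children) of every airport in the tree."""
    counts = {}
    for child, par in parent.items():
        counts[par] = counts.get(par, 0) + 1
    return counts


# Hypothetical step probabilities (each row sums to 1).
step = {
    "X": {"H1": 0.75, "H2": 0.25},
    "H1": {"X": 0.3, "Y": 0.6, "Z": 0.1},
    "H2": {"X": 0.2, "Y": 0.2, "Z": 0.6},
    "Y": {"H1": 0.85, "H2": 0.15},
    "Z": {"H1": 0.25, "H2": 0.75},
}

dist, parent = most_probable_path_tree(step, "X")
counts = branch_counts(parent)
print(parent)
print(counts)
```

In this toy network Y is reached most probably via H1 and Z via H2, so the tree rooted at X has two branches at the root and one at each hub. Choosing a different root airport generally yields a different tree, which matches the observation that different hubs dominate from the perspective of different outbreak locations.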

### Interactive Risk Assessment

In order to better understand the various aspects of the current Ebola situation and the complexity of potential spreading pathways of risk throughout the global air-transportation network, we devised an interactive website that illustrates the most probable spreading pathways and provides risk information for different nodes in the network and cumulative information for different countries. The website is a prototype of an interactive analysis tool that we are developing to help public health scientists better understand the global impact of outbreak situations, develop a better intuition about it, and potentially identify the most efficient mitigation strategies.

For example, the tool illustrates the quantitative and topological impact that air traffic reduction may have on the spread of Ebola across the network, including a few examples of the impact of single connection removal in the network.

The tool will open in a separate window if you click on the image on the top right. The usage of the tool is fairly self-explanatory. A few directions are provided on the bottom right in the tool window: when you hover over a node, node information is displayed; clicking on a country highlights that country's nodes in the network and displays cumulative information. In the top left of the tool window, you can pick from different perspectives, i.e. different root or reference nodes.

The tool only depicts a reduced network of the 1227 largest airports to allow better rendering and avoid clutter.