Methods and Data Sources

Overview

Over the past twenty years it has been shown that the distribution of tsetse flies is related to climatic conditions (Rogers and Randolph, 1993), and that satellite imagery can provide reliable surrogates for a range of climatic parameters (Hay et al., 1996). More recently, tsetse distributions have been mapped quite accurately for a wide range of ecological conditions using Fourier-processed time series of satellite imagery of various types, especially those relating to vegetation cover, rainfall, temperature and elevation, as discussed by Rogers et al. (1996) and Gilbert et al. (1999).

These techniques have largely been based on discriminant analytical and maximum likelihood methods, which use known presence and absence distribution data to 'train' the prediction process, in essence by establishing statistical relationships between the predictor (satellite image) variables and the observed fly presence/absence data. Output is given as the probability of presence for each sample point in the training dataset. The modelled data can then be compared with the known data to provide various indices of accuracy, such as the proportion correctly modelled and the proportions of false negatives and false positives produced. Typically, the technique gives accuracies of better than 85 percent. These relationships can then be applied to areas which have not been sampled, to provide a predicted probability of presence for areas outside the original training data set.
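As an illustration of the accuracy indices mentioned above, the following minimal sketch compares modelled probabilities of presence with known presence/absence records. The function name, the 0.5 probability threshold and the sample values are assumptions made for illustration only; the source does not state which threshold was used.

```python
def accuracy_indices(observed, predicted_prob, threshold=0.5):
    """Compare known presence/absence (1/0) with modelled probabilities.

    The 0.5 threshold is an assumed default, not a value from the report.
    All three indices are expressed as proportions of all sample points.
    """
    if not observed or len(observed) != len(predicted_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = false_pos = false_neg = 0
    for obs, prob in zip(observed, predicted_prob):
        pred = 1 if prob >= threshold else 0
        if pred == obs:
            correct += 1
        elif pred == 1:
            false_pos += 1  # modelled present, recorded absent
        else:
            false_neg += 1  # modelled absent, recorded present
    n = len(observed)
    return {
        "correct": correct / n,
        "false_positive": false_pos / n,
        "false_negative": false_neg / n,
    }


# Hypothetical sample points: known records and modelled probabilities.
known = [1, 1, 0, 0, 1]
modelled = [0.9, 0.4, 0.2, 0.7, 0.8]
indices = accuracy_indices(known, modelled)
```

The three proportions sum to one, so a model quoted as "better than 85 percent" accurate has false positives and false negatives that together account for less than 15 percent of the sample points.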

The present work draws on this experience to produce continent-wide predictions of the probability of presence of twenty-three tsetse species, as well as the three major species groups. A number of adaptations were made to improve predictions:

· A larger training set sample was taken from the same distribution maps as were used previously (Ford and Katondo, 1977), supplemented by more recent information (see notes below). Sample points were taken at regular spacings of c. 50 km throughout sub-Saharan Africa, resulting in a total of 12,000 data points.

· The range of predictor variables was extended to include a series of additional parameters, as shown in the Appendix.

· Logistic regression modelling was used in place of discriminant analytical/maximum likelihood methods. Logistic regression methods give equivalent or slightly greater predictive accuracies than discriminant analysis by relaxing the assumptions of multi-variate normality of the predictor data set: in the present case, logistic regression tended to over-estimate the absent category, whilst discriminant analysis tended to over-estimate the present category. Recent comparisons show that there is little to choose between these alternative methods (Manel et al., 1999). Logistic regressions are easier to implement within current commercial software, although their results are more difficult to interpret biologically.

· Analyses were run at four geographical levels: continental, regional, national, and for each of a series of 50 satellite-derived ecozones (described in FAO, 1999). Models were also run using either the area of fly presence plus a 200 kilometre buffer zone of absence, or the whole of the geographical region. The models used were those for the largest area that gave 90% or greater correct output.
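The logistic regression step described above can be sketched as follows. This is a minimal illustration only: the report used commercial statistical software, whereas this sketch fits the model by plain gradient descent, and the two predictor columns and all sample values are hypothetical.

```python
import math


def presence_probability(weights, predictors):
    """Logistic model: weights[0] is the intercept, the rest are slopes."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], predictors))
    if z >= 0:  # numerically stable form of 1 / (1 + exp(-z))
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, rate=0.5, epochs=2000):
    """Fit by batch gradient descent on the mean log-loss.

    rate and epochs are assumed tuning values, not taken from the report.
    """
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = presence_probability(weights, x) - y
            grad[0] += err
            for i, xi in enumerate(x):
                grad[i + 1] += err * xi
        weights = [w - rate * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical training points: two predictors scaled to 0..1
# (for example a vegetation index and a temperature measure),
# with recorded fly presence (1) or absence (0).
training_rows = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
                 [0.6, 0.4], [0.7, 0.3], [0.8, 0.2]]
training_labels = [0, 0, 0, 1, 1, 1]

model = fit_logistic(training_rows, training_labels)
p_unsampled_wet = presence_probability(model, [0.75, 0.25])
p_unsampled_dry = presence_probability(model, [0.15, 0.85])
```

Once fitted on the training points, the same weights can be applied to predictor values from unsampled locations, which is how a predicted probability of presence is obtained outside the training data set.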

Fly Distribution Data Used 

The underlying data are from Ford and Katondo (1977), modified by information from a wide range of sources. These include:

· Togo and Burkina Faso: data provided by Dr G. Hendryckx (FAO regional Animal Trypanosomosis Control project for Togo and Burkina Faso);

· Ethiopia: survey data provided by Dr M. Vreysen (IAEA/UNDP), and altitude in consultation with FAO AGAH;

· Cote d'Ivoire: FAO consultancy by Dr A. Douati (undated);

· Kenya: data prepared for the Kenya Trypanosomiasis Research Institute by ERGO and TALA;

· Central African Republic and Chad: data extracted from Réactualisation de la Situation des Tsé-Tsé et des Trypanosomoses Animales au Tchad, by D. Cuisance, May 1995;

· Zimbabwe and Zambia: data provided by the Regional Tsetse and Trypanosomosis Control Programme and Mr Hem Thakersi, Livestock Development Programme, Ministry of Agriculture, Food and Fisheries;

· Botswana: data provided by the Tsetse Control Division;

· Nigeria, Rwanda and Mali: reports by FAO Liaison Officers to the Programme Against African Trypanosomiasis for each country;

· South Africa: data extracted from Kappmeier, K., Nevill, E.M. and Bagnall, R.J. (1998). Review of tsetse flies and trypanosomiasis in South Africa. Onderstepoort Journal of Veterinary Research 65, 195-203;

· Continental: The Tsetse Fly, a CD produced by ORSTOM and CIRAD; and TALA Research Group Archives.

Draft predicted distribution maps were presented to delegates of the Jubilee Meeting of The International Scientific Council for Trypanosomiasis Research and Control (ISCTRC), 27th Sept - 1st Oct 1999, Mombasa, Kenya, and a number of comments were incorporated.

Predictor Variable Data Used

A range of information has been incorporated into these analyses, including eco-climatic data, elevation, cattle density, cultivation level, and human population, summary details of which are given below. The analyses underlying their acquisition and generation are presented in FAO (1999).

Eco-climatic Data - Satellite Imagery 

The study used the following satellite-derived measures of land-surface or atmospheric characteristics for both Africa and Asia:

a) Normalised Difference Vegetation Index (NDVI) from the Advanced Very High Resolution Radiometer (AVHRR), commonly used as an indicator of vegetation cover (data from the Pathfinder Program, initially supplied by the NASA Global Inventory Modeling and Mapping Studies (GIMMS) group);

b) Ground surface temperature, derived using the Price split window technique by TALA personnel, from two of the thermal channels (Channel 4 and 5) of the same instrument that produces NDVI data (Price, 1984); 

c) Middle infra-red reflectance (allied to temperature but less susceptible to atmospheric interference) derived from Channel 3 of the AVHRR data; 

d) Vapour Pressure Deficit, also derived from AVHRR Channels 4 and 5 and ancillary processing; and, for Africa only,

e) A measure of surface rainfall, the Cold Cloud Duration (CCD), derived from the METEOSAT satellite (from FAO ARTEMIS).

More details of the performance of each satellite series in relation to interpolated climate data are available in Hay and Lennon (1999).

All the AVHRR satellite data were available as dekadal images for a 13-year series from 1982 to 1994. The CCD imagery runs from 1988 to 1998. Each series was subjected to temporal Fourier analysis, re-sampled to 0.05 degree resolution and re-projected to latitude/longitude (geographic) projection where necessary. Fourier processing extracts, from each multi-temporal data stream, the characteristics of the annual, bi-annual and tri-annual components, details of which are given in Rogers et al. (1996). Mean values, and the amplitudes and phases (i.e. timing of the seasonal peaks) of the annual, bi-annual and tri-annual cycles were recorded and turned into IDRISI image data layers, together with the maximum, minimum and range (maximum - minimum) of each Fourier description of the observed signal. The percentage of the total variance attributable to each of the three Fourier components (a measure of the relative importance of each component) was also calculated for each parameter series. Further details of these variables can be found in Appendix Tables A1 and A2.
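The temporal Fourier summary described above can be illustrated for a single pixel's dekadal time series. This is a sketch under stated assumptions, not the processing chain used in the study: it assumes 36 dekads per year and a record covering a whole number of years, the function and key names are hypothetical, and the input series is synthetic. The harmonics 1, 2 and 3 correspond to the annual, bi-annual and tri-annual cycles.

```python
import cmath
import math

DEKADS_PER_YEAR = 36  # assumed: three dekads (c. 10-day periods) per month


def fourier_summary(series, harmonics=(1, 2, 3)):
    """Mean, plus amplitude, phase and % variance for each yearly harmonic."""
    n = len(series)
    if n == 0 or n % DEKADS_PER_YEAR != 0:
        raise ValueError("series must cover a whole number of years")
    years = n // DEKADS_PER_YEAR
    mean = sum(series) / n
    total_var = sum((x - mean) ** 2 for x in series) / n
    summary = {"mean": mean}
    for k in harmonics:
        freq = k * years  # cycles of this harmonic over the whole record
        coeff = (2.0 / n) * sum(
            x * cmath.exp(-2j * math.pi * freq * t / n)
            for t, x in enumerate(series)
        )
        amplitude = abs(coeff)
        # Phase of A*cos(angle - phase), expressed in radians in [0, 2*pi).
        phase = (-cmath.phase(coeff)) % (2.0 * math.pi)
        # A pure cosine of amplitude A has variance A^2 / 2.
        pct = 100.0 * (amplitude ** 2 / 2.0) / total_var if total_var > 0 else 0.0
        summary[k] = {
            "amplitude": amplitude,
            "phase": phase,
            "peak_dekad": phase * DEKADS_PER_YEAR / (2.0 * math.pi * k),
            "pct_variance": pct,
        }
    return summary


# Synthetic three-year series: a mean level, a strong annual cycle
# and a weak bi-annual cycle.
synthetic = [
    0.5
    + 0.2 * math.cos(2 * math.pi * t / DEKADS_PER_YEAR - 1.0)
    + 0.05 * math.cos(4 * math.pi * t / DEKADS_PER_YEAR)
    for t in range(3 * DEKADS_PER_YEAR)
]
demo = fourier_summary(synthetic)
```

In this synthetic case the annual cycle dominates the signal, which is the kind of information the percentage-of-variance layers are intended to capture for each pixel.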

Table A1: Prediction Equation Fourier Variable Names

Fourier             Data
Variable            NDVI        CCD         VPD         PRICE       CH3

Mean                NDVIAMP0    CDDAMP0     VPDAMP0     PRIAMP0     CH3AMP0
Amplitude1          NDVIAMP1    CDDAMP1     VPDAMP1     PRIAMP1     CH3AMP1
Amplitude2          NDVIAMP2    CDDAMP2     VPDAMP2     PRIAMP2     CH3AMP2
Amplitude3          NDVIAMP3    CDDAMP3     VPDAMP3     PRIAMP3     CH3AMP3
Phase1              NDVIPHS1    CDDPHS1     VPDPHS1     PRIPHS1     CH3PHS1
Phase2              NDVIPHS2    CDDPHS2     VPDPHS2     PRIPHS2     CH3PHS2
Phase3              NDVIPHS3    CDDPHS3     VPDPHS3     PRIPHS3     CH3PHS3
Variance of Mean    NDVIVAR     CDDVAR      VPDVAR      PRIVAR      CH3VAR
Variance 1*         NDVIDPF1    CDDDPF1     VPDDPF1     PRIDPF1     CH3DPF1
Variance 2*         NDVIDPF2    CDDDPF2     VPDDPF2     PRIDPF2     CH3DPF2
Variance 3*         NDVIDPF3    CDDDPF3     VPDDPF3     PRIDPF3     CH3DPF3
Variance All*       NDVIDPFA    CDDDPFA     VPDDPFA     PRIDPFA     CH3DPFA
Min                 NDVIMIN     CDDMIN      VPDMIN      PRIMIN      CH3MIN
Max                 NDVIMAX     CDDMAX      VPDMAX      PRIMAX      CH3MAX
Range               NDVIRNG     CDDRNG      VPDRNG      PRIRNG      CH3RNG

*e.g. Variance 1 refers to the % of total signal variance accounted for by the annual cycle of variation.

Other Data

Digital Elevation Model (DEM) data were obtained from the GTOPO30 1km resolution elevation surface for Africa, produced by the Global Land Information System (GLIS) of the United States Geological Survey, Earth Resources Observation Systems (USGS, EROS) data centre.

Potential Evapotranspiration (PET) data for Africa were provided by FAO AGL, as averaged dekadal means for the years 1961-1990, from which were calculated the mean, minimum and maximum dekadal PET, which were then re-sampled to a 0.05 degree resolution.

Maps of the Length of Growing Period for both continents, at a resolution of 0.5 by 0.5 degrees, were made available by FAO AGL.

A series of parameters were calculated using GIS procedures within IDRISI: these include topographic slope; distance to major rivers and the sea; and the x and y image pixel coordinates.

Human Population 

The human population data used for Africa and Asia are derived from three sources: a global coverage of population number per image pixel obtained from the University of California at Berkeley and provided by FAO AGL at 5 minute resolution; a population density coverage at the same resolution from the Consortium for International Earth Science Information Network (CIESIN: http://www.ciesin.org), derived from data collated by the National Centre for Geographic Information and Analysis (NCGIA: http://www.ncgia.ucsb.edu); and the population data from the Intergovernmental Authority on Drought and Development countries used in the earlier work (FAO, 1996). The average of these three estimates was calculated using the raster image manipulation functions within the IDRISI software.
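The averaging of the three population estimates can be pictured as a cell-by-cell mean over co-registered rasters. The sketch below is illustrative only and does not reproduce the IDRISI procedure: it assumes the three layers have already been brought to a common grid and common units, and it assumes that a cell missing from one source (marked None) is averaged over the remaining sources, which the source text does not specify. All values are hypothetical.

```python
def mean_of_estimates(*grids):
    """Cell-by-cell mean of equally sized grids, skipping missing cells.

    Skipping None cells is an assumption; the report does not say how
    cells lacking one of the three estimates were treated.
    """
    rows, cols = len(grids[0]), len(grids[0][0])
    if any(len(g) != rows or any(len(r) != cols for r in g) for g in grids):
        raise ValueError("all grids must have the same dimensions")
    averaged = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [g[r][c] for g in grids if g[r][c] is not None]
            out_row.append(sum(values) / len(values) if values else None)
        averaged.append(out_row)
    return averaged


# Hypothetical 2 x 2 population density grids from three sources.
source_a = [[10, 20], [None, 40]]
source_b = [[20, 40], [30, None]]
source_c = [[30, None], [60, None]]
population_mean = mean_of_estimates(source_a, source_b, source_c)
```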

Agricultural Data

Cattle and cultivation levels used were predicted using regression analyses, fully described in FAO (1999). This process requires an observed data set for each that can be used to 'train' the predictor variables described above, thereby establishing the relationships between observed and predictor variables using a large number of sample points. These can then be applied to the predictor variable images to produce predictions at the imagery resolution.

An ongoing project, funded by the UK Department for International Development (DFID), is currently extending these approaches to produce continental predictions of cattle density and cropping level, to incorporate into an Information System comprising a GIS, a Knowledge Base and a Resource Inventory for sub-Saharan Africa. This work is being undertaken by ERGO Ltd in collaboration with the Programme Against African Trypanosomiasis (PAAT) in FAO Rome, the UK Natural Resources Institute (NRI) and the Trypanosomiasis and Land Use in Africa Research Group (TALA) at Oxford University. These data have been used extensively in the present study, and are fully described in the forthcoming PAATIS documentation. They are summarised below.

The observed data for cattle density are compiled from a wide variety of sources. These include the International Livestock Research Institute; ERGO aerial survey archives for Niger, Nigeria, Sudan, Chad, and Mali; and Government Agricultural Census data from Botswana, Kenya and Ghana.

The observed cultivation data were obtained from: the Africa Data Dissemination Service (http://edcintl.cr.usgs.gov/adds/adds.html#adds_data_anchor); FAO AGDAT as used in the FAO GEOWEB service, produced by FAO GIEWS (http://geoweb.fao.org/); ERGO/TALA aerial survey archives for Togo, Nigeria, Niger, Mali, Chad, and Sudan; transcribed Government Census data from Zambia, Botswana, South Africa, and Swaziland; and FAO GIEWS reports for Angola, Lesotho, Tanzania, Ethiopia, Eritrea, Zaire, and Mozambique.

Table A2: Other Regression Variables Used

 

Name        Description

CDNDLAG     Scaled lag between NDVI and CCD, indicative of delay in growth after rain
RGEZCTPR    Predicted Cattle Density
AFCPRZPR    Predicted Cultivation Percentage
AFKMSEA     Distance to sea (km)
AFXPIX      X coordinate (image pixel column)
AFYPIX      Y coordinate (image pixel row)
AFRIVDEG    Distance to major rivers (degrees)
AFSLPLL     Topographic slope (degrees)
MASK        Continent and Water Mask (Value 255)
POPDN       Human Population Density
ELEVM       Elevation
PHEZ50      Addapix Ecozones - 50 Zones
PETMXDK     Maximum Dekadal Potential Evapotranspiration (Mean 1961-1990)
PETMNDK     Minimum Dekadal Potential Evapotranspiration (Mean 1961-1990)
PETAVDK     Average Dekadal Potential Evapotranspiration (Mean 1961-1990)
LGPR        Predicted Length of Growing Period (Days)

References

FAO (1999). Agro-Ecological Zones, Farming Systems and Land Pressure in Africa and Asia. Consultancy Report prepared by Environmental Research Group Oxford Ltd and TALA Research Group, Department of Zoology, University of Oxford, for the Animal Health Service of the Animal Production and Health Division of the Food and Agriculture Organisation of the United Nations, Rome, Italy. (Authors: Wint, W., Slingenbergh, J. and Rogers, D.)

Ford, J. and Katondo, K.M. (1977). The distribution of tsetse flies in Africa. Nairobi, OAU. Cook, Hammond & Kell.

Gilbert, M., Jenner, C., Pender, J., Rogers, D., Slingenbergh, J. and Wint, W. (1999). The development and use of the Programme Against African Trypanosomiasis Information System (PAAT-IS). Paper for Proceedings of the Jubilee Meeting of The International Scientific Council for Trypanosomiasis Research and Control (ISCTRC), 27th Sept – 1st Oct 1999, Mombasa, Kenya.

Hay, S.I. and Lennon, J.J. (1999). Deriving meteorological variables across Africa for the study and control of vector-borne disease: a comparison of remote sensing and spatial interpolation of climate. Tropical Medicine and International Health 4, 58-71.

Hay, S.I., Tucker, C.J., Rogers, D.J. and Packer, M.J. (1996). Remotely sensed surrogates of meteorological data for the study of the distribution and abundance of arthropod vectors of disease. Annals of Tropical Medicine and Parasitology 90, 1-19.

Manel, S., Dias, J.M., Buckton, S.T. and Ormerod, S.J. (1999). Alternative methods for predicting species distributions: an illustration with Himalayan river birds. Journal of Applied Ecology 36, 734-747.

Rogers, D.J., Hay, S.I. and Packer, M.J. (1996). Predicting the distribution of tsetse flies in West Africa using temporal Fourier processed meteorological satellite data. Annals of Tropical Medicine and Parasitology 90, 225-241.

Rogers, D.J. and Randolph, S.E. (1993). Distribution of tsetse and ticks in Africa: past, present and future. Parasitology Today 9, 266-271.

 
