Methods

Over the past twenty years it has been shown that the distribution of tsetse flies is related to climatic conditions (Rogers and Randolph, 1993), and that satellite imagery can provide reliable surrogates for a range of climatic parameters (Hay et al., 1996). More recently, tsetse distributions have been mapped quite accurately across a wide range of ecological conditions, using Fourier-processed time series of satellite imagery of various types, especially those relating to vegetation cover, rainfall, temperature and elevation, as discussed by Rogers et al. (1996) and Gilbert et al. (1999).

These techniques have largely been based on discriminant analytical and maximum likelihood methods, which use known presence and absence distribution data to 'train' the prediction process, in essence by establishing statistical relationships between the predictor (satellite image) variables and the observed fly presence/absence data. Output is given as the probability of presence for each sample point in the training data set. The modelled data can then be compared with the known data to provide various indices of accuracy, such as the proportion correctly modelled, and the proportions of false negatives and false positives produced. Typically, the technique gives accuracies of better than 85 percent. These relationships can then be applied to areas which have not been sampled, to provide a predicted probability of presence for areas outside the original training data set.
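
The accuracy indices mentioned above can be sketched as follows. This is only an illustration, not the procedure used in the original analyses: the 0.5 probability cut-off, the function name and the choice to express false negatives and false positives as proportions of all sample points are assumptions made for the example.

```python
def accuracy_indices(observed, probabilities, threshold=0.5):
    """Summarise how well modelled presence matches observed presence/absence.

    observed      -- sequence of 1 (fly present) or 0 (fly absent)
    probabilities -- modelled probability of presence for the same points
    threshold     -- assumed cut-off at or above which a point counts as present
    """
    if len(observed) != len(probabilities) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    correct = false_pos = false_neg = 0
    for obs, prob in zip(observed, probabilities):
        predicted = 1 if prob >= threshold else 0
        if predicted == obs:
            correct += 1
        elif predicted == 1:
            false_pos += 1   # modelled present, observed absent
        else:
            false_neg += 1   # modelled absent, observed present

    n = len(observed)
    return {
        "proportion_correct": correct / n,
        "proportion_false_positive": false_pos / n,
        "proportion_false_negative": false_neg / n,
    }


# Hypothetical sample points, for illustration only.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
modelled = [0.9, 0.8, 0.3, 0.1, 0.6, 0.2, 0.7, 0.4]
indices = accuracy_indices(observed, modelled)
```

With these invented values, six of the eight points are modelled correctly, with one false positive and one false negative.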

The present work draws on this experience to produce continent-wide predictions of the probability of presence of twenty-three tsetse species, as well as the three major species groups. A number of adaptations were made to improve predictions:

· A larger training set sample was taken from the same distribution maps as were used previously (Ford and Katondo, 1977), supplemented by more recent information (see notes later). Sample points were taken at regular c. 50 km spacings throughout sub-Saharan Africa, resulting in a total of 12,000 data points.

· The range of predictor variables was extended to include a series of additional parameters, as shown in the Appendix.

· Logistic regression modelling was used in place of discriminant analytical/maximum likelihood methods. Logistic regression methods give equivalent or slightly greater predictive accuracies than discriminant analysis by relaxing the assumption of multivariate normality of the predictor data set. In the present case, logistic regression tended to over-estimate the absent category, whilst discriminant analysis tended to over-estimate the present category. Recent comparisons show that there is little to choose between these alternative methods (Manel et al., 1999). Logistic regressions are easier to implement within current commercial software, although their results are more difficult to interpret biologically.

· Analyses were run at four geographical levels: continental, regional, national, and for each of a series of 50 satellite-derived ecozones (described in FAO, 1999). Models were also run using either the area of fly presence and a 200 kilometre buffer zone of absence, or the whole of the geographical region. The models used were those for the largest area that gave 90% or more correct output.
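
As an illustration of the logistic regression approach described above, the following minimal sketch fits a presence/absence model by plain gradient descent and returns a probability of presence for a new point. It is not the commercial software used for the actual analyses; the function names, learning rate, number of iterations and the single standardised predictor in the example are all assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit intercept and slopes by gradient descent on the mean log-loss.

    X -- list of rows, each a list of predictor values (ideally standardised)
    y -- list of 1 (present) or 0 (absent)
    Returns a list of weights; weights[0] is the intercept.
    """
    n, n_features = len(X), len(X[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for row, target in zip(X, y):
            error = predict_presence(weights, row) - target
            gradient[0] += error
            for j, value in enumerate(row):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


def predict_presence(weights, row):
    """Probability of presence for one point, given fitted weights."""
    linear = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return sigmoid(linear)


# Hypothetical training points: one standardised predictor per sample point.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1, 1]
weights = fit_logistic(X, y)
probability = predict_presence(weights, [1.5])
```

Once fitted, the same weights can be applied to predictor values from unsampled areas, which is how a trained model is extended beyond its training data set.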

The underlying data are from Ford and Katondo (1977), modified by information from a wide range of sources. These include: Togo, Burkina Faso: data provided by Dr G. Hendryckx (FAO regional Animal Trypanosomosis Control project for Togo and Burkina Faso); Ethiopia: survey data provided by Dr M. Vreysen (IAEA/UNDP) and altitude, in consultation with FAO AGAH; Côte d'Ivoire: FAO Consultancy by Dr A. Douati (undated); Kenya: data prepared for Kenya Trypanosomiasis Research Institute by ERGO and TALA; Central African Republic and Chad: data extracted from Réactualisation de la Situation des Tsé-Tsé et des Trypanosomoses Animales au Tchad, by D. Cuisance, May 1995; Zimbabwe and Zambia: data provided by the Regional Tsetse and Trypanosomosis Control Programme and Mr Hem Thakersi, Livestock Development Programme, Ministry of Agriculture, Food and Fisheries; Botswana: data provided by the Tsetse Control Division; Nigeria, Rwanda, and Mali: reports by FAO Liaison Officers to the Programme Against African Trypanosomiasis for each country; South Africa: data extracted from Kappmeier, K., Nevill, E.M. and Bagnall, R.J. (1998). Review of tsetse flies and trypanosomiasis in South Africa. Onderstepoort Journal of Veterinary Research, 65: 195-203; Continental: The Tsetse Fly - A CD produced by ORSTOM and CIRAD; and TALA Research Group Archives.

Draft predicted distribution maps were presented to delegates of the Jubilee Meeting of the International Scientific Council for Trypanosomiasis Research and Control (ISCTRC), 27th Sept - 1st Oct 1999, Mombasa, Kenya, and a number of comments were incorporated.

3.2 Predictor Variable Data Used

A range of information has been incorporated into these analyses, including eco-climatic data, elevation, cattle density, cultivation level, and human population, summary details of which are given below. The analyses underlying their acquisition and generation are presented in FAO (1999).

The study used the following satellite-derived measures of land-surface or atmospheric characteristics for both Africa and Asia:

a) Normalised Difference Vegetation Index (NDVI) from the Advanced Very High Resolution Radiometer (AVHRR) commonly used as an indicator of vegetation cover (data from the Pathfinder Program, initially supplied by the NASA Global Inventory Monitoring and Modelling Systems (GIMMS) group);

b) Ground surface temperature, derived using the Price split window technique by TALA personnel, from two of the thermal channels (Channel 4 and 5) of the same instrument that produces NDVI data (Price, 1984);

c) Middle infra-red reflectance (allied to temperature but less susceptible to atmospheric interference) derived from Channel 3 of the AVHRR data;

d) Vapour Pressure Deficit, also derived from AVHRR channels 4 and 5 and ancillary processing; and, for Africa only,

e) A measure of surface rainfall, the Cold Cloud Duration (CCD), derived from the METEOSAT satellite (from FAO ARTEMIS).

More details of the performance of each satellite series in relation to interpolated climate data are available in Hay and Lennon (1999).

All the AVHRR satellite data were available as dekadal images for a 13-year series from 1982 to 1994. The CCD imagery runs from 1988 to 1998. Each series was subjected to temporal Fourier analysis, re-sampled to 0.05 degree resolution and re-projected to latitude/longitude (geographic) projection where necessary. Fourier processing extracts, from each multi-temporal data stream, the characteristics of the annual, bi-annual and tri-annual components, details of which are given in Rogers et al. (1996). Mean values, and the amplitudes and phases (i.e. timing of the seasonal peaks) of the annual, bi-annual and tri-annual cycles were recorded and turned into IDRISI image data layers, together with the maximum, minimum and range (maximum - minimum) of each Fourier description of the observed signal. The percentage of the total variance attributable to each of the three Fourier components (a measure of the relative importance of each component) was also calculated for each parameter series. Further details of these variables can be found in Appendix Tables A1 and A2.
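
The Fourier summary described above can be illustrated for a single pixel with the sketch below. It is not the processing chain used to produce the IDRISI layers: it assumes 36 dekads per year, a series covering a whole number of years, and that the annual, bi-annual and tri-annual components correspond to one, two and three cycles per year. Phase is expressed here as the timing of the first peak in dekads from the start of the series, and all names are hypothetical.

```python
import cmath
import math

DEKADS_PER_YEAR = 36  # assumed: three dekads per month


def fourier_summary(series, samples_per_year=DEKADS_PER_YEAR):
    """Mean, amplitudes, phases and % variance of the first three harmonics."""
    n = len(series)
    if n == 0 or n % samples_per_year != 0:
        raise ValueError("series must cover a whole number of years")
    n_years = n // samples_per_year

    mean = sum(series) / n
    variance = sum((v - mean) ** 2 for v in series) / n
    summary = {"mean": mean}

    for k in (1, 2, 3):                # cycles per year
        m = k * n_years                # matching discrete Fourier frequency
        coeff = sum(v * cmath.exp(-2j * math.pi * m * t / n)
                    for t, v in enumerate(series))
        amplitude = 2.0 * abs(coeff) / n
        # Timing of the first peak of this component, in dekads.
        angle = (-cmath.phase(coeff)) % (2.0 * math.pi)
        peak = angle / (2.0 * math.pi) * samples_per_year / k
        # A sinusoid of amplitude A contributes A**2 / 2 to the variance.
        share = 100.0 * (amplitude ** 2 / 2.0) / variance if variance > 0 else 0.0
        summary[f"amplitude{k}"] = amplitude
        summary[f"phase{k}"] = peak
        summary[f"variance_pct{k}"] = share
    return summary


# Synthetic two-year signal: annual cycle peaking at dekad 9,
# plus a weaker bi-annual cycle peaking at dekad 3.
signal = [0.4
          + 0.2 * math.cos(2 * math.pi * (t - 9) / 36)
          + 0.05 * math.cos(2 * math.pi * 2 * (t - 3) / 36)
          for t in range(72)]
summary = fourier_summary(signal)
```

For this synthetic signal the summary recovers, to within floating-point error, a mean of 0.4, an annual amplitude of 0.2 and an annual peak at dekad 9.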

Fourier variable	NDVI	CCD	VPD	PRICE	CH3
Mean	NDVIAMP0	CDDAMP0	VPDAMP0	PRIAMP0	CH3AMP0
Amplitude1	NDVIAMP1	CDDAMP1	VPDAMP1	PRIAMP1	CH3AMP1
Amplitude2	NDVIAMP2	CDDAMP2	VPDAMP2	PRIAMP2	CH3AMP2
Amplitude3	NDVIAMP3	CDDAMP3	VPDAMP3	PRIAMP3	CH3AMP3
Phase1	NDVIPHS1	CDDPHS1	VPDPHS1	PRIPHS1	CH3PHS1
Phase2	NDVIPHS2	CDDPHS2	VPDPHS2	PRIPHS2	CH3PHS2
Phase3	NDVIPHS3	CDDPHS3	VPDPHS3	PRIPHS3	CH3PHS3
Variance of Mean	NDVIVAR	CDDVAR	VPDVAR	PRIVAR	CH3VAR
Variance 1*	NDVIDPF1	CDDDPF1	VPDDPF1	PRIDPF1	CH3DPF1
Variance 2*	NDVIDPF2	CDDDPF2	VPDDPF2	PRIDPF2	CH3DPF2
Variance 3*	NDVIDPF3	CDDDPF3	VPDDPF3	PRIDPF3	CH3DPF3
Variance All*	NDVIDPFA	CDDDPFA	VPDDPFA	PRIDPFA	CH3DPFA
Min	NDVIMIN	CDDMIN	VPDMIN	PRIMIN	CH3MIN
Max	NDVIMAX	CDDMAX	VPDMAX	PRIMAX	CH3MAX
Range	NDVIRNG	CDDRNG	VPDRNG	PRIRNG	CH3RNG
*e.g. Variance 1 refers to the % of total signal variance accounted for by the annual cycle of variation.

Digital Elevation Model (DEM) data were obtained from the GTOPO30 1km resolution elevation surface for Africa, produced by the Global Land Information System (GLIS) of the United States Geological Survey, Earth Resources Observation Systems (USGS, EROS) data centre.

Potential Evapotranspiration (PET) data for Africa were provided by FAO AGL, as averaged dekadal means for the years 1961-1990, from which were calculated the mean, minimum and maximum dekadal PET, which were then re-sampled to a 0.05 degree resolution.

Maps of the Length of Growing Period for both continents, at a resolution of 0.5 by 0.5 degrees, were made available by FAO AGL.

A series of parameters was calculated using GIS procedures within IDRISI: these include topographic slope; distance to major rivers and the sea; and the x and y image pixel coordinates.
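
As an illustration of one of these derived parameters, the sketch below computes topographic slope in degrees from a small elevation grid using central differences. It is not the IDRISI procedure itself; it assumes square cells with a known size in the same units as elevation, and leaves edge cells undefined. The function name is hypothetical.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for each interior cell of an elevation grid.

    dem       -- list of rows of elevations (same units as cell_size)
    cell_size -- assumed constant cell width and height
    Edge cells have no neighbours on one side and are returned as None.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences in the column (x) and row (y) directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2.0 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


# A plane rising 1 unit per cell towards the east, for illustration only.
dem = [[0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0]]
centre_slope = slope_degrees(dem, cell_size=1.0)[1][1]
```

In a latitude/longitude grid the ground distance covered by a cell varies with latitude, so a real calculation would need to convert cell size to ground units first.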

The human population data used for Africa and Asia are derived from three sources: a global coverage of population number per image pixel obtained from the University of California at Berkeley and provided by FAO AGL at 5 minute resolution; a population density coverage at the same resolution from the Consortium for International Earth Science Information Network (CIESIN: http://www.ciesin.org), derived from data collated by the National Centre for Geographic Information and Analysis (NCGIA: http://www.ncgia.ucsb.edu); and the population data from the Intergovernmental Authority on Drought and Development countries used in the earlier work (FAO, 1996). The average of these three estimates was calculated using the raster image manipulation functions within the IDRISI software.
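
The per-pixel averaging of the population estimates can be sketched as below. This is an illustration only, not the IDRISI operation used: the treatment of missing values (averaging whichever estimates are available for a pixel, and returning no value where none is) is an assumption, as are the function name and the example grids.

```python
def average_estimates(grids):
    """Per-pixel mean of several co-registered grids.

    grids -- list of grids (lists of rows) with identical dimensions;
             None marks a pixel with no estimate in that grid (assumed).
    """
    rows, cols = len(grids[0]), len(grids[0][0])
    averaged = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [g[r][c] for g in grids if g[r][c] is not None]
            out_row.append(sum(values) / len(values) if values else None)
        averaged.append(out_row)
    return averaged


# Hypothetical 1 x 3 grids standing in for the three population sources.
source_a = [[10.0, None, None]]
source_b = [[20.0, 4.0, None]]
source_c = [[30.0, None, None]]
population = average_estimates([source_a, source_b, source_c])
```

Here the first pixel is the mean of three estimates, the second takes the single available value, and the third has no estimate at all.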

The cattle density and cultivation levels used were predicted using regression analyses, fully described in FAO (1999). This process requires an observed data set for each that can be used to 'train' the predictor variables described above, thereby establishing the relationships between observed and predictor variables using a large number of sample points. These relationships can then be applied to the predictor variable images to produce predictions at the imagery resolution.

An ongoing project, funded by the UK Department for International Development (DFID), is currently extending these approaches to produce continental predictions of cattle density and cropping level, for incorporation into an Information System comprising a GIS, a Knowledge Base and a Resource Inventory for sub-Saharan Africa. This work is being undertaken by ERGO Ltd in collaboration with the Programme Against African Trypanosomiasis (PAAT) in FAO Rome, the UK Natural Resources Institute (NRI) and the Trypanosomiasis and Land Use in Africa Research Group (TALA) at Oxford University. These data have been used extensively in the present study, and are fully described in the forthcoming PAATIS documentation. They are summarised below.

The observed data for cattle density are compiled from a wide variety of sources. These include the International Livestock Research Institute; ERGO aerial survey archives for Niger, Nigeria, Sudan, Chad, and Mali; and Government Agricultural Census data from Botswana, Kenya and Ghana.

The observed cultivation data were obtained from: the Africa Data Dissemination Service (http://edcintl.cr.usgs.gov/adds/adds.html#adds_data_anchor); FAO AGDAT as used in the FAO GEOWEB service, produced by FAO GIEWS (http://geoweb.fao.org/); ERGO/TALA aerial survey archives for Togo, Nigeria, Niger, Mali, Chad, and Sudan; transcribed Government Census data from Zambia, Botswana, South Africa, and Swaziland; and FAO GIEWS reports for Angola, Lesotho, Tanzania, Ethiopia, Eritrea, Zaire, and Mozambique.

Name	Description
CDNDLAG	Scaled lag between NDVI and CCD, indicative of delay in growth after rain
RGEZCTPR	Predicted Cattle Density
AFCPRZPR	Predicted Cultivation Percentage
AFKMSEA	Distance to sea (km)
AFXPIX	X coordinate (image pixel column)
AFYPIX	Y coordinate (image pixel row)
AFRIVDEG	Distance to major rivers (degrees)
AFSLPLL	Topographic slope (degrees)
MASK	Continent and Water Mask (Value 255)
POPDN	Human Population Density
ELEVM	Elevation
PHEZ50	Addapix Ecozones - 50 Zones
PETMXDK	Maximum Dekadal Potential Evapotranspiration (Mean 1961-1990)
PETMNDK	Minimum Dekadal Potential Evapotranspiration (Mean 1961-1990)
PETAVDK	Average Dekadal Potential Evapotranspiration (Mean 1961-1990)
LGPR	Predicted Length of Growing Period (Days)

FAO (1999). Agro-Ecological Zones, Farming Systems and Land Pressure in Africa and Asia. Consultancy Report prepared by Environmental Research Group Oxford Ltd and TALA Research Group, Department of Zoology, University of Oxford, for the Animal Health Service of the Animal Production and Health Division of the Food and Agriculture Organisation of the United Nations, Rome, Italy. (Authors: Wint, W., Slingenbergh, J. and Rogers, D.)

Ford, J. and Katondo, K.M. (1977). The distribution of tsetse flies in Africa. Nairobi, OAU. Cook, Hammond & Kell.

Gilbert, M., Jenner, C., Pender, J., Rogers, D., Slingenbergh, J. and Wint, W. (1999). The development and use of the Programme Against African Trypanosomiasis Information System (PAAT-IS). Paper for Proceedings of the Jubilee Meeting of the International Scientific Council for Trypanosomiasis Research and Control (ISCTRC), 27th Sept - 1st Oct 1999, Mombasa, Kenya.

Hay, S.I. and Lennon, J.J. (1999). Deriving meteorological variables across Africa for the study and control of vector-borne disease: a comparison of remote sensing and spatial interpolation of climate. Tropical Medicine and International Health 4, 58-71.

Hay, S.I., Tucker, C.J., Rogers, D.J. and Packer, M.J. (1996). Remotely sensed surrogates of meteorological data for the study of the distribution and abundance of arthropod vectors of disease. Annals of Tropical Medicine and Parasitology 90, 1-19.

Manel, S., Dias, J.M., Buckton, S.T. and Ormerod, S.J. (1999). Alternative methods for predicting species distributions: an illustration with Himalayan river birds. Journal of Applied Ecology 36, 734-747.

Rogers, D.J., Hay, S.I. and Packer, M.J. (1996). Predicting the distribution of tsetse flies in West Africa using temporal Fourier processed meteorological satellite data. Annals of Tropical Medicine and Parasitology 90, 225-241.

Rogers, D.J. and Randolph, S.E. (1993). Distribution of tsetse and ticks in Africa: past, present and future. Parasitology Today 9, 266-271.
