Comparison and selection criterion of missing imputation methods and quality assessment of monthly rainfall
In the Central Rift Valley Lakes Basin of Ethiopia
22 July 2023
Authors: Sisay Kebede Balcha, Taye Alemayehu Hulluka, Adane Abebe Awass, Amare Bantider & Gebiaw T. Ayele
Missing data are a common problem in scientific research, and gap-free records are rare in most developing countries. Statistical and empirical methods are the most frequently used approaches for estimating missing values. The performance of eight missing-data estimation methods was evaluated using percent bias (PBIAS), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE) for stations located in the rift-bounded lakes system of the Katar and Meki sub-basins in Ethiopia. The multicriteria decision method of compromise programming was used to identify the best imputation method. Four homogeneity tests were used to evaluate the homogeneity of the time series data. The Mann–Kendall trend test and Sen's slope were used to detect trends and estimate their magnitude. Multiple linear regression and multiple imputation by chained equations performed well at most stations. The modified version of the inverse distance weighting method (IDWM) based on spatial distance and elevation difference (IDWME&D3 for k=3) ranked third at many stations. Including the elevation difference between stations improved the capability of the inverse distance weighting method. However, the performance of all the estimation methods decreased as the percentage of missing data increased. The radius of influence had no significant impact on the performance of the imputation methods. Only Butajera station exhibited nonhomogeneity. Dagaga, Iteya, Bui, and Butajera stations exhibited decreasing trends, whereas Kulumsa and Ejerese-Lele stations showed an increasing trend in monthly rainfall. The remaining stations showed no significant increasing or decreasing trend in monthly rainfall.
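The three evaluation criteria named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names and the sample values are hypothetical, and the PBIAS sign convention (observed minus estimated, so positive values indicate underestimation) is an assumption, since the abstract does not state which convention was used.

```python
import math


def rmse(observed, estimated):
    """Root mean square error; 0 indicates a perfect match."""
    n = len(observed)
    squared_errors = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared_errors / n)


def pbias(observed, estimated):
    """Percent bias, assuming the (observed - estimated) sign convention."""
    total_error = sum(o - e for o, e in zip(observed, estimated))
    return 100.0 * total_error / sum(observed)


def nse(observed, estimated):
    """Nash-Sutcliffe efficiency; 1 indicates a perfect match."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical monthly rainfall values (mm), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]

print("RMSE :", round(rmse(observed, estimated), 3))
print("PBIAS:", round(pbias(observed, estimated), 3))
print("NSE  :", round(nse(observed, estimated), 3))
```

In a comparison of this kind, each imputation method would be scored by withholding known values at a station, estimating them, and computing these criteria on the withheld values. Lower RMSE, PBIAS closer to zero, and NSE closer to one indicate better performance.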