Overview of missing physical commodity trade data and its imputation using data augmentation |
| |
Affiliation: | 1. Laboratory of Dielectric Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou 310027, China; 2. State Key Laboratory of Silicon Materials, Cyrus Tang Center for Sensor Materials and Applications, Innovation Center for Minimally Invasive Technique and Device, School of Materials Science and Engineering, Zhejiang University, Hangzhou 310027, China; 3. College of Materials Science and Engineering, Zhejiang University of Technology, Hangzhou 310006, China; 1. School of Transportation Science and Engineering, Beihang University, Beijing 100191, China; 2. Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI 48109, USA; 3. University of Michigan Transportation Research Institute (UMTRI), 2901 Baxter Road, Ann Arbor, MI 48109, USA; 1. Instituto de Investigaciones en Ciencia y Tecnología de Materiales (INTEMA), Av. Juan B Justo 4302, B7608FDQ Mar del Plata, Argentina; 2. Electroceramic Department, Instituto de Cerámica y Vidrio, CSIC, Kelsen 5, 28049 Madrid, Spain |
| |
Abstract: | The physical aspects of commodity trade are becoming increasingly important on a global scale for transportation planning, demand management for transportation facilities and services, energy use, and environmental concerns. Such aspects (for example, weight and volume) of commodities are vital for the logistics industry, allowing medium-to-long-term planning at the strategic level and the identification of commodity flow trends. However, incomplete physical commodity trade databases impede proper analysis of trade flows between countries. Physical values may be missing for many reasons, such as (1) non-compliance of reporter countries with the regulations prescribed by the World Customs Organization (WCO), (2) confidentiality issues, (3) delays in data processing, or (4) erroneous reporting. Traditional missing-data imputation methods, such as mean substitution, linear interpolation or extrapolation from adjacent points, regression substitution, and stochastic regression substitution, have been proposed for estimating the physical aspects of commodity trade data. However, a major drawback of these single imputation methods is their failure to incorporate the uncertainty associated with missing data. Advances in computer technology have recently made it possible to use computationally complex stochastic methods to improve the accuracy of imputed data. Therefore, this study proposes a sophisticated data augmentation algorithm to impute missing physical commodity trade data. The key advantage of the proposed approach is that, instead of using a point estimate as the imputed value, it simulates a distribution of the missing data through multiple imputations to reflect uncertainty and to maintain variability in the data.
This approach also provides the flexibility to include fundamental distributional properties of the variables, such as physical quantity, monetary value, price elasticity of demand, price variation, and product differentiation, together with their correlations, to generate reasonable average estimates for statistical inference. An overview of the most commonly used data imputation approaches and their limitations is presented, followed by the theoretical basis and imputation procedure of the proposed approach. Lastly, a case study is presented to demonstrate the merits of the proposed approach in comparison with traditional imputation methods. |
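To make the contrast with single imputation concrete, the following is a minimal sketch of a data augmentation sampler. The abstract does not state the model used in the study, so everything here is an assumption for illustration: a normal linear regression of log weight on log monetary value, a noninformative prior, a fully observed value column, and the hypothetical name `data_augmentation_impute`. The sampler alternates an imputation step (draw the missing weights given the current parameters) with a posterior step (draw the parameters given the completed data) and keeps several widely spaced completed data sets as multiple imputations.

```python
import numpy as np

def data_augmentation_impute(x, y, n_imputations=5, burn_in=200, thin=50, seed=0):
    """Return an (n_imputations, len(y)) array of completed copies of y.

    Assumed model (not taken from the paper): y = b0 + b1 * x + e, with
    e ~ N(0, sigma2), prior p(b, sigma2) proportional to 1 / sigma2,
    x fully observed, and missing y values marked as NaN.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    n = y.size
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)  # used to draw correlated coefficients

    # Starting values: least squares on the complete cases only.
    beta, *_ = np.linalg.lstsq(X[~missing], y[~missing], rcond=None)
    resid = y[~missing] - X[~missing] @ beta
    sigma2 = resid @ resid / (resid.size - 2)

    y_work = y.copy()
    imputations = []
    for it in range(1, burn_in + n_imputations * thin + 1):
        # I-step: draw the missing values given the current parameters.
        y_work[missing] = rng.normal(X[missing] @ beta, np.sqrt(sigma2))

        # P-step: draw the parameters from their posterior given completed data.
        beta_hat = XtX_inv @ X.T @ y_work
        resid = y_work - X @ beta_hat
        sigma2 = resid @ resid / rng.chisquare(n - 2)
        beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(2))

        # Keep widely spaced draws so the imputations are roughly independent.
        if it > burn_in and (it - burn_in) % thin == 0:
            imputations.append(y_work.copy())
    return np.array(imputations)

# Synthetic illustration only; these numbers are not trade data.
rng = np.random.default_rng(1)
log_value = rng.normal(10.0, 1.0, size=200)
log_weight = 1.5 + 0.8 * log_value + rng.normal(0.0, 0.3, size=200)
observed = log_weight.copy()
observed[rng.random(200) < 0.3] = np.nan  # hide roughly 30% of the weights

draws = data_augmentation_impute(log_value, observed)
pooled_estimate = draws.mean(axis=0)           # average over the imputations
imputation_spread = draws.std(axis=0, ddof=1)  # zero where the weight was observed
```

Unlike mean or regression substitution, each missing entry receives several plausible values, and their spread carries the imputation uncertainty into any later analysis. A fuller treatment would model several variables jointly, which this two-variable sketch does not attempt.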
| |
Keywords: | International trade; Commodity flow; Missing data; Data augmentation; Monetary data; Price elasticity of demand |
This article has been indexed by ScienceDirect and other databases. |
|