Similar articles
20 similar articles found.
1.
Traffic data provide the basis for both research and applications in transportation control, management, and evaluation, but real-world traffic data collected from loop detectors or other sensors often contain corrupted or missing data points which need to be imputed for traffic analysis. To this end, we propose a deep learning model named denoising stacked autoencoders for traffic data imputation. We tested and evaluated the model performance with consideration of both temporal and spatial factors. Through these experiments and evaluation results, we developed an algorithm for efficient realization of deep learning for traffic data imputation by training the model hierarchically using the full set of data from all vehicle detector stations. Using data provided by Caltrans PeMS, we have shown that the mean absolute error of the proposed realization is under 10 veh/5-min, a better performance compared with other popular models: the history model, ARIMA model and BP neural network model. We further investigated why the deep learning model works well for traffic data imputation by visualizing the features extracted by the first hidden layer. This work demonstrates the effectiveness as well as efficiency of deep learning in the field of traffic data imputation and analysis.

2.
Although various innovative traffic sensing technologies have been widely employed, incomplete sensor data is one of the major problems that significantly degrade traffic data quality and integrity. In this study, a hybrid approach integrating the Fuzzy C-Means (FCM)-based imputation method with the Genetic Algorithm (GA) is developed for missing traffic volume data estimation based on inductance loop detector outputs. By utilizing the weekly similarity among data, the conventional vector-based data structure is first transformed into the matrix-based data pattern. Then, the GA is applied to optimize the membership functions and centroids in the FCM model. Experimental tests are conducted to verify the effectiveness of the proposed approach. The traffic volume data collected at different temporal scales were used as the testing dataset, and three different indicators, including root mean square error, correlation coefficient, and relative accuracy, are utilized to quantify the imputation performance compared with some conventional methods (Historical method, Double Exponential Smoothing, and Autoregressive Integrated Moving Average model). The results show the proposed approach outperforms the conventional methods under prevailing traffic conditions.
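
The vector-to-matrix step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' FCM-GA method: the fill rule below is a plain per-slot average across weeks, swapped in only to show why the weekly layout helps. The function names, the `period` parameter and the sample values are all hypothetical.

```python
from statistics import mean


def to_weekly_matrix(series, period):
    """Reshape a flat series into rows of `period` samples (one row per week)."""
    if len(series) % period != 0:
        raise ValueError("series length must be a multiple of period")
    return [series[i:i + period] for i in range(0, len(series), period)]


def fill_by_column_mean(matrix):
    """Replace None entries with the mean of the same time slot in other weeks.

    This is a simple stand-in for the FCM-based estimate, not the paper's model.
    """
    filled = [row[:] for row in matrix]
    for col in range(len(matrix[0])):
        observed = [row[col] for row in matrix if row[col] is not None]
        if not observed:
            continue  # no other week has this slot; leave the gap
        col_mean = mean(observed)
        for row in filled:
            if row[col] is None:
                row[col] = col_mean
    return filled


# Hypothetical data: three "weeks" of three samples each, None marks a gap.
volumes = [10, 20, 30, 12, None, 28, 14, 22, None]
weekly = to_weekly_matrix(volumes, period=3)
filled = fill_by_column_mean(weekly)
```

Once the series is laid out one week per row, each column holds the same time slot across weeks, so a missing cell has direct neighbours to borrow from.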

3.
Accurately modeling traffic speeds is a fundamental part of efficient intelligent transportation systems. Nowadays, with the widespread deployment of GPS-enabled devices, it has become possible to crowdsource the collection of speed information to road users (e.g. through mobile applications or dedicated in-vehicle devices). Despite its rather wide spatial coverage, crowdsourced speed data also brings important challenges, such as the highly variable measurement noise in the data due to a variety of driving behaviors and sample sizes. When not properly accounted for, this noise can severely compromise any application that relies on accurate traffic data. In this article, we propose the use of heteroscedastic Gaussian processes (HGP) to model the time-varying uncertainty in large-scale crowdsourced traffic data. Furthermore, we develop an HGP conditioned on sample size and traffic regime (SSRC-HGP), which makes use of sample size information (probe vehicles per minute) as well as previously observed speeds, in order to more accurately model the uncertainty in observed speeds. Using 6 months of crowdsourced traffic data from Copenhagen, we empirically show that the proposed heteroscedastic models produce significantly better predictive distributions when compared to current state-of-the-art methods for both speed imputation and short-term forecasting tasks.

4.
Vehicle flow forecasting is of crucial importance for the management of road traffic in complex urban networks, as well as a useful input for route planning algorithms. In general, traffic prediction models rely on data gathered by different types of sensors placed on roads, which occasionally produce faulty readings due to several causes, such as malfunctioning hardware or transmission errors. Filling in those gaps is relevant for constructing accurate forecasting models, a task addressed by diverse strategies, from a simple null value imputation to complex spatio-temporal context imputation models. This work elaborates on two machine learning approaches to update missing data with no gap length restrictions: a spatial context sensing model based on the information provided by surrounding sensors, and an automated clustering analysis tool that seeks optimal pattern clusters in order to impute values. Their performance is assessed and compared to other common techniques and different missing data generation models over real data captured from the city of Madrid (Spain). The newly presented methods are found to be fairly superior when portions of missing data are large or very abundant, as occurs in most practical cases.

5.
Estimating missing values is known as data imputation. Previous research has shown that genetic algorithm (GA)-designed locally weighted regression (LWR) and time delay neural network (TDNN) models can generate more accurate hourly volume imputations for a period of 12 successive hours than traditional methods used by highway agencies. It would be interesting and important to further refine the models for imputing larger missing intervals. Therefore, a large number of genetically designed LWR and TDNN models are developed in this study and used to impute up to a week-long missing interval (168 hours) for sample traffic counts obtained from various groups of roads in Alberta, Canada. It is found that road type and functional class have considerable influence on reliable imputations. The reliable imputation durations range from 4–5 days for traffic counts with the most unstable patterns to over 10 days for those with the most stable patterns. The study results clearly show that calibrated GA-designed models can provide reliable imputations for missing data with ‘block patterns’, and demonstrate their further potential in traffic data programs.

6.
Global Positioning System (GPS) technologies have been increasingly considered as an alternative to traditional travel survey methods to collect activity-travel data. Algorithms applied to extract activity-travel patterns vary from informal ad-hoc decision rules to advanced machine learning methods and differ in accuracy. This paper systematically compares the relative performance of different algorithms for the detection of transportation modes and activity episodes. In particular, naive Bayesian, Bayesian network, logistic regression, multilayer perceptron, support vector machine, decision table, and C4.5 algorithms are selected and compared for the same data according to their overall error rates and hit ratios. Results show that the Bayesian network has a better performance than the other algorithms in terms of the percentage of correctly identified instances and Kappa values for both the training data and test data, in the sense that the Bayesian network is relatively efficient and generalizable in the context of GPS data imputation.

7.
Accurate estimation of travel time is critical to the success of advanced traffic management systems and advanced traveler information systems. Travel time estimation also provides basic data support for travel time reliability research, which is being recognized as an important performance measure of the transportation system. This paper investigates a number of methods to address the three major issues associated with travel time estimation from point traffic detector data: data filling for missing or erroneous data, speed transformation from time‐mean speed to space‐mean speed, and travel time estimation that converts the speeds recorded at detector locations to travel time along the highway segment. The case study results show that the spatial and temporal interpolation of missing data and the transformation to space‐mean speed improve the accuracy of the estimates of travel time. The results also indicate that the piecewise constant‐acceleration‐based method developed in this study and the average speed method produce better results than the other three methods proposed in previous studies. Copyright © 2011 John Wiley & Sons, Ltd.
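
The speed transformation named in this abstract can be sketched briefly. The example assumes individual vehicle spot speeds are available at a detector, in which case space-mean speed is commonly estimated as their harmonic mean, while time-mean speed is their arithmetic mean. The paper's own transformation and its travel time methods may differ; the function names and numbers are hypothetical.

```python
def time_mean_speed(spot_speeds):
    """Arithmetic mean of spot speeds observed at a point."""
    return sum(spot_speeds) / len(spot_speeds)


def space_mean_speed(spot_speeds):
    """Harmonic mean of spot speeds, a standard estimate of space-mean speed."""
    return len(spot_speeds) / sum(1.0 / v for v in spot_speeds)


def travel_time_seconds(length_km, speed_kmh):
    """Travel time over a segment at a constant speed (illustrative only)."""
    return length_km / speed_kmh * 3600.0


# Hypothetical spot speeds in km/h at one detector.
speeds = [40.0, 60.0]
tms = time_mean_speed(speeds)   # 50 km/h
sms = space_mean_speed(speeds)  # about 48 km/h, never above the time mean
```

Because slower vehicles spend longer on the segment, the harmonic mean sits at or below the arithmetic mean, so using time-mean speed directly tends to understate travel time.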

8.
Procedures to transform GPS tracks into activity-travel diaries have been increasingly addressed because of their potential to replace traditional methods used in travel surveys. Existing approaches for data annotation, however, are not sufficiently accurate and normally require a prompted recall survey for data validation. Imputation algorithms for transportation mode detection seem to be largely dependent on speed-related features, which may blur the quality of classification results, especially with transportation modes having similar speeds. Therefore, in this paper we propose an enhanced integrated imputation approach by incorporating the critical indicators related to trip patterns, reflecting the effects of uncertain travel environments, including bus stops and speed percentiles. A two-step procedure which embeds a segmentation model and a transportation mode inference model is designed and examined based on purified prompted recall data collected in a large-scale travel survey. Results show the superior performance of the proposed approach, where the overall accuracy at trip level reaches 93.2% and 88.1% for training and surveyed data, respectively.

9.
Echaniz, Eneko; Ho, Chinh; Rodriguez, Andres; dell’Olio, Luigi. Transportation, 2020, 47(6): 2903-2921

Collecting data to obtain insights into customer satisfaction with public transport services is very time-consuming and costly. Many factors such as service frequency, reliability and comfort during the trip have been found to be important drivers of customer satisfaction. Consequently, customer satisfaction surveys are quite lengthy, resulting in many interviews not being completed within the on-board time of the passengers/respondents. This paper asks whether it is possible to reduce the amount of information collected without compromising on insights. To address this research question, we conduct a comparative analysis of different Ordered Probit models: one with a full list of attributes versus one with a partial set of attributes. For the latter, missing information was imputed using three different methods that are based on modes, single imputations using predictive models and multiple imputation. Estimation results show that the partial model using the multiple imputation method behaves in a similar way to the model that is based on the full survey. This finding opens an opportunity to reduce interview time, which is critical for most customer satisfaction surveys.

10.
The missing data problem remains a difficulty in a diverse variety of transportation applications, e.g. traffic flow prediction and traffic pattern recognition. To solve this problem, numerous algorithms have been proposed in the last decade to impute the missing data. However, few existing studies have fully used the traffic flow information of neighboring detecting points to improve imputing performance. In this paper, the probabilistic principal component analysis (PPCA) based imputing method, which has been proven to be one of the most effective imputing methods without using temporal or spatial dependence, is extended to utilize the information of multiple points. We systematically examine the potential benefits of multi-point data fusion and study the possible influence of measurement time lags. Tests indicate that the hidden temporal–spatial dependence is nonlinear and could be better retrieved by the kernel probabilistic principal component analysis (KPPCA) based method rather than the PPCA method. Comparison proves that imputing errors can be notably reduced if temporal–spatial dependence has been appropriately considered.

11.
In this study the impact of transportation improvements on Brazilian interregional commodity flows is considered. The decreasing friction of distance is measured by two variants of the gravity model. First, distance coefficients are calculated for trade among all states in 1942 and 1962. Second, distance coefficients for each state's imports are calculated separately and then related to state per capita income for the year 1962. Both the time-series and cross-section results indicate a significant diminution in the friction of distance in the course of Brazilian development. The degree to which trade has integrated the national economy is assessed by the convergence of agricultural prices. Not only have interregional price differentials tended to diminish, but regional price structures are becoming more similar. The interrelation of these price structures provides a method of regionalizing the Brazilian space-economy. Most of the data for this study were collected during the author's tenure as Ford Foundation Visiting Professor at Instituto de Pesquisas Economicas, Universidade de Sao Paulo, Brazil. The Milton Fund and the Department of City and Regional Planning, both of Harvard, sustained the completion of this research. Milton Campanario and Abby Rashid provided invaluable assistance in assembling the data. Jeffrey Dutton of the Harvard Laboratory for Computer Graphics provided a program for calculating distances. William McAuliffe suggested some imaginative interpretations of the factor analysis.

12.
Smartphones have the capability of recording various kinds of data from built-in sensors such as GPS in a non-intrusive, systematic way. In transportation studies, such as route choice modeling, the discrete sequences of GPS data need to be associated with the transportation network to generate meaningful paths. The poor quality of GPS data collected from smartphones precludes the use of state-of-the-art map matching methods. In this paper, we propose a probabilistic map matching approach. It generates a set of potential true paths, and associates a likelihood with each of them. Both spatial (GPS coordinates) and temporal (speed and time) information are used to calculate the likelihood of the data for a specific path. Applications and analyses on real trips illustrate the robustness and effectiveness of the proposed approach. Also, as an application example, a Path-Size Logit model is estimated based on a sample of real observations. The estimation results show the viability of applying the proposed method in a real route choice modeling context.

13.
Big data analytics (BDA) has increasingly attracted strong attention from analysts, researchers and practitioners in railway transportation and engineering. This calls for a review of recent research developments in this field. This survey aims to provide a comprehensive review of the recent applications of big data in the context of railway engineering and transportation through a novel taxonomy framework proposed by Mayring (2003). The survey covers three areas of railway transportation where BDA has been applied, namely operations, maintenance and safety. Also, the level of big data analytics, types of big data models and a variety of big data techniques have been reviewed and summarized. The results of this study identify the existing research gaps and thereby directions of future research in BDA in railway transportation systems.

14.
A new convex optimization framework is developed for the route flow estimation problem from the fusion of vehicle count and cellular network data. The issue of highly underdetermined link flow based methods in transportation networks is investigated, then solved using the proposed concept of cellpaths for cellular network data. This data-driven approach is versatile: it is compatible with other data sources, and it is model agnostic and thus compatible with user equilibrium, system-optimum, Stackelberg concepts, and other models. Using a dimensionality reduction scheme, we design a projected gradient algorithm suitable for the proposed route flow estimation problem. The algorithm solves a block isotonic regression problem in the projection step in linear time. The accuracy, computational efficiency, and versatility of the proposed approach are validated on the I-210 corridor near Los Angeles, where we achieve 90% route flow accuracy with 1033 traffic sensors and 1000 cellular towers covering a large network of highways and arterials with more than 20,000 links. In contrast to long-term land use planning applications, we demonstrate the first system to our knowledge that can produce route-level flow estimates suitable for short time horizon prediction and control applications in traffic management. Our system is open source and available for validation and extension.

15.
Efforts to reduce energy use in freight transportation usually center around “mode-based” approaches, namely improving the energy efficiency of energy intensive modes, such as truck, and shifting more freight to energy efficient modes, such as rail. In the first part of this paper we review the recent trends and future prospects for these mode-based approaches, finding that despite substantial improvement in the technological efficiency of freight modes and robust growth in the use of intermodal rail since 1980, total freight energy use across all modes in the US has grown by approximately 33%, with proportional growth in carbon emissions. In the second part of the paper we propose use of a “commodity-based” approach, in which freight energy use is disaggregated by contribution of major commodity groups, in order to support efficiency improvement at the commodity level. Two potential applications of the commodity-based approach, namely (1) life cycle analysis of energy use for major commodity groups and (2) spatial analysis of freight patterns, are demonstrated using the 1993 US Commodity Flow Survey data. Results of these preliminary findings suggest that commodity groups vary widely in the ratio of energy use in production to energy use in transport, and that for many commodity groups, there may be substantial opportunities for saving energy by redistributing flow patterns. Through development of the commodity-based approach, we also identify the collaborative involvement of shippers and carriers as a key point in improving energy efficiency, since it can be used to both make the mode-based approach more effective and address new issues such as the underlying growth in tonne-km. Benefits for air quality and other transportation issues are also discussed.

16.
Lane-based road information plays a critical role in transportation systems; a lane-based intersection map is the most important component in a detailed road map of the transportation infrastructure. Researchers have developed various algorithms to detect the spatial layout of intersections based on sensor data such as high-definition images/videos, laser point cloud data, and GPS traces, which can recognize intersections and road segments; however, most approaches do not automatically generate Lane-based Intersection Maps (LIMs). The objective of our study is to generate LIMs automatically from crowdsourced big trace data using a multi-hierarchy feature extraction strategy. The LIM automatic generation method proposed in this paper consists of the initial recognition of road intersections, intersection layout detection, and lane-based intersection map generation. The initial recognition process identifies intersection and non-intersection areas using spatial clustering algorithms based on the similarity of angle and distance. The intersection layout is composed of exit and entry points, obtained by combining trajectory integration algorithms and turn rules at road intersections. The LIM is finally derived from the intersection layout detection results and lane-based road information, based on geometric matching algorithms. The effectiveness of our proposed LIM generation method is demonstrated using crowdsourced vehicle traces. Additional comparisons and analysis are also conducted to confirm recognition results. Experiments show that the proposed method saves time and facilitates LIM refinement from crowdsourced traces more efficiently than methods based on other types of sensor data.

17.
Existing methods for calibrating link fundamental diagrams (FDs) often focus on a limited number of links and use grouping strategies that are largely dependent on roadway physical attributes alone. In this study, we propose a big data-driven two-stage clustering framework to calibrate link FDs for freeway networks. The first stage captures, under normal traffic state, the variations of link FDs over multiple days based on which links are clustered in the second stage. Two methods, i.e. the standard k-means algorithm combined with hierarchical clustering and a modified hierarchical clustering based on the Fréchet distance, are applied in the first stage to obtain the FD parameter matrix for each link. The calibrated matrices are input into the second stage where the modified hierarchical clustering is re-employed as a static approach resulting in multiple clusters of links. To further consider the variations of link FDs, the static approach is extended by modifying the similarity measure through principal component analysis (PCA). The resulting multivariate time-series clustering models the distributions of the FD parameters as a dynamic approach. The proposed framework is applied to the Melbourne freeway network using one year's worth of loop detector data. Results have shown that (a) similar roadway physical attributes do not necessarily result in similar link FDs, (b) the connectivity-based approach performs better in clustering link FDs as compared with the centroid-based approach, and (c) the proposed framework helps achieve a better understanding of the spatial distribution of links with similar FDs and the associated variations and distributions of the FD parameters.
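
The Fréchet distance mentioned in this abstract compares two curves while respecting the order of their points. A minimal sketch of the discrete variant follows, using the standard dynamic-programming recurrence. The abstract does not say which variant the authors use or how FD curves are sampled, so the discrete form, the function name and the sample curves are assumptions.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines given as point lists."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance for prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Hypothetical curves: two parallel lines one unit apart.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
gap = discrete_frechet(curve_a, curve_b)
```

A pairwise matrix of such distances could then be handed to a hierarchical clustering routine, which is one plausible reading of the first-stage method.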

18.
A promising alternative transportation mode to address growing transportation and environmental issues is bicycle transportation, which is human-powered and emission-free. To increase the use of bicycles, it is fundamental to provide bicycle-friendly environments. The scientific assessment of a bicyclist’s perception of roadway environment, safety and comfort is of great interest. This study developed a methodology for categorizing bicycling environments defined by the bicyclist’s perceived level of safety and comfort. Second-by-second bicycle speed data were collected using global positioning systems (GPS) on public bicycles. A set of features representing the level of bicycling environments was extracted from the GPS-based bicycle speed and acceleration data. These data were used as inputs for the proposed categorization algorithm. A support vector machine (SVM), which is a well-known heuristic classifier, was adopted in this study. A promising rate of 81.6% for correct classification demonstrated the technical feasibility of the proposed algorithm. In addition, a framework for bicycle traffic monitoring based on data and outcomes derived from this study was discussed, which is a novel feature for traffic surveillance and monitoring.

19.
Location-based check-in services in various social media applications have enabled individuals to share their activity-related choices providing a new source of human activity data. Although geo-location data has the potential to infer multi-day patterns of individual activities, appropriate methodological approaches are needed. This paper presents a technique to analyze large-scale geo-location data from social media to infer individual activity patterns. A data-driven modeling approach, based on topic modeling, is proposed to classify patterns in individual activity choices. The model provides an activity generation mechanism which when combined with the data from traditional surveys is potentially a useful component of an activity-travel simulator. Using the model, aggregate patterns of users’ weekly activities are extracted from the data. The model is extended to also find user-specific activity patterns. We extend the model to account for missing activities (a major limitation of social media data) and demonstrate how information from activity-based diaries can be complemented with longitudinal geo-location information. This work provides foundational tools that can be used when geo-location data is available to predict disaggregate activity patterns.

20.
License plate recognition (LPR) data are emerging data sources that provide rich information for estimating the traffic conditions of urban arterials. While large-scale LPR systems are not common in the US, the last few years have seen rapid development and implementation in many other parts of the world (e.g. China, Thailand and the Middle East). Due to privacy issues, LPR data are seldom available to research communities. However, when available, this data source can be valuable in estimating real-time operational metrics in transportation systems. This paper proposes a lane-based real-time queue length estimation model using LPR data. In the model, an interpolation method based on Gaussian processes is developed to reconstruct the equivalent cumulative arrival–departure curve for each lane. The missing information for unrecognized or unmatched vehicles is obtained from the reconstructed arrival curve. With the complete arrival and departure information, a car-following based simulation scheme is applied to estimate the real-time queue length for each lane. The proposed model is validated using ground truth information on the maximum queue lengths from the city of Langfang in China. The results show that the model can capture the variations in queue lengths in the ground truth data, and the maximum queue length for each signal cycle can be estimated with reasonable accuracy. The estimated queue length information from the proposed model can serve as a useful performance metric for various real-time traffic control applications.
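
The role of the cumulative arrival–departure curve can be illustrated with a simplified sketch. It is not the paper's model, which reconstructs the curves with a Gaussian process and then runs a car-following simulation. Here the vehicle count between two detection points at time t is taken as the vertical gap between the cumulative curves. The timestamps and function names are hypothetical.

```python
from bisect import bisect_right


def cumulative_count(event_times, t):
    """Number of events at or before time t; event_times must be sorted."""
    return bisect_right(event_times, t)


def vehicles_in_queue(arrivals, departures, t):
    """Vertical gap between cumulative arrival and departure curves at t."""
    return cumulative_count(arrivals, t) - cumulative_count(departures, t)


# Hypothetical timestamps in seconds for one lane during one signal cycle.
arrivals = [0, 2, 4, 6, 8]
departures = [5, 7, 9, 11, 13]
queue_at_6 = vehicles_in_queue(arrivals, departures, 6)
max_queue = max(vehicles_in_queue(arrivals, departures, t) for t in arrivals)
```

This count treats every vehicle between the two points as queued, so it is only an upper-bound style approximation of the physical queue that the paper estimates.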
