Abstract: This study aims to improve the accuracy of traffic prediction and to explore the spatio-temporal characteristics of urban traffic data by proposing a traffic flow prediction model based on Bi-LSTM (bidirectional long short-term memory) for urban traffic grid clusters. The trajectory dataset of ride-hailing vehicles was gridded. Because manually choosing the number of clusters can influence the results, the Bayesian information criterion was used for parameter estimation to determine the number of clusters. A Gaussian mixture model was then employed to cluster grids with similar traffic conditions, yielding distinct traffic grid clusters. A Multi-to-Multi model was designed to account for the mutual influence of the input time series of the traffic grids within each cluster, and a Bi-LSTM model was established to predict traffic flow in the non-overlapping clusters. For experimental validation, the classical MLRA (multiple linear regression analysis) served as the control group, and four performance metrics, MAE (mean absolute error), RMSE (root mean square error), MAPE (mean absolute percentage error) and DTW (dynamic time warping), were used to evaluate the prediction results comprehensively, confirming the feasibility and superiority of the Bi-LSTM model for urban traffic grid cluster flow prediction. The results showed that the traffic values predicted by both the MLRA and Bi-LSTM models for urban traffic grid clusters were generally smaller than the real values, with more pronounced discrepancies during morning peak hours. Increasing the data volume improved the prediction performance of both models. Traffic state dynamics within each traffic grid cluster were similar, displaying strong intra-cluster correlation. Both models achieved good traffic prediction results, with Bi-LSTM outperforming MLRA.
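The four evaluation metrics named above can be illustrated with a minimal pure-Python sketch. It rests on assumptions not stated in the abstract: MAPE is expressed as a fraction rather than a percentage, DTW uses the absolute difference as the local cost with no warping window, and the sample series are invented for illustration. The study's exact formulations may differ.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (assumes no zero actual values)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def dtw_distance(series_a, series_b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning the first i and j points
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance only in series_a
                cost[i][j - 1],      # advance only in series_b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical flow values for one grid cluster (not data from the study)
actual = [10.0, 12.0, 15.0, 11.0]
predicted = [9.0, 11.0, 13.0, 12.0]

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAPE:", mape(actual, predicted))
print("DTW :", dtw_distance(actual, predicted))
```

MAE, RMSE and MAPE compare the series point by point, whereas DTW allows the two series to be stretched along the time axis before they are compared, which is why it is used here as a similarity measure for the overall trend.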
In terms of model accuracy, the Bi-LSTM model achieved lower MAE, RMSE and MAPE (3.0687, 4.2943 and 0.3045, respectively) than MLRA (3.2011, 4.4009 and 0.3187, respectively), improvements of 4.14%, 2.40% and 4.46%, respectively. The constructed Bi-LSTM model exhibited higher accuracy, lower error and better generalization performance. In terms of similarity evaluation, the DTW results of MLRA and Bi-LSTM were 52938.6356 and 54815.1055, respectively. Based on the DTW similarity results for the weekday and holiday time series, the Bi-LSTM model was 3.42% more robust than the MLRA model. By considering the characteristics of urban traffic flow and leveraging the advantages of gridding traffic trajectory data, the Bi-LSTM-based model for urban traffic grid cluster traffic prediction exhibited higher accuracy, lower error and better robustness than the MLRA traffic flow prediction model. In terms of the DTW metric, the Bi-LSTM-based model also captured the real traffic variation trend and demonstrated excellent performance in urban traffic flow prediction.
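The accuracy gains quoted above appear consistent with a relative error reduction measured against the MLRA baseline. That formula is an assumption, since the abstract does not state how the percentages were derived. The sketch below applies it to the reported MAE, RMSE and MAPE values. The function name `relative_improvement` is hypothetical.

```python
def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline (assumed formula)."""
    return (baseline_error - model_error) / baseline_error * 100.0


# Metric values as reported in the abstract
mlra = {"MAE": 3.2011, "RMSE": 4.4009, "MAPE": 0.3187}
bi_lstm = {"MAE": 3.0687, "RMSE": 4.2943, "MAPE": 0.3045}

improvements = {
    name: relative_improvement(mlra[name], bi_lstm[name]) for name in mlra
}

for name, pct in improvements.items():
    print(f"{name}: Bi-LSTM error is {pct:.2f}% lower than MLRA")
```

Under this assumption the computed values land within a few hundredths of a percentage point of the reported 4.14%, 2.40% and 4.46%.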