As one of the most vulnerable entities within the transportation system, pedestrians may face more danger and sustain more severe injuries in traffic crashes than other road users. The safety of pedestrians is particularly critical within the context of continuous traffic safety improvements in the United States. Moreover, traffic crash data are inherently heterogeneous, and such heterogeneity can lead to incorrect conclusions in many ways. Therefore, proper modeling approaches need to be developed and applied to identify the causes of pedestrian-vehicle crashes and better ensure the safety of pedestrians.
Meanwhile, with the development of artificial intelligence techniques, a variety of novel machine learning methods have been established. Compared to conventional discrete choice models (DCMs), machine learning models are more flexible, with few or no prior assumptions about input variables, and are more adaptable in handling outliers and missing or noisy data. Furthermore, crash data have inherent patterns related to both space and time. Crashes that occurred in locations with highly aggregated uptrend patterns are therefore worth exploring in order to examine the most recently deteriorating factors affecting pedestrian injury severities.
The major goal of this study is to develop a framework for modeling and analyzing pedestrian injury severities in single-pedestrian-single-vehicle crashes that provides higher resolution in identifying contributing factors and their associated effects on pedestrian injury severities, particularly the most recently deteriorating factors. Both conventional DCMs and the selected machine learning model, XGBoost, are developed. Detailed comparisons among all developed models show that the XGBoost model outperforms all conventional DCMs on all selected measures. In addition, an emerging hotspot analysis is used to identify the most targeted hotspots, followed by a proposed XGBoost model that analyzes the most recently deteriorating factors affecting pedestrian injury severities. Completing these tasks could help bridge the gap between theory and practice. A summary and conclusions of the research are provided, and directions for further research are given at the end.