Model Results. An SVM approach was applied to investigate ridesourcing adoption behavior, considering mode dependency and attitudinal factors. Using balanced class weights with a linear-kernel SVM, the hyperparameter C = 1 was selected. Precision is the ratio of true positives to the sum of true positives and false positives. Recall is the ratio of true positives to the sum of true positives and false negatives. Accuracy is the ratio of all correctly predicted observations (true positives + true negatives) to the whole sample. The model performance is summarized in Table 18.

Hyperparameter C                 1                       0.25
Class weights
  W-non-frequent                 balanced                2.5
  W-frequent                     balanced                8
Train data
  Overall accuracy               86.40%                  92.30%
                                 Precision   Recall      Precision   Recall
  Majority (non-frequent)        0.99        0.86        0.98        0.94
  Minority (frequent)            0.4         0.92        0.56        0.78
Test data
  Overall accuracy               85%                     90.40%
                                 Precision   Recall      Precision   Recall
  Majority (non-frequent)        0.97        0.86        0.97        0.93
  Minority (frequent)            0.36        0.76        0.51        0.71

The model showed an overall accuracy of 86.4% and 85% on the training and test sets, respectively, with no clear sign of overfitting, as the training and test results are close. However, a closer look at the confusion matrix reveals that the model’s accuracy is mainly due to its performance on the majority class (non-frequent riders). In contrast, the precision of minority-class predictions is quite low: 0.4 and 0.36 for the training and test sets, respectively. This is a critical issue given the nature of the study. The main objective of the model is to capture frequent riders, who make up less than 10% of the sample. In this regard, underestimating the number of frequent riders (i.e., false negatives, or type II error) might not be as crucial as overestimating it (false positives, or type I error).
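The precision, recall and accuracy definitions above can be checked with a few lines of Python. The sketch below is illustrative only: the confusion-matrix counts are hypothetical (a 1,000-person sample with 80 frequent riders, chosen to roughly echo the imbalance described here) and are not the study data, and the function name `class_metrics` is an assumption made for this example.

```python
def class_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) for the positive class."""
    precision = tp / (tp + fp)                   # TP / (TP + FP)
    recall = tp / (tp + fn)                      # TP / (TP + FN)
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # all correct / whole sample
    return precision, recall, accuracy


# Hypothetical counts (NOT the study data): 1,000 respondents, 80 frequent riders.
tp, fn = 61, 19      # frequent riders: 61 identified, 19 missed
fp, tn = 129, 791    # non-frequent riders: 129 wrongly flagged, 791 correct

precision, recall, accuracy = class_metrics(tp, fp, fn, tn)
print(f"minority precision = {precision:.2f}")
print(f"minority recall    = {recall:.2f}")
print(f"overall accuracy   = {accuracy:.2f}")
```

With these made-up counts the overall accuracy is about 0.85 while minority-class precision is only about 0.32: the large majority class dominates accuracy, and even a modest false-positive rate among non-frequent riders outnumbers the true positives.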
Hence, it seems reasonable to slightly sacrifice the recall of the minority class in exchange for an increase in precision. To this end, we further adjusted the class weights and re-ran the grid search in search of better models. Consequently, we were able to improve the model by increasing the misclassification penalty on the minority group and decreasing it on the majority class. The resulting model (C = 0.25, with class weights of 2.5 and 8 for the non-frequent and frequent classes) raised minority-class precision on the test set from 0.36 to 0.51, while recall fell from 0.76 to 0.71 (Table 18).

In terms of contributing factors, Table 19 presents the model coefficients. It reveals that millennials showed the highest positive impact on frequent usage of ridesourcing. This seems reasonable considering that millennials are highly involved in school, work, and social activities, and are the leading generation in the adoption and use of technologies. In contrast, mixed results are observed for Generation X and baby boomers. This is consistent with the literature, where boomers and Generation X tend to be more selective when they are offered new technologies. As expected, there was a positive association between education and frequent ridesourcing. In particular, those with graduate and undergraduate degrees showed the highest positive impact on frequent ridesourcing adoption. The positive correlation between education and technology adoption has been well documented in the literature. In terms of income, the low (below $75k) and very high (above $200k) categories discouraged regular ridesourcing. This is broadly in line with common sense, as mid-to-high income people tend to use ridesourcing most on a regular basis. According to statistics published by Uber in 2017, around 44% of riders fall within the middle 50% of income (▇▇▇.▇▇▇▇▇▇▇▇.▇▇▇). Ethnicity is another variable that we focused on. Hispanics were the most likely to use ridesourcing services regularly, while Asians were the least likely.
Among employment types, the model indicates that unemployed people were the most likely to use ridesourcing regularly, while self-employment discouraged frequent ridesourcing usage. For mobility expenses, we looked at parking time and parking cost for private car users, as well as access/waiting time for transit users, as additional expenses imposed on travelers that could potentially be saved by using ridesourcing. Our hypothesis is that higher costs associated with conventional modes may lead to higher usage of ridesourcing to avoid such costs. The model results did show a generally positive association between high costs and more frequent ridesourcing usage, except for very high parking time (30 minutes or above) and high transit access time (15 minutes or above). These exceptions might indicate areas where high congestion discourages driving, so that transit may be a better option than ridesourcing.

Table 19 Linear SVM Model Coefficients

Variables                            Coefficients
Age
  25-29                              0.819544
  30-34                              0.501355
  35-39                              0.629495
  45-49                              0.168338
  50-54                              -0.466283
  55-59                              0.2753
  60-64                              -0.475362
Ethnicity
  Hispanic/Latino                    0.595465
  Asian                              -1.188994
Employment
  Part-time                          -0.101923
  Unemployed                         2.016964
  Other/self-employed                -1.036305
Income
  0-25k                              -0.269743
  25-50k                             -0.541611
  50-75k                             -0.311766
  100-125k                           -0.277185
  >200k                              -0.008079
Mobility Expenses
  Parking fare: $10-15               0.712856
  Parking fare >= $20                1.460178
  Parking time 0-5 mins              -0.490483
  Parking time 10-15 mins            -0.774337
  Parking time 20-30 mins            1.493997
  Parking time >= 30 mins            -0.240898
  Transit access time 15-30 mins     -0.786305
Attitudinal Factors
  Technology savviness               0.135294
  Mode choice reasoning              0.111572
  Trust issues                       -0.216203
  Joy of driving                     0.▇▇▇▇▇▇
Mode Dependency Factors
  Highly car-dependent               -0.083075
  Car passengers                     0.328291
  Transit users                      0.990097
  Daily commuters                    1.▇▇▇▇▇▇
  Drivers from big families          0.0614
  Students without license           0.▇▇▇▇▇▇

In view of attitudinal factors, technology savviness and mode choice reasoning tended
to encourage ridesourcing frequency. This is quite reasonable. Ridesourcing, by definition, is a direct manifestation of technology adoption and is expected to increase as people become more technology-oriented. Likewise, as individuals learn more about the higher level of service associated with ridesourcing options, they tend to use them more frequently, which explains the positive coefficient of the mode choice reasoning factor. Trust is still a big issue for travelers and hinders the use of ridesourcing. Interestingly, those who enjoyed driving tended to use ridesourcing frequently.

All six mode dependency factors showed significant impacts. Highly car-dependent individuals appeared less likely to use ridesourcing frequently, followed by those from large families (with a high number of drivers and vehicles). In the former case, the person uses his/her car for almost every daily activity, and there seems to be no desire for other alternatives as long as the person has access to a private vehicle. In the latter case, there seems to be an abundance of private vehicles and drivers in the household, so using ridesourcing is not a priority. Transit-dependent individuals and those whose car usage is limited to daily commutes were the most likely to be frequent ride-hailers, which implies that they view ridesourcing as a suitable mobility option. As expected, students without driver’s licenses were likely to use ridesourcing frequently. Details of the modeling can be found in ▇▇▇▇▇ et al. (2020a).
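Because the kernel is linear, each coefficient in Table 19 adds directly to the decision score w·x + b, which is why the signs can be read as encouraging or discouraging frequent use. The sketch below sums a few Table 19 coefficients for two hypothetical respondents. It rests on several assumptions that are not in the source: the selected variables are treated as 0/1 indicators, the frequent-rider class is taken as the positive side of the decision boundary, the intercept is not reported and is set to 0.0 as a placeholder, and only a subset of variables is included. The scores therefore do not reproduce the fitted model's predictions.

```python
# Subset of the Table 19 coefficients (linear SVM weights).
COEFFICIENTS = {
    "age_25_29": 0.819544,
    "age_50_54": -0.466283,
    "hispanic_latino": 0.595465,
    "asian": -1.188994,
    "unemployed": 2.016964,
    "other_self_employed": -1.036305,
    "income_25_50k": -0.541611,
}

# Assumption: the source does not report the intercept; 0.0 is a placeholder.
INTERCEPT = 0.0


def decision_score(active_indicators, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Sum the weights of the indicators that equal 1 for one respondent."""
    return intercept + sum(coefficients[name] for name in active_indicators)


# Two hypothetical respondents, each described by their active 0/1 indicators.
respondent_a = ["age_25_29", "hispanic_latino", "unemployed"]
respondent_b = ["age_50_54", "asian", "other_self_employed", "income_25_50k"]

for label, person in (("A", respondent_a), ("B", respondent_b)):
    print(f"respondent {label}: partial score = {decision_score(person):+.6f}")
```

Under these assumptions respondent A's partial score is positive and respondent B's is negative, matching the direction of the effects discussed in the text. With the actual intercept and the full feature vector, the sign of the complete sum would determine the predicted class.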