I think the obvious problem is that induction (which is what learning from data is) is only one tool in the vast space that is intelligence, and it will never be enough to emulate the skill of a human driver, which is more or less what is necessary for autonomy in an open environment.
The "induction" that machine learning algorithms do also isn't the same as the induction that humans perform. We induce new concepts from experience (data) -- and that description itself presupposes consciousness in both "concepts" and "experience".
Thinking of computers as getting more "intelligent" like humans is a category error -- computers are dumb matter configured in an intelligent way by actual intelligence (humans) to perform certain tasks for us. We get better at telling them how to perform those tasks (software), but there's no reason to think we're moving along some continuum of intelligence towards ourselves.
That's the premise of deep learning - inducing high level concepts from experience, without manual feature engineering.
DL models are good at induction; what they can't handle is generalisation (staying accurate outside the training distribution). And self-driving has a long tail of rare situations, which is why it's so hard.
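The out-of-distribution point can be shown with a toy regression (a deliberately simple illustration, not a DL model): fit a curve on a narrow input range, then evaluate it far outside that range.

```python
import numpy as np

# Toy illustration of in- vs out-of-distribution accuracy:
# fit a cubic polynomial to noisy samples of sin(x) on x in [0, 3]
# (the "training distribution"), then query it far outside that range.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 3, 200)
y_train = np.sin(x_train) + rng.normal(0, 0.05, 200)

coeffs = np.polyfit(x_train, y_train, deg=3)  # induction over the data

# In-distribution: the fit tracks sin(x) closely.
x_in = 1.5
err_in = abs(np.polyval(coeffs, x_in) - np.sin(x_in))

# Out-of-distribution: the same model, applied at x = 10, is wildly off,
# because nothing constrains a polynomial where it saw no data.
x_out = 10.0
err_out = abs(np.polyval(coeffs, x_out) - np.sin(x_out))

print(f"in-distribution error:      {err_in:.3f}")
print(f"out-of-distribution error:  {err_out:.3f}")
```

The model "induces" the pattern perfectly well where it has data; the failure mode is the query it has never seen -- which is exactly what a long tail of rare driving situations keeps supplying.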
> and it will never be enough to emulate the skill of a human driver
It may never be enough to beat the best human drivers, but it only needs to beat most human drivers to be worth it, and we're not that far off.
Your objection reminds me of the skeptics of spell check and grammar check. In principle, a perfect spell and grammar checker would also need general AI to fully understand a language and what you're intending to express. Fortunately, imperfect spell and grammar checkers are all that most of us need.