I think it could become a problem in the future, but probably not an insurmountable one.
For one, I suspect that clean data sources will always be available, though they could become a lot more expensive to obtain. As an extreme example, you could always source your data by recording in-person conversations.
Also, as AI improves, I’m guessing it will handle bad data more gracefully and will be able to reach the same effectiveness when trained on a smaller dataset.
Is there any way to validate these claims?