From a practical standpoint, ActivityPub may be the obvious choice, as it gives easier interop with the largest federated platforms.
But what else? There are existing platforms built on these protocols, such as Movim for XMPP, and another one for Matrix whose name I forget.
From a technical standpoint, are there any major pros and cons?
And what do you do about other clients? What happens when the user wants to clear messages on the server when they’re fetched, but doesn’t want to do that for the social network rooms? What about moderation?
XMPP is good at a very specific thing and I don’t think its users would like all the necessary changes.
None of your questions are about the protocol; they are about implementation details of the application.
My point is that the protocol doesn’t work well for the use case.
And my point is that your complaints are not related to the protocol, but to the applications using it.
Not only that, we are talking about different layers of the OSI model. XMPP should be compared with HTTP, not ActivityPub. There is absolutely nothing stopping someone from implementing the ActivityStreams vocabulary on XMPP.
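To make that last point a bit more concrete, here is a rough Python sketch of what carrying an ActivityStreams object inside an XMPP message stanza could look like. It is only an illustration under assumptions: the `urn:example:activitystreams:0` namespace, the `<activity>` element, and the helper names are hypothetical and not taken from any XEP or existing client. Only the `jabber:client` stanza namespace and the ActivityStreams `@context` URL are real.

```python
import json
import xml.etree.ElementTree as ET

# Real ActivityStreams 2.0 context URL.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
# Hypothetical payload namespace, made up for this sketch (not a registered XEP).
PAYLOAD_NS = "urn:example:activitystreams:0"


def build_activity(actor, content):
    """Build a minimal ActivityStreams 'Create' activity wrapping a Note."""
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": actor,
        "object": {"type": "Note", "content": content},
    }


def wrap_in_stanza(activity, sender, recipient):
    """Serialize the activity as JSON inside an XMPP <message/> stanza."""
    message = ET.Element(
        "{jabber:client}message",
        {"from": sender, "to": recipient, "type": "normal"},
    )
    # Hypothetical child element that carries the JSON payload as text.
    payload = ET.SubElement(message, f"{{{PAYLOAD_NS}}}activity")
    payload.text = json.dumps(activity)
    return ET.tostring(message, encoding="unicode")


def unwrap_stanza(xml_text):
    """Parse the stanza and recover the ActivityStreams object."""
    message = ET.fromstring(xml_text)
    payload = message.find(f"{{{PAYLOAD_NS}}}activity")
    return json.loads(payload.text)
```

For example, `unwrap_stanza(wrap_in_stanza(build_activity("xmpp:alice@example.org", "Hello"), "alice@example.org", "bob@example.net"))` gives back the original activity dict. The vocabulary travels unchanged, and only the transport differs. Delivery, storage, and moderation would still be up to the application.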