Unfortunately it doesn’t work like that…
You need to be able to control where the character looks, and you need the objects he or she looks at to actually be in the scene. You need to be able to tilt the head; sometimes you might need a hand touching the face, and so on. So you have to move a LOT of your scene into XSI, and you have to duplicate parts of your rig with their complete functionality. If your rig uses any app-specific elements, it might not be possible at all. (A sketch of the kind of behaviour you'd have to rebuild follows below.)
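To make "control where the character looks" concrete, this is the sort of rig behaviour that would have to be rebuilt on the XSI side. A minimal, app-agnostic aim sketch; the axis conventions and names here are my own assumptions, not either app's API:

```python
import math

def look_at(eye, target):
    """Return (yaw, pitch) in degrees that aim the +Z axis from eye at target.

    Assumes a Y-up, right-handed convention; sign conventions vary per app,
    so treat this as a sketch of the behaviour, not a drop-in constraint.
    """
    dx = target[0] - eye[0]
    dy = target[1] - eye[1]
    dz = target[2] - eye[2]
    yaw = math.degrees(math.atan2(dx, dz))                     # rotation about Y
    pitch = math.degrees(math.atan2(-dy, math.hypot(dx, dz)))  # rotation about X
    return yaw, pitch

# e.g. an eye at the origin aiming at a target up and to the right:
print(look_at((0.0, 0.0, 0.0), (1.0, 1.0, 2.0)))
```

And that's just the eyes: every look-at target the rig aims at has to exist in the second scene too, which is exactly why so much of the scene ends up duplicated.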
Then there are the deformations. What if the entire body is a single mesh, like if you're doing a Hulk, and the rest of the rig lives in Maya? Even with a regular character, the neck and the collar bones deform the same mesh as the face… so where do you cut the pipeline to insert the cached data from XSI, and how do you combine it with the rest? (See the blending sketch below.)
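One way people handle that seam is a per-vertex weight map that fades between the cached facial positions and the rig's own deformation across the neck. A minimal sketch, assuming both sources can be read back as flat point lists for the same vertex order; all names here are hypothetical:

```python
def blend_cached_points(rig_points, cached_points, seam_weights):
    """Per-vertex linear blend between two deformation sources.

    rig_points    -- [(x, y, z), ...] as deformed by the body rig (Maya side)
    cached_points -- [(x, y, z), ...] baked out of XSI for the same vertices
    seam_weights  -- weight per vertex in [0, 1]: 1.0 = fully cached (face),
                     0.0 = fully rig-driven (body), falling off across the neck
    """
    blended = []
    for (rx, ry, rz), (cx, cy, cz), w in zip(rig_points, cached_points, seam_weights):
        blended.append((rx + w * (cx - rx),
                        ry + w * (cy - ry),
                        rz + w * (cz - rz)))
    return blended

# Tiny example: two "face" vertices fully cached, one "neck" vertex half-blended.
rig     = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
cache   = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
weights = [1.0, 1.0, 0.5]
print(blend_cached_points(rig, cache, weights))
```

Even then, painting that falloff so the seam never pops when the head turns is its own job, which is the point: the "cut" is never clean.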
So no, unfortunately it's not that simple. And I haven't even touched on how you really, really need fully shaded and rendered previews to evaluate human facial animation, because so much is changed by shadows, SSS shading and speculars, even by things like eyelashes… and of course you want to see the eyebrows as well. Having to transfer the data to a different app regularly can make test renders quite complicated too: you need to export the cache again and again…
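If you do end up in that loop, the re-export step is at least scriptable. A minimal sketch of baking per-frame point positions to a file; the sample_points callback and the one-line-per-frame text layout are made up for illustration, not any app's cache format:

```python
def export_point_cache(path, sample_points, start_frame, end_frame):
    """Write one line per frame: frame number, then x y z for each vertex."""
    with open(path, "w") as f:
        for frame in range(start_frame, end_frame + 1):
            points = sample_points(frame)  # hypothetical: deformed positions at 'frame'
            flat = " ".join("%.6f %.6f %.6f" % p for p in points)
            f.write("%d %s\n" % (frame, flat))

# Dummy sampler so the sketch runs stand-alone; in practice this would pull
# the deformed mesh from the scene at each frame.
def fake_sampler(frame):
    return [(frame * 0.1, 0.0, 0.0)]

export_point_cache("face_cache.txt", fake_sampler, 1, 24)
```

Automating it helps, but it doesn't remove the round trip itself: every lighting tweak still means bake, transfer, reload, render.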