1. If you have three AD9361 devices, you need three cores, unless you synchronize the devices externally and merge the functions within a single core. This affects the DMA interface: you will now need to pass 12 channels to the DMA (each AD9361 has two receive paths, each carrying I and Q data, so four channels per device). There is no effect on the other cores; they don't care. At the system level, however, you will have added a heavy load on the bandwidth.
2. Yes. It is not just writes; keep in mind you may also have to poll, service interrupts, and make decisions. There are lightweight engines that do this, commonly referred to as micro-coded engines. You may find some on Google.
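
To put a rough number on the bandwidth load in point 1, here is a back-of-the-envelope sketch. It assumes 16-bit sample words, all receive channels enabled, and the AD9361's maximum sample rate of 61.44 MSPS. None of these figures come from the question, so substitute your own.

```python
# Rough receive-side bandwidth estimate for several AD9361 devices.
# Assumptions (not from the post above): 16-bit sample words, every
# channel enabled, one direction only (RX).

def rx_bandwidth_bytes_per_s(num_devices, sample_rate_hz, bytes_per_sample=2):
    """Return the raw RX data rate in bytes per second."""
    channels_per_device = 4  # 2 RX paths x (I, Q)
    channels = num_devices * channels_per_device
    return channels * bytes_per_sample * sample_rate_hz


if __name__ == "__main__":
    # Three devices at the assumed 61.44 MSPS maximum rate.
    bw = rx_bandwidth_bytes_per_s(3, 61_440_000)
    print(f"{bw / 1e6:.2f} MB/s")  # prints "1474.56 MB/s"
```

Under those assumptions the receive path alone is about 1.47 GB/s, and transmit data adds to it, which is why the load shows up at the system level rather than in any single core.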
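
To illustrate what a micro-coded engine in point 2 does, here is a toy software model. Everything in it is hypothetical: the opcode names, the `FakeBus` class, and the register addresses are made up for this example and do not correspond to any particular engine. A real one would be a small state machine in hardware reading its program from a ROM or RAM.

```python
# Toy model of a micro-coded sequencer: a short program of
# (opcode, args...) tuples is interpreted against a register bus.
# All names and addresses below are hypothetical, for illustration only.

WRITE, POLL, JUMP_IF_CLEAR, HALT = "WRITE", "POLL", "JUMP_IF_CLEAR", "HALT"


class FakeBus:
    """Stand-in for a register interface such as SPI or AXI-Lite."""

    def __init__(self):
        self.regs = {}
        self.reads = 0

    def write(self, addr, value):
        self.regs[addr] = value

    def read(self, addr):
        self.reads += 1
        # Pretend the status register at 0x10 reports "done" from the
        # third read onward.
        if addr == 0x10 and self.reads >= 3:
            return 0x1
        return self.regs.get(addr, 0)


def run(program, bus, max_steps=1000):
    """Interpret the program; return the address of the HALT reached."""
    pc = 0
    for _ in range(max_steps):
        op, *args = program[pc]
        if op == WRITE:
            addr, value = args
            bus.write(addr, value)
            pc += 1
        elif op == POLL:
            addr, mask = args
            if bus.read(addr) & mask:
                pc += 1  # otherwise stay here and poll again
        elif op == JUMP_IF_CLEAR:
            addr, mask, target = args
            pc = target if (bus.read(addr) & mask) == 0 else pc + 1
        elif op == HALT:
            return pc
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("step budget exhausted")


PROGRAM = [
    (WRITE, 0x00, 0x01),             # start an operation
    (POLL, 0x10, 0x1),               # wait for the done bit
    (JUMP_IF_CLEAR, 0x11, 0x1, 4),   # decision: skip recovery if no error
    (WRITE, 0x00, 0x03),             # recovery write (error path)
    (HALT,),
]

if __name__ == "__main__":
    bus = FakeBus()
    print("halted at", run(PROGRAM, bus), "regs:", bus.regs)
```

The point of the sketch is that writes, polling, and branching all fit in a handful of opcodes, so the sequence can run without a full processor. Interrupt servicing is left out here to keep the example short.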