I finally got the SDRAM responding last night; I'd been writing decimal 100 to a register rather than &100 (0x100), and it didn't like that. That's what you get for not taking open source code directly!
The next task is to introduce (memory) maps and threads.
Initially, I won't dynamically allocate these things, just create a few example ones with simple interfaces to prove the thread and map management code works.
One map will implement the serial port driver.
I can think of two events that could leave the caches/TLB confused:
1. Change of thread
2. Change of map
In the first case, there is a context switch where one thread moves from the running state to either the runnable or blocked state, and another thread moves from runnable to running.
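The thread-state transition described above can be sketched in C. This is only an illustration under assumptions: the names `thread_state`, `struct thread` and `context_switch` are hypothetical, as is the saved-register layout, and the real register save/restore and any cache/TLB maintenance are left as a comment.

```c
#include <stddef.h>

/* Hypothetical names throughout; the real kernel structures aren't shown. */
enum thread_state { THREAD_RUNNING, THREAD_RUNNABLE, THREAD_BLOCKED };

struct thread {
    enum thread_state state;
    unsigned int regs[16];      /* saved r0-r15 (assumed layout) */
};

/*
 * Move `current` out of the running state (to runnable or blocked) and
 * move `next` from runnable to running.
 * Returns 0 on success, -1 if the requested transition isn't a valid one.
 */
int context_switch(struct thread *current, struct thread *next,
                   enum thread_state new_state_for_current)
{
    if (current == NULL || next == NULL || current == next)
        return -1;
    if (current->state != THREAD_RUNNING || next->state != THREAD_RUNNABLE)
        return -1;
    if (new_state_for_current == THREAD_RUNNING)
        return -1;

    current->state = new_state_for_current;
    next->state = THREAD_RUNNING;

    /* A real kernel would save and restore the ARM registers here and
     * perform whatever cache/TLB maintenance turns out to be needed. */
    return 0;
}
```

Rejecting invalid transitions up front makes it easier to spot scheduler bugs separately from cache/TLB bugs.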
In the second case, a thread makes a call to an object implemented in another map. (This will be done by storing the parameters to the call at a known location in thread-local storage and then accessing a location in non-user-accessible memory associated with the destination object. The map will be switched to the destination object's map and the thread will continue at a location within that map.)
(Security note: the ARM registers will have to be preserved in the calling map and cleared before entering the called map, to ensure there's no leak of information from one map to another.)
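The register handling in the security note could look roughly like the following. It is a sketch under assumptions, not the actual kernel code: `cpu_regs`, `map_call_frame`, `enter_map` and `leave_map` are hypothetical names, only r0-r12 are modelled (sp, lr and pc are assumed to be handled separately), and restoring the caller's registers on return is an assumption about how the return path would work.

```c
#include <string.h>

#define NUM_REGS 13             /* r0-r12; sp/lr/pc assumed handled elsewhere */

struct cpu_regs {
    unsigned int r[NUM_REGS];
};

/* Per-call record kept on the calling map's side (hypothetical layout). */
struct map_call_frame {
    struct cpu_regs saved;      /* caller's registers, preserved for return */
    int caller_map;             /* identifier of the map to return to */
};

/* Preserve the caller's registers, then zero the live set so nothing
 * leaks into the called map. Parameters travel via thread-local storage,
 * so the callee doesn't need any register contents. */
void enter_map(struct cpu_regs *live, struct map_call_frame *frame,
               int caller_map)
{
    frame->saved = *live;
    frame->caller_map = caller_map;
    memset(live, 0, sizeof *live);
}

/* On return, discard whatever the callee left in the registers and
 * restore the caller's values. */
void leave_map(struct cpu_regs *live, const struct map_call_frame *frame)
{
    *live = frame->saved;
}
```

Overwriting the live registers on the way back also stops the called map leaking information to the caller, which is the same concern in the opposite direction.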
As a simple test of the switching code, I intend to have some threads running that do nothing but increment a single ARM register each while jumping around in their code and checking that their other general-purpose registers remain at zero. If any of these test threads detects a change to a register other than the one it's manipulating, there's a problem with the cache/TLB flushing.
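The check each test thread performs can be modelled in C as below. This is only a model of the logic: the real test threads would have to be written in assembly, since a C compiler uses the general-purpose registers freely. The array standing in for r0-r12 and the name `test_thread_step` are hypothetical.

```c
#define NUM_GP_REGS 13          /* r0-r12, modelled as an array */

/*
 * One iteration of a test thread: increment the register this thread
 * owns, then verify every other register is still zero.
 * Returns -1 if all is well, or the index of the first corrupted register.
 */
int test_thread_step(unsigned int regs[NUM_GP_REGS], int owned)
{
    regs[owned]++;
    for (int i = 0; i < NUM_GP_REGS; i++) {
        if (i != owned && regs[i] != 0)
            return i;           /* someone else's value showed up here */
    }
    return -1;
}
```

Giving each thread a different owned register means a stray non-zero value also hints at which thread's state leaked.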
To test the test, I'll start with a kernel that won't perform any manipulations on the TLB or cache.
Update: The first problem was that the boot code (presumably) had set the Non-secure Vector Base Address Register to 0x00014000, so the handlers I'd put in zero page were never called.
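To illustrate why handlers at address zero were never reached: with the vector base register in use, each exception vector is fetched from the base address plus a fixed offset. The sketch below just does that arithmetic; `arm_vector` and `vector_address` are hypothetical names, and the offsets are the standard ARM exception vector offsets.

```c
#include <stdint.h>

/* Standard ARM exception vector offsets from the vector base. */
enum arm_vector {
    VEC_RESET          = 0x00,
    VEC_UNDEFINED      = 0x04,
    VEC_SVC            = 0x08,
    VEC_PREFETCH_ABORT = 0x0C,
    VEC_DATA_ABORT     = 0x10,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1C
};

/* Address the CPU fetches from for a given exception, given the value
 * held in the Vector Base Address Register. On real hardware the VBAR
 * can be read with: mrc p15, 0, <Rt>, c12, c0, 0 */
static inline uint32_t vector_address(uint32_t vbar, enum arm_vector v)
{
    return vbar + (uint32_t)v;
}
```

With the base at 0x00014000, `vector_address(0x00014000, VEC_IRQ)` gives 0x00014018, nowhere near the 0x00000018 that a zero-page handler table would occupy.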