Memory access on the 80386 to misaligned locations are supported, although they will operate more slowly than their aligned counterparts. If a memory access straddles a page boundary, an access violation will be raised if either page does not support the desired operation (not readable or not writable), and the instruction will not appear to have started. The instruction does not partially-execute before the exception is raised.
If you are unlucky, and your misalignment straddles a page boundary, you can incur multiple page faults until pages on both sides of the boundary are simultaneously ready to accept the operation. If you are super-unlucky, this state may never be achieved and your program will just keep page faulting on that same instruction over and over again until the user terminates it.
Although 80386 does not support symmetric multiprocessor operations, it does support coprocessor operations as well as direct memory access (DMA), so you still need to be aware of atomicity.
Storing values to memory and reading values from memory are atomic operations.¹ If a competing processor writes to or reads the same memory, the result will be completely one value or the other, never a mix of the two.
This atomicity does not extend by default to read-modify-write operations, however.
INC [value] ; may conflict with other processors
It's possible that another processor could write
to the memory between the read and the write of
the INC
instruction.
To prevent another processor from accessing the memory
during a read-modify-write memory operation,
insert a LOCK
prefix in front of the
instruction.
This causes the read-modify-write sequence to occur
atomically.
LOCK INC [value] ; increment atomically
Any memory operation can be prefixed with a LOCK
,
and the processor will prevent any other processors from
accessing the memory for the duration of that instruction.
This works even for unaligned memory accesses!
The LOCK
prefix is superfluous for simple reads
and writes,
since those are already atomic.
It adds value only for read-modify-write instructions.
The LOCK
prefix is also superfluous for the
XCHG
instruction,
because the processor automatically locks the bus during
an exchange.
This automatic lock is for backward compatbility purposes,
because XCHG
was a common way to perform
test-and-set operations on earlier versions of the processor.
Note that many atomic operations
are not available in the form we have become
accustomed to:
Although you can perform an atomic increment or
decrement, or atomic add or subtract,
you don't receive the arithmetic result.
The only atomic result from an arithmetic operation
on memory is the flags.
Therefore,
the only information you got back from
the
InterlockedIncrement
or
InterlockedDecrement
functions was the sign of the result.
You could try to read the memory back to see what
the result was,
but that would be a separate instruction, outside
the scope of the LOCK
,
and therefore is not part of the overall atomic
operation.
The 80386 has no compare-exchange instruction,
so there was no
InterlockedCompareExchange
available for the 80386.
You did get a straight
InterlockedExchange
, though.
Okay, so that's atomic operations and memory alignment. Next time, we'll start looking at Windows software conventions.
¹ The operations are atomic, but not synchronized.