===== Documentation/DocBook/deviceiobook.tmpl 1.5 vs edited =====
--- 1.5/Documentation/DocBook/deviceiobook.tmpl 2004-10-25 13:06:49 -07:00
+++ edited/Documentation/DocBook/deviceiobook.tmpl 2004-10-29 15:38:01 -07:00
@@ -195,7 +195,12 @@
be strongly ordered coming from different CPUs. Thus it's important
to properly protect parts of your driver that do memory-mapped writes
with locks and use the mmiowb to make sure they
- arrive in the order intended.
+ arrive in the order intended. Issuing a regular readX
+ will also ensure write ordering, but should only be used
+ when the driver has to be sure that the write has actually arrived
+ at the device (not that it's simply ordered with respect to other
+ writes), since a full readX is a relatively
+ expensive operation.
@@ -203,29 +208,50 @@
releasing a spinlock that protects regions using writeb
or similar functions that aren't surrounded by
readb calls, which will ensure ordering and flushing. The
- following example (again from qla1280.c) illustrates its use.
+ following pseudocode illustrates what might occur if write ordering
+ isn't guaranteed via mmiowb or one of the
+ readX functions.
- sp->flags |= SRB_SENT;
- ha->actthreads++;
- WRT_REG_WORD(®->mailbox4, ha->req_ring_index);
-
- /*
- * A Memory Mapped I/O Write Barrier is needed to ensure that this write
- * of the request queue in register is ordered ahead of writes issued
- * after this one by other CPUs. Access to the register is protected
- * by the host_lock. Without the mmiowb, however, it is possible for
- * this CPU to release the host lock, another CPU acquire the host lock,
- * and write to the request queue in, and have the second write make it
- * to the chip first.
- */
- mmiowb(); /* posted write ordering */
+CPU A: spin_lock_irqsave(&dev_lock, flags)
+CPU A: ...
+CPU A: writel(newval, ring_ptr);
+CPU A: spin_unlock_irqrestore(&dev_lock, flags)
+ ...
+CPU B: spin_lock_irqsave(&dev_lock, flags)
+CPU B: writel(newval2, ring_ptr);
+CPU B: ...
+CPU B: spin_unlock_irqrestore(&dev_lock, flags)
+ In the case above, newval2 could be written to ring_ptr before
+ newval. Fixing it is easy though:
+
+
+
+CPU A: spin_lock_irqsave(&dev_lock, flags)
+CPU A: ...
+CPU A: writel(newval, ring_ptr);
+CPU A: mmiowb(); /* ensure no other writes beat us to the device */
+CPU A: spin_unlock_irqrestore(&dev_lock, flags)
+ ...
+CPU B: spin_lock_irqsave(&dev_lock, flags)
+CPU B: writel(newval2, ring_ptr);
+CPU B: ...
+CPU B: mmiowb();
+CPU B: spin_unlock_irqrestore(&dev_lock, flags)
+
+
+
+ See tg3.c for a real world example of how to use mmiowb
+
+
+
+
PCI ordering rules also guarantee that PIO read responses arrive
- after any outstanding DMA writes on that bus, since for some devices
+ after any outstanding DMA writes from that bus, since for some devices
the result of a readb call may signal to the
driver that a DMA transaction is complete. In many cases, however,
the driver may want to indicate that the next
@@ -234,7 +260,11 @@
readb_relaxed for these cases, although only
some platforms will honor the relaxed semantics. Using the relaxed
read functions will provide significant performance benefits on
- platforms that support it.
+ platforms that support it. The qla2xxx driver provides examples
+ of how to use readX_relaxed. In many cases,
+ a majority of the driver's readX calls can
+ safely be converted to readX_relaxed calls, since
+ only a few will indicate or depend on DMA completion.