IPW2200 Driver Locking Internals

               by Zhu Yi <yi.zhu@intel.com> on 26-08-2005
	     reviewed by Ben Cahill <ben.m.cahill@intel.com>


1. Hardware/Firmware resources that need to be protected
2. Software critical regions
3. Code paths in ipw2200 driver in which concurrency could happen
4. Locking methods used in ipw2200 driver


1. Hardware/Firmware resources that need to be protected

1) Hardware registers:

   - IPW_EVENT_REG
   - IPW_INTA_RW
     (no need to protect since only touched in isr and load time)
   - TX, RX, TX_CMD queue pointers and dma address registers

2) Sending control commands:

   All hardware commands (e.g. associate, set channel, etc.) are
   sent through the tx_cmd queue. The tx_cmd queue is a ring buffer
   like the tx and rx ring buffers; the queue head and tail must be
   protected. In theory, hardware commands can be sent
   asynchronously. In practice, some commands shouldn't be sent out
   until the prior commands return successfully (for example, the
   associate command is sent only after an essid command succeeds).
   Sending hardware commands usually happens at configuration time,
   and is much less frequent than sending packets. For this reason,
   we implemented only the synchronous method. The implementation
   needs to guarantee that, at any time, only one thread is able to
   send a hardware command.
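
   As a rough illustration of protecting the queue head and tail,
   here is a user space sketch in C. It is not the driver's code:
   struct cmd_ring, CMD_RING_SIZE and the cmd_ring_*() helpers are
   hypothetical names, and a pthread mutex stands in for whatever
   lock the kernel code would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_RING_SIZE 8u  /* hypothetical size; must be a power of two */

struct cmd_ring {
	pthread_mutex_t lock;         /* protects head and tail */
	unsigned int head;            /* free-running producer index */
	unsigned int tail;            /* free-running consumer index */
	uint8_t slots[CMD_RING_SIZE]; /* one command id per slot */
};

void cmd_ring_init(struct cmd_ring *r)
{
	assert(r != NULL);
	pthread_mutex_init(&r->lock, NULL);
	r->head = 0;
	r->tail = 0;
}

/* Queue one command; returns false if the ring is full. */
bool cmd_ring_put(struct cmd_ring *r, uint8_t cmd)
{
	bool ok = false;

	pthread_mutex_lock(&r->lock);
	if (r->head - r->tail < CMD_RING_SIZE) {
		r->slots[r->head % CMD_RING_SIZE] = cmd;
		r->head++;
		ok = true;
	}
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/* Take the oldest command; returns false if the ring is empty. */
bool cmd_ring_get(struct cmd_ring *r, uint8_t *cmd)
{
	bool ok = false;

	pthread_mutex_lock(&r->lock);
	if (r->head != r->tail) {
		*cmd = r->slots[r->tail % CMD_RING_SIZE];
		r->tail++;
		ok = true;
	}
	pthread_mutex_unlock(&r->lock);
	return ok;
}
```

   Because the full/empty checks and the index updates happen under
   the same lock, two senders can never claim the same slot.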


2. Software critical regions

1) Device status maintained in the driver (e.g. STATUS_RF_KILL_HW):
   These bits may be checked and updated concurrently, so the
   operations on them should be atomic.

2) List operations:
   Adding and deleting list elements must be protected.

3) Waitqueues:
   Make sure a race condition will not prevent a wait_event from
   waking up.

4) Other global shared variables:
   priv->assoc_network
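
   For item 1), the idea of atomic status updates can be sketched in
   user space with C11 atomics. This is only an illustration: struct
   demo_priv, the DEMO_STATUS_* values and the status_*() helpers are
   made up here, and the kernel would use its own bit operations
   rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit values; the driver's real STATUS_* values may differ. */
#define DEMO_STATUS_RF_KILL_HW (1u << 0)
#define DEMO_STATUS_ASSOCIATED (1u << 1)

struct demo_priv {
	atomic_uint status; /* all status bits live in one atomic word */
};

/* Set bits without losing a concurrent update to other bits. */
void status_set(struct demo_priv *p, unsigned int bits)
{
	assert(bits != 0);
	atomic_fetch_or(&p->status, bits);
}

/* Clear bits atomically. */
void status_clear(struct demo_priv *p, unsigned int bits)
{
	atomic_fetch_and(&p->status, ~bits);
}

/* Report whether any of the given bits is currently set. */
bool status_test(struct demo_priv *p, unsigned int bits)
{
	return (atomic_load(&p->status) & bits) != 0;
}

/*
 * Set bits and report whether any of them was already set. The check
 * and the update are one atomic step, so two threads cannot both see
 * "was clear".
 */
bool status_test_and_set(struct demo_priv *p, unsigned int bits)
{
	return (atomic_fetch_or(&p->status, bits) & bits) != 0;
}
```

   A plain "status |= bit" is a read-modify-write and can drop an
   update made by another thread between the read and the write; the
   fetch-or/fetch-and forms avoid that.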


3. Code paths in ipw2200 driver in which concurrency could happen

   Although we should always lock data, not code, knowing where
   concurrency could happen helps to understand the problem
   better. The following code paths might access shared global
   resources:


   A) ipw2200 Interrupt handler
   B) ipw2200 Tasklet
   C) ipw2200 callbacks for ieee80211 stack
   D) ipw2200 workqueue
   E) ipw2200 functions executed in user space process (e.g. iwconfig) context

   For A, there will be no concurrent thread running at the same time
   since only one CPU handles the interrupt and the interrupt line is
   disabled at this time (hw_interrupt_type->ack()).

   For B, the only possible concurrent thread is A (the Linux kernel
   tasklet implementation guarantees that a given tasklet will run on
   only one CPU at any time). So, for code in B, using
   local_irq_{disable,enable}() is enough to protect a resource shared
   with A.
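
   The pattern for B can be imitated in user space, purely as an
   analogy: a signal handler plays the interrupt handler, and
   blocking the signal plays the role of the kernel's
   local_irq_disable()/local_irq_enable(). The names fake_isr,
   shared_count and update_shared_masked are invented for this
   sketch, which assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handler_runs; /* handler invocations */
static volatile sig_atomic_t shared_count; /* data shared with handler */

/* Plays the role of the interrupt handler (A). */
static void fake_isr(int signo)
{
	(void)signo;
	handler_runs++;
	shared_count++;
}

/*
 * Plays the role of tasklet code (B): mask the "interrupt", update
 * the shared data, unmask. Returns how many times the handler ran
 * inside the critical section, which is 0 by construction.
 */
int update_shared_masked(void)
{
	struct sigaction sa;
	sigset_t block, old;
	int runs_before, runs_inside, rc;

	sa.sa_handler = fake_isr;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	rc = sigaction(SIGUSR1, &sa, NULL);
	assert(rc == 0);
	(void)rc;

	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &old); /* like local_irq_disable() */

	runs_before = handler_runs;
	raise(SIGUSR1);     /* the "interrupt" fires but stays pending */
	shared_count += 10; /* cannot be interleaved with fake_isr */
	runs_inside = handler_runs - runs_before;

	/* Like local_irq_enable(): the pending handler runs now. */
	sigprocmask(SIG_SETMASK, &old, NULL);
	return runs_inside;
}
```

   The handler only runs once the mask is restored, so the update to
   shared_count is never interrupted halfway.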

   C depends on how these callbacks are actually called from the
   ieee80211 stack. The rx path of the stack currently runs in
   tasklet context (although it will be handled in net rx softirq
   context once the native ieee80211 stack is used). The tx path of
   the stack runs in net tx softirq context. So callbacks that are
   eventually called from the stack tx and rx paths have the same
   requirements as B, except that softirqs can run on different CPUs
   at the same time. The other callbacks, which are not called from
   the driver Tx/Rx paths, have the same behavior as E.

   D is in process context. It is worth noting that the default nice
   value for a workqueue is -5. When a nice 0 user process context
   function calls queue_work(), the work is likely scheduled
   immediately. So we shouldn't make any assumptions about which
   thread will be executed first in the driver.  The locking
   requirement is the same as E.

   E can run concurrently with all of the code paths above, including
   itself. Even on a UP kernel, because of preemption, E can possibly
   run pseudo-concurrently with other parts of kernel code.


4. Locking methods used in ipw2200 driver

1) Synchronous hardware command sending method.

   To prevent another thread from running concurrently, while also
   making the best use of the local CPU, putting the current thread
   to sleep until the command response comes back is a natural way
   to go. Since the function will sleep, it must be called from
   process context and it must not hold a spinlock during
   sleep. Depending on whether the callers want to wait until the
   resource becomes available (lock semantics), or want to get a
   response as soon as possible if the resource is not available
   currently (trylock semantics), we can implement it with either
   a semaphore or a simpler atomic_t bit.
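
   The two flavours can be sketched in user space as follows. This
   is an illustration under assumptions, not the driver's
   implementation: a pthread mutex stands in for the semaphore, a
   C11 atomic_flag for the atomic_t bit, a condition variable for
   the waitqueue, and fake_firmware() is an invented stand-in for
   the hardware response (it simply answers cmd + 1).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Trylock semantics: a single busy flag, never sleeps. */
static atomic_flag cmd_busy = ATOMIC_FLAG_INIT;

bool cmd_try_begin(void)
{
	/* true means this caller claimed the right to send */
	return !atomic_flag_test_and_set(&cmd_busy);
}

void cmd_end(void)
{
	atomic_flag_clear(&cmd_busy);
}

/* Lock semantics: senders queue up on a sleeping lock. */
static pthread_mutex_t cmd_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t resp_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t resp_cond = PTHREAD_COND_INITIALIZER;
static bool resp_ready;
static int resp_value;

/* Called from the response path to wake the sleeping sender. */
void cmd_complete(int value)
{
	pthread_mutex_lock(&resp_lock);
	resp_value = value;
	resp_ready = true;
	pthread_cond_signal(&resp_cond);
	pthread_mutex_unlock(&resp_lock);
}

/* Invented stand-in for the hardware: replies with cmd + 1. */
static void *fake_firmware(void *arg)
{
	cmd_complete(*(int *)arg + 1);
	return NULL;
}

/* Send one command and sleep until its response arrives. */
int cmd_send_sync(int cmd)
{
	pthread_t fw;
	int value, rc;

	pthread_mutex_lock(&cmd_lock); /* wait for our turn */

	pthread_mutex_lock(&resp_lock);
	resp_ready = false;
	pthread_mutex_unlock(&resp_lock);

	rc = pthread_create(&fw, NULL, fake_firmware, &cmd);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&resp_lock);
	while (!resp_ready) /* re-check: a wakeup cannot be lost */
		pthread_cond_wait(&resp_cond, &resp_lock);
	value = resp_value;
	pthread_mutex_unlock(&resp_lock);

	pthread_join(fw, NULL);
	pthread_mutex_unlock(&cmd_lock);
	return value;
}
```

   Note that cmd_lock is a sleeping lock, so holding it across the
   wait is fine; a spinlock held there would not be. Testing the
   resp_ready predicate under resp_lock is what keeps a response
   that arrives early from being missed.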

2) Areas that need to be improved

   The current ipw2200 driver uses a semaphore in every workqueue and
   wireless extension entry point. This is an example of locking code
   instead of locking data. Sigh! Some low level shared resources are
   protected by spinlocks, but not all. This causes some duplication
   and inefficiency in the driver. We need to find all the shared
   resources, and add protection for them. When this is done, the
   higher level code locking can be removed.
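
   What "locking data" could look like is sketched below in user
   space C. Everything here is hypothetical (struct demo_list,
   demo_list_add, entry_point, run_two_paths): the shared list
   carries its own lock, so the two entry points that touch it need
   no lock of their own.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NODES_PER_PATH 1000

struct demo_node {
	struct demo_node *next;
};

struct demo_list {
	pthread_mutex_t lock; /* protects head and count */
	struct demo_node *head;
	size_t count;
};

static struct demo_list shared = { PTHREAD_MUTEX_INITIALIZER, NULL, 0 };

/* The lock is taken where the data is touched, nowhere else. */
void demo_list_add(struct demo_list *l, struct demo_node *n)
{
	assert(l != NULL && n != NULL);
	pthread_mutex_lock(&l->lock);
	n->next = l->head;
	l->head = n;
	l->count++;
	pthread_mutex_unlock(&l->lock);
}

/* Stand-in for one entry point (e.g. a work item or an ioctl). */
static void *entry_point(void *arg)
{
	struct demo_node *nodes = arg;

	for (int i = 0; i < NODES_PER_PATH; i++)
		demo_list_add(&shared, &nodes[i]);
	return NULL;
}

/* Run two entry points concurrently; call once per process. */
size_t run_two_paths(void)
{
	static struct demo_node a[NODES_PER_PATH], b[NODES_PER_PATH];
	pthread_t t1, t2;
	size_t n;

	pthread_create(&t1, NULL, entry_point, a);
	pthread_create(&t2, NULL, entry_point, b);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	pthread_mutex_lock(&shared.lock);
	n = shared.count;
	pthread_mutex_unlock(&shared.lock);
	return n;
}
```

   With the protection inside demo_list_add(), the entry points
   stay lock-free at their own level, and only the short list
   update is serialized rather than the whole code path.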