IPW2200 Driver Locking Internal

by Zhu Yi on 26-08-2005
reviewed by Ben Cahill

1. Hardware/Firmware resources that need to be protected
2. Software critical regions
3. Code paths in ipw2200 driver where concurrency could happen
4. Locking methods used in ipw2200 driver


1. Hardware/Firmware resources that need to be protected

1) Hardware registers:
   - IPW_EVENT_REG
   - IPW_INTA_RW (no need to protect since it is only touched in the isr
     and at load time)
   - TX, RX, TX_CMD queue pointers and dma address registers

2) Sending control command:
   All hardware commands (e.g. associate, setting channel, etc.) are sent
   through the tx_cmd queue. The tx_cmd queue is a ring buffer like the tx
   and rx ring buffers; the queue head and tail must be protected.

   In theory, hardware commands can be sent asynchronously. In practice,
   some commands shouldn't be sent out until the prior commands return
   successfully (for example, the associate command is sent only after an
   essid command succeeds). Sending hardware commands usually happens at
   configuration time and is much less frequent than sending packets. For
   this reason, we implemented only the synchronous method. The
   implementation needs to guarantee that, at any time, only one thread is
   able to send a hardware command.


2. Software critical regions

1) Device status maintained in the driver (e.g. STATUS_RF_KILL_HW):
   This bit may be checked and updated concurrently, so the operation
   should be atomic.

2) List operations:
   Adding and deleting list elements must be protected.

3) Waitqueues:
   Make sure a race condition will not prevent a wait_event from waking up.

4) Other global shared variables:
   priv->assoc_network


3. Code paths in ipw2200 driver where concurrency could happen

Although we should always lock data, not code, knowing where concurrency
could happen helps to understand the problem better.
The following code paths might access shared global resources:

A) ipw2200 Interrupt handler
B) ipw2200 Tasklet
C) ipw2200 callbacks for ieee80211 stack
D) ipw2200 workqueue
E) ipw2200 functions executed in user space process (e.g. iwconfig) context

For A, there will be no concurrent thread running at the same time, since
only one CPU handles the interrupt and the interrupt line is disabled at
this time (hw_interrupt_type->ack()).

For B, the only possible concurrent thread is A (the Linux kernel tasklet
implementation guarantees that a given tasklet will run on only one CPU at
any time). So, for code in B, local_irq_{disable,enable}() is enough to
protect a resource shared with A as long as both run on the same CPU. On an
SMP kernel the interrupt may be handled on a different CPU than the one
running the tasklet; in that case a spinlock taken with
spin_lock_irqsave() is needed.

C depends on how these callbacks are actually called from the ieee80211
stack. The rx path of the stack currently runs in tasklet context (although
it will be handled in net rx softirq context once the native ieee80211
stack is used). The tx path of the stack runs in net tx softirq context.
So callbacks that are eventually called from the stack tx and rx paths
behave the same as B, except that a softirq may run on different CPUs at
the same time. The other callbacks, which are not called from the driver
Tx/Rx paths, have the same behavior as E.

D is in process context. It is worth noting that the default nice value for
a workqueue is -5. When a nice 0 user process context function calls
queue_work(), the work is likely to be scheduled immediately. So we
shouldn't make any assumptions about which thread will be executed first in
the driver. The locking requirement is the same as E.

E can run concurrently with all of the above, including itself. Even on a
UP kernel, because of preemption, E can run pseudo-concurrently with other
parts of kernel code.


4. Locking methods used in ipw2200 driver

1) Synchronous hardware command sending method.
   To prevent another thread from running concurrently, while also
   keeping the local CPU available for other work, putting the current
   thread to sleep until the command response comes back is a natural way
   to go. Since the function will sleep, it must be called from process
   context and it must not hold a spinlock while sleeping.

   Depending on whether the callers want to wait until the resource
   becomes available (lock semantics), or want to get a response as soon
   as possible if the resource is not currently available (trylock
   semantics), we can implement it with either a semaphore or a simpler
   atomic_t bit.

2) Areas that need to be improved
   The current ipw2200 driver uses a semaphore in every workqueue and
   wireless extension entry point. This is an example of locking code
   instead of locking data. Sigh!

   Some low level shared resources are protected by spinlocks, but not
   all. This caused some duplication and inefficiency in the driver. We
   need to find all the shared resources and add protection for them. When
   this is done, the higher level code locking can be removed.
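
The trylock variant of the synchronous command method in 4.1) can be
sketched in a few lines. This is only a user-space illustration using C11
atomics, not code from the driver: struct fake_priv, cmd_busy, cmds_sent,
cmd_trybegin(), cmd_end() and send_cmd_sync() are all hypothetical names,
and the step where the real driver would queue the command on the tx_cmd
ring and sleep until the firmware responds is reduced to a counter
increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private data (not struct ipw_priv). */
struct fake_priv {
	atomic_bool cmd_busy;	/* true while one thread owns the command path */
	int cmds_sent;		/* only touched by the owner of cmd_busy */
};

static void fake_priv_init(struct fake_priv *p)
{
	atomic_init(&p->cmd_busy, false);
	p->cmds_sent = 0;
}

/* Trylock semantics: returns true only if the caller now owns the path. */
static bool cmd_trybegin(struct fake_priv *p)
{
	/* atomic_exchange() returns the previous value of the flag. */
	return !atomic_exchange(&p->cmd_busy, true);
}

/* Release the command path; must only be called by the current owner. */
static void cmd_end(struct fake_priv *p)
{
	bool was_busy = atomic_exchange(&p->cmd_busy, false);

	assert(was_busy);
	(void)was_busy;		/* silence unused warning under NDEBUG */
}

/*
 * Send one command synchronously.
 * Returns 0 on success, -1 if another command is already in flight.
 */
static int send_cmd_sync(struct fake_priv *p, int cmd)
{
	(void)cmd;		/* the command payload is not modelled here */

	if (!cmd_trybegin(p))
		return -1;	/* caller gets an answer immediately */

	/* Assumption: the real driver would queue the command and sleep here. */
	p->cmds_sent++;

	cmd_end(p);
	return 0;
}
```

A caller that would rather wait than fail (lock semantics) would use a
semaphore instead of the flag. Either way the protection is attached to
the command path itself, not to each workqueue or wireless extension
entry point.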