From: Chandra Seetharaman dm_any_congested() just checks for the DMF_BLOCK_IO and has no code to make sure that suspend waits for dm_any_congested() to complete. This patch adds such a check. Without it, a race can occur with dm_table_put() attempting to destroying the table in the wrong thread, the one running dm_any_congested() which is meant to be quick and return immediately. Two examples of problems: 1. Sleeping functions called from congested code, the caller of which holds a spin lock. 2. An ABBA deadlock between pdflush and multipathd. The two locks in contention are inode lock and kernel lock. Signed-off-by: Chandra Seetharaman Signed-off-by: Mikulas Patocka Signed-off-by: Alasdair G Kergon --- drivers/md/dm.c | 24 ++++++++++++++++-------- 1 files changed, 16 insertions(+), 8 deletions(-) Index: linux-2.6.28-rc4/drivers/md/dm.c =================================================================== --- linux-2.6.28-rc4.orig/drivers/md/dm.c 2008-11-13 23:28:09.000000000 +0000 +++ linux-2.6.28-rc4/drivers/md/dm.c 2008-11-13 23:30:13.000000000 +0000 @@ -937,16 +937,24 @@ static void dm_unplug_all(struct request static int dm_any_congested(void *congested_data, int bdi_bits) { - int r; - struct mapped_device *md = (struct mapped_device *) congested_data; - struct dm_table *map = dm_get_table(md); + int r = bdi_bits; + struct mapped_device *md = congested_data; + struct dm_table *map; - if (!map || test_bit(DMF_BLOCK_IO, &md->flags)) - r = bdi_bits; - else - r = dm_table_any_congested(map, bdi_bits); + atomic_inc(&md->pending); + + if (!test_bit(DMF_BLOCK_IO, &md->flags)) { + map = dm_get_table(md); + if (map) { + r = dm_table_any_congested(map, bdi_bits); + dm_table_put(map); + } + } + + if (!atomic_dec_return(&md->pending)) + /* nudge anyone waiting on suspend queue */ + wake_up(&md->wait); - dm_table_put(map); return r; }