commit d3d373e0e3f51f335d8c722dd1340ab812fdf94b
Merge: aceb91c 5084f89
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Feb 9 11:51:40 2011 -0800

    Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
    
    * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
      virtio: console: Update Copyright
      virtio: console: Wake up outvq on host notifications

commit aceb91cd351bc3a19a783c901fe8a8070d5f6fa9
Merge: ae8eed2 b8cf0e0
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Feb 9 11:45:21 2011 -0800

    Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
    
    * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
      cdrom: support devices that have check_events but not media_changed
      cfq-iosched: Don't wait if queue already has requests.
      blkio-throttle: Avoid calling blkiocg_lookup_group() for root group
      cfq: rename a function to give it more appropriate name
      cciss: make cciss_revalidate not loop through CISS_MAX_LUNS volumes unnecessarily.
      drivers/block/aoe/Makefile: replace the use of <module>-objs with <module>-y
      loop: queue_lock NULL pointer dereference in blk_throtl_exit
      drivers/block/Makefile: replace the use of <module>-objs with <module>-y
      blktrace: Don't output messages if NOTIFY isn't set.

commit ae8eed2d0906bf0d8eb0c2a4651676a41d361297
Merge: 100b33c 02214dc
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Feb 9 11:44:55 2011 -0800

    Merge branch 'for-linus' of git://neil.brown.name/md
    
    * 'for-linus' of git://neil.brown.name/md:
      FIX: md: process hangs at wait_barrier after 0->10 takeover
      md_make_request: don't touch the bio after calling make_request
      md: Don't allow slot_store while resync/recovery is happening.
      md: don't clear curr_resync_completed at end of resync.
      md: Don't use remove_and_add_spares to remove failed devices from a read-only array
      Add raid1->raid0 takeover support
      md: Remove the AllReserved flag for component devices.
      md: don't abort checking spares as soon as one cannot be added.
      md: fix the test for finding spares in raid5_start_reshape.
      md: simplify some 'if' conditionals in raid5_start_reshape.
      md: revert change to raid_disks on failure.

commit b8cf0e0e552ca48e9a00f518aeb4f5e03984022b
Author: Simon Arlott <simon@fire.lp0.eu>
Date:   Wed Feb 9 14:21:07 2011 +0100

    cdrom: support devices that have check_events but not media_changed
    
    Commit 93aae17af1172c40c6f74b7294e93a90c3cfaa5d ("sr: implement
    sr_check_events()") replaced the media_changed op with the
    check_events op in drivers/scsi/sr.c
    
    All users that check for the CDC_MEDIA_CHANGED capability try both
    the check_events op and the media_changed op, but register_cdrom()
    was requiring media_changed.
    
    This patch fixes the capability checking.
    
    The cdrom_select_disc ioctl is also using the two operations, so
    they should be required for CDC_SELECT_DISC too.
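
    A minimal sketch of the capability check described above, assuming
    that having either op is enough and that the bits are only dropped
    when both are missing.  The demo_* names and CAP_* values are
    hypothetical stand-ins, not the definitions in the cdrom driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits standing in for CDC_MEDIA_CHANGED etc. */
enum {
	CAP_MEDIA_CHANGED = 1 << 0,
	CAP_SELECT_DISC   = 1 << 1,
	CAP_OTHER         = 1 << 2
};

/* Hypothetical subset of a device ops table. */
struct demo_ops {
	unsigned int (*check_events)(void);
	int (*media_changed)(void);
};

/* Example op used to populate a table; reports no events. */
static unsigned int demo_check_events(void)
{
	return 0;
}

/* Return the capabilities that survive registration. */
static unsigned int demo_effective_caps(const struct demo_ops *ops,
					unsigned int caps)
{
	assert(ops != NULL);
	/* Either op can answer the question; clear only if both are absent. */
	if (ops->check_events == NULL && ops->media_changed == NULL)
		caps &= ~(unsigned int)(CAP_MEDIA_CHANGED | CAP_SELECT_DISC);
	return caps;
}
```

    A device that provides only check_events keeps both capability bits,
    while a device that provides neither op loses them.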
    
    Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Kay Sievers <kay.sievers@vrfy.org>
    Tested-by: Chris Clayton <chris2553@googlemail.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit 02a8f01b5a9f396d0327977af4c232d0f94c45fd
Author: Justin TerAvest <teravest@google.com>
Date:   Wed Feb 9 14:20:03 2011 +0100

    cfq-iosched: Don't wait if queue already has requests.
    
    Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
    the last queue of the group to become busy (have at least one request),
    so that the group does not lose out for not being continuously
    backlogged. The commit did not check for the condition that the last
    queue already has some requests. As a result, if the queue already has
    requests, wait_busy is set. Later on, cfq_select_queue() checks the
    flag, and decides that since the queue has a request now and wait_busy
    is set, the queue is expired.  This results in early expiration of the
    queue.
    
    This patch fixes the problem by adding a check to see if queue already
    has requests. If it does, wait_busy is not set. As a result, time slices
    do not expire early.
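
    The decision can be sketched roughly as follows.  This is a
    simplified model, not the cfq code: the demo_cfqq struct and its
    fields are hypothetical, and the last two conditions are assumptions
    about what "last queue of the group" and "slice used" mean here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a queue; not the kernel's struct. */
struct demo_cfqq {
	int queued;          /* requests currently pending in this queue */
	int queues_in_group; /* busy queues in the owning group */
	bool slice_used;     /* the time slice has run out */
};

/* Decide whether to wait for this queue to become busy again. */
static bool demo_should_wait_busy(const struct demo_cfqq *q)
{
	assert(q != NULL);
	if (q->queued > 0)          /* the added check: already has requests */
		return false;
	if (q->queues_in_group > 1) /* not the last queue of its group */
		return false;
	return q->slice_used;
}
```

    With the first check in place, a queue that already holds requests
    never gets wait_busy set, so the later expiry test cannot fire early.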
    
    The queues with more than one request are usually buffered writers.
    Testing shows improvement in isolation between buffered writers.
    
    Cc: stable@kernel.org
    Signed-off-by: Justin TerAvest <teravest@google.com>
    Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
    Acked-by: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit 5084f89303c0a138f66bf74662753f46878989bb
Author: Amit Shah <amit.shah@redhat.com>
Date:   Mon Jan 31 13:06:37 2011 +0530

    virtio: console: Update Copyright
    
    Signed-off-by: Amit Shah <amit.shah@redhat.com>
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

commit 2770c5ea501be69989a7acf54ec4cb3554c47191
Author: Amit Shah <amit.shah@redhat.com>
Date:   Mon Jan 31 13:06:36 2011 +0530

    virtio: console: Wake up outvq on host notifications
    
    The outvq needs to be woken up on host notifications so that buffers
    consumed by the host can be reclaimed, outvq freed, and application
    writes may proceed again.
    
    The need for this was only noticed now that I have qemu patches ready
    to use nonblocking IO and flow control.
    
    CC: Hans de Goede <hdegoede@redhat.com>
    CC: stable@kernel.org
    Signed-off-by: Amit Shah <amit.shah@redhat.com>
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Acked-by: Hans de Goede <hdegoede@redhat.com>

commit 02214dc5461c36da26a34014cab4e1bb484edba2
Author: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Date:   Fri Feb 4 14:18:26 2011 +0100

    FIX: md: process hangs at wait_barrier after 0->10 takeover
    
    The following symptoms were observed:
    1. After a raid0->raid10 takeover operation we have an array with 2
    missing disks.
    When we add a disk for rebuild, the recovery process starts as expected
    but does not finish: it stops at about 90% and the md126_resync process
    hangs in "D" state.
    2. Similar behavior occurs when we have a mounted raid0 array and
    execute a takeover to raid10. When we then try to unmount the array,
    the umount process hangs in "D" state.
    
    In both scenarios the processes hang in the same function, wait_barrier
    in raid10.c.
    The process waits in the macro "wait_event_lock_irq" until the
    "!conf->barrier" condition becomes true.
    In the scenarios above that never happens.
    
    The reason is that at the end of level_store, after calling pers->run,
    we call mddev_resume. This calls pers->quiesce(mddev, 0) with
    RAID10, which calls lower_barrier.
    However raise_barrier hadn't been called on that 'conf' yet,
    so conf->barrier becomes negative, which is bad.
    
    This patch sets conf->barrier=1 after the takeover operation.
    This prevents the barrier from becoming negative after the call to
    lower_barrier().
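
    The counter arithmetic can be illustrated with a small model.  The
    demo_conf struct and helpers below are hypothetical and leave out
    all locking and waiting; they only show why a barrier of -1 never
    satisfies "!conf->barrier" while a preset of 1 ends up at 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the raid10 conf; only the counter matters. */
struct demo_conf {
	int barrier;
};

static void demo_lower_barrier(struct demo_conf *c)
{
	assert(c != NULL);
	c->barrier--;
}

/* Mirrors the "!conf->barrier" wait condition. */
static bool demo_io_may_proceed(const struct demo_conf *c)
{
	return !c->barrier;
}

/* Build a conf as left by takeover, optionally with the barrier preset. */
static struct demo_conf demo_conf_after_takeover(bool preset_barrier)
{
	struct demo_conf c = { .barrier = preset_barrier ? 1 : 0 };
	return c;
}

/* Resume lowers the barrier once; report whether waiters can continue. */
static bool demo_resume_then_check(bool preset_barrier)
{
	struct demo_conf c = demo_conf_after_takeover(preset_barrier);

	demo_lower_barrier(&c); /* the effect of quiesce(mddev, 0) */
	return demo_io_may_proceed(&c);
}
```

    Without the preset the counter lands on -1 and the wait condition
    stays false forever, which matches the hang in "D" state.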
    
    Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>

commit e91ece5590b3c728624ab57043fc7a05069c604a
Author: Chris Mason <chris.mason@oracle.com>
Date:   Mon Feb 7 19:21:48 2011 -0500

    md_make_request: don't touch the bio after calling make_request
    
    md_make_request was calling bio_sectors() for part_stat_add
    after it was calling the make_request function.  This is
    bad because the make_request function can free the bio and
    because the bi_size field can change around.
    
    The fix here was suggested by Jens Axboe.  It saves the
    sector count before the make_request call.  I hit this
    with CONFIG_DEBUG_PAGEALLOC turned on while trying to break
    his pretty fusionio card.
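
    The shape of the fix, as a sketch: read the size before handing the
    bio over and use only the saved value afterwards.  The demo_bio type
    and demo_consume callback are hypothetical; the real bio and
    accounting code are more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature bio carrying only a byte count. */
struct demo_bio {
	unsigned int size_bytes;
};

/* Stand-in for a make_request function that consumes the bio. */
static void demo_consume(struct demo_bio *bio)
{
	bio->size_bytes = 0;
}

/* Submit the bio and return the sector count to account for it. */
static unsigned int demo_submit(struct demo_bio *bio,
				void (*make_request)(struct demo_bio *))
{
	unsigned int sectors;

	assert(bio != NULL && make_request != NULL);
	sectors = bio->size_bytes >> 9; /* snapshot in 512-byte sectors */
	make_request(bio);              /* bio must not be read after this */
	return sectors;
}
```

    Even though the callback zeroes the size, the accounted value is
    still the one sampled before the call.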
    
    Cc: <stable@kernel.org>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
    Signed-off-by: NeilBrown <neilb@suse.de>

commit c6751b2bde477f56ceef67aa1d298ce44e8e2e23
Author: NeilBrown <neilb@suse.de>
Date:   Wed Feb 2 11:57:13 2011 +1100

    md: Don't allow slot_store while resync/recovery is happening.
    
    Activating a spare in an array while resync/recovery is already
    happening can lead to that spare being marked in-sync when it really
    isn't.
    So don't allow the 'slot' to be set (thus activating the device)
    while resync/recovery is happening.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 7281f8129c362436237b82c8c026494dd36479dc
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 14:30:27 2011 +1100

    md: don't clear curr_resync_completed at end of resync.
    
    There is no need to set this to zero at this point.  It will be
    set to zero by remove_and_add_spares or at the start of
    md_do_sync at the latest.
    And setting it to zero before MD_RECOVERY_RUNNING is cleared can
    make a 'zero' appear briefly in the 'sync_completed' sysfs attribute
    just as resync is finishing.
    
    So simply remove this setting to zero.
    
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit a8c42c7f476b5bb39bb3a5b32d5473b9a46cadb9
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 13:47:13 2011 +1100

    md: Don't use remove_and_add_spares to remove failed devices from a read-only array
    
    remove_and_add_spares is called in two places where the needs really
    are very different.
    remove_and_add_spares should not be called on an array which is about
    to be reshaped as some extra devices might have been manually added
    and that would remove them.  However if the array is 'read-auto',
    that will currently happen, which is bad.
    
    So in the 'ro != 0' case don't call remove_and_add_spares but simply
    remove the failed devices as the comment suggests is needed.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit fc3a08b85b7a4f6c1069e5f71f6ad40d925ff55b
Author: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Date:   Mon Jan 31 13:47:13 2011 +1100

    Add raid1->raid0 takeover support
    
    This patch introduces the raid1 to raid0 takeover operation
    in kernel space.
    
    Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
    Signed-off-by: Neil Brown <neilb@nbeee.brown>

commit f21e9ff7f77d41ceca4e1e5ee5a4efa5ad7a5e40
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 12:10:09 2011 +1100

    md: Remove the AllReserved flag for component devices.
    
    This flag is not needed and is used badly.
    
    Devices that are included in a native-metadata array are reserved
    exclusively for that array - and currently have AllReserved set.
    They all are bd_claimed for the rdev and so cannot be shared.
    
    Devices that are included in external-metadata arrays can be shared
    among multiple arrays - providing there is no overlap.
    These are bd_claimed for md in general - not for a particular rdev.
    
    When changing the amount of a device that is used in an array we need
    to check for overlap.  This currently includes a check on AllReserved,
    so even without overlap, sharing with an AllReserved device is not
    allowed.
    However the bd_claim usage already precludes sharing with these
    devices, so the test on AllReserved is not needed.  And in fact it is
    wrong.
    
    As this is the only use of AllReserved, simply remove all usage and
    definition of AllReserved.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 50da08409654e036c4c964a473567a61a654cb83
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 11:57:43 2011 +1100

    md: don't abort checking spares as soon as one cannot be added.
    
    As spares can be added manually before a reshape starts, we need to
    find them all to mark some of them as in_sync.
    
    Previously we would abort looking for spares when we found an
    unallocated spare that could not be added to the array (implying there
    was no room for new spares).  However already-added spares could be
    later in the list, so we need to keep searching.
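
    In loop terms the change amounts to using 'continue' where the scan
    used to stop.  The sketch below is a simplified model with
    hypothetical names (demo_rdev, has_slot, free_slots), not the md
    code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component device. */
struct demo_rdev {
	bool in_sync;
	bool has_slot; /* already placed in the array, e.g. added manually */
};

/* Return how many spares hold a slot after the scan. */
static int demo_collect_spares(struct demo_rdev *devs, size_t n,
			       int free_slots)
{
	int found = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct demo_rdev *d = &devs[i];

		if (d->in_sync)
			continue; /* active member, not a spare */
		if (d->has_slot) {
			found++;  /* already-added spare */
			continue;
		}
		if (free_slots == 0)
			continue; /* cannot add this one; keep scanning */
		d->has_slot = true;
		free_slots--;
		found++;
	}
	return found;
}
```

    With no free slots and an unallocated spare listed first, the
    already-added spare behind it is still found.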
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 469518a3455c79619e9231aeffeffa2e2989f738
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 11:57:43 2011 +1100

    md: fix the test for finding spares in raid5_start_reshape.
    
    As spares can be added to the array before the reshape is started,
    we need to find and count them when checking there are enough.
    The array could have been degraded, so we need to check all devices,
    not just those outside the range of devices in the array before
    the reshape.
    
    So instead of checking the index, check the In_sync flag as that
    reliably tells if the device is a spare for this purpose.
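
    The difference between the two tests shows up on a degraded array.
    The following sketch contrasts them using a hypothetical demo_dev
    struct; skipping faulty devices is an assumption added for the
    example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component device. */
struct demo_dev {
	int raid_disk; /* slot index in the array */
	bool in_sync;
	bool faulty;
};

/* Old-style test: only devices beyond the pre-reshape range count. */
static int demo_count_by_index(const struct demo_dev *devs, size_t n,
			       int raid_disks)
{
	int spares = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (devs[i].raid_disk >= raid_disks)
			spares++;
	return spares;
}

/* Flag-based test: any working device that is not in sync is a spare. */
static int demo_count_by_flag(const struct demo_dev *devs, size_t n)
{
	int spares = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!devs[i].in_sync && !devs[i].faulty)
			spares++;
	return spares;
}
```

    A spare sitting in slot 1 of a degraded three-disk array is missed
    by the index test but counted by the flag test.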
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 87a8dec91e15954f0cf86be6c21741d991d83621
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 11:57:43 2011 +1100

    md: simplify some 'if' conditionals in raid5_start_reshape.
    
    There are two consecutive 'if' statements.
    
     if (mddev->delta_disks >= 0)
          ....
     if (mddev->delta_disks > 0)
    
    The code in the second is equally valid if delta_disks == 0, and these
    two statements are the only place that 'added_devices' is used.
    
    So make them a single if statement, make added_devices a local
    variable, and re-indent it all.
    
    No functional change.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit de171cb9a52598cc023adceafc6c166112401386
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 31 11:57:42 2011 +1100

    md: revert change to raid_disks on failure.
    
    If we try to update_raid_disks and it fails, we should put
    'delta_disks' back to zero.  This is important because some code,
    such as slot_store, assumes that delta_disks has been validated.
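
    A sketch of the revert, assuming a hypothetical demo_mddev struct
    and a validation callback that returns a negative value on failure;
    the real update_raid_disks does more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array state. */
struct demo_mddev {
	int raid_disks;
	int delta_disks;
};

typedef int (*demo_check_fn)(const struct demo_mddev *);

/* Example validators: one always refuses, one always accepts. */
static int demo_reject(const struct demo_mddev *m)
{
	(void)m;
	return -1;
}

static int demo_accept(const struct demo_mddev *m)
{
	(void)m;
	return 0;
}

/* Try to change the disk count; leave delta_disks at zero on failure. */
static int demo_update_raid_disks(struct demo_mddev *m, int new_disks,
				  demo_check_fn check)
{
	int rv;

	assert(m != NULL && check != NULL);
	m->delta_disks = new_disks - m->raid_disks;
	rv = check(m);
	if (rv < 0)
		m->delta_disks = 0; /* revert so later code sees a valid value */
	return rv;
}
```

    After a rejected request delta_disks reads zero again, so callers
    that trust the field never see an unvalidated value.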
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit be2c6b1990904dbd43f3d9b90fa2c530504375cd
Author: Vivek Goyal <vgoyal@redhat.com>
Date:   Wed Jan 19 08:25:02 2011 -0700

    blkio-throttle: Avoid calling blkiocg_lookup_group() for root group
    
    o Jeff Moyer was doing some testing on a RAM backed disk and
      blkiocg_lookup_group() showed up with high overhead, right after
      memcpy(). Similarly somebody else reported that blkiocg_lookup_group()
      is eating 6% extra cpu. Looking at the code, though, I can't think why
      the overhead of this function is so high. One thing is that it is
      called with very high frequency (once for every IO).
    
    o For a lot of folks the blkio controller will be compiled in but they
      might not have actually created cgroups. Hence optimize the case of
      the root cgroup, where we can avoid calling blkiocg_lookup_group() if
      IO is happening in the root group (common case).
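
    The fast path can be pictured like this.  It is a sketch with
    hypothetical demo_* objects and a counter that records how often
    the slow lookup runs; it does not reproduce the blk-throttle code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cgroup and throttle group objects. */
struct demo_cgroup {
	int id;
};

struct demo_group {
	int id;
};

static struct demo_cgroup demo_root_cgroup = { 0 };
static struct demo_group demo_root_group = { 0 };
static struct demo_group demo_child_group = { 1 };
static int demo_lookups; /* how many slow lookups were performed */

/* Stand-in for the expensive per-IO lookup. */
static struct demo_group *demo_lookup_group(const struct demo_cgroup *cg)
{
	demo_lookups++;
	return cg->id == 0 ? &demo_root_group : &demo_child_group;
}

/* Find the group for an IO, skipping the lookup for the root cgroup. */
static struct demo_group *demo_find_group(const struct demo_cgroup *cg)
{
	assert(cg != NULL);
	if (cg == &demo_root_cgroup)
		return &demo_root_group; /* common case, no lookup needed */
	return demo_lookup_group(cg);
}
```

    IO issued from the root cgroup never reaches the lookup, while IO
    from any other cgroup still pays for it once per call.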
    
    Reported-by: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
    Acked-by: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit ba5bd520f679c450fb6efa439618703bd0956daa
Author: Vivek Goyal <vgoyal@redhat.com>
Date:   Wed Jan 19 08:25:02 2011 -0700

    cfq: rename a function to give it more appropriate name
    
    o Rename a function to give it a more appropriate name. We are
      calculating the cfq queue slice, but the function name gives the
      impression that the cfq group slice length is being calculated.
    
    Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit 68264e9d6781f7163e92c517769bb470fa43f6cd
Author: Stephen M. Cameron <StephenM.Cameron>
Date:   Wed Jan 19 08:25:02 2011 -0700

    cciss: make cciss_revalidate not loop through CISS_MAX_LUNS volumes unnecessarily.
    
    Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit a0700bdd0b0150ea445159b1dee587f1507c272f
Author: Tracey Dent <tdent48227@gmail.com>
Date:   Wed Jan 19 08:25:02 2011 -0700

    drivers/block/aoe/Makefile: replace the use of <module>-objs with <module>-y
    
    Change Makefile to use <module>-y instead of <module>-objs because -objs
    is deprecated and should now be switched, according to
    Documentation/kbuild/makefiles.txt.
    
    Signed-off-by: Tracey Dent <tdent48227@gmail.com>
    Cc: "Ed L. Cashin" <ecashin@coraid.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit ee71a968672a9951aee6014c55511007596425bc
Author: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date:   Wed Jan 19 08:25:02 2011 -0700

    loop: queue_lock NULL pointer dereference in blk_throtl_exit
    
    Performing
    $ sudo mount -o loop -o umask=0 /dev/sdb1 /mnt/
    mount: wrong fs type, bad option, bad superblock on /dev/loop0,
           missing codepage or helper program, or other error
           In some cases useful info is found in syslog - try
           dmesg | tail  or so
    
    $ sudo modprobe -r loop
    
    results in oops:
    
     BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
     IP: [<ffffffff812479d4>] do_raw_spin_lock+0x14/0x122
     Process modprobe (pid: 6189, threadinfo ffff88009a898000, task ffff880154a88000)
     Call Trace:
      [<ffffffff81486788>] _raw_spin_lock_irq+0x4a/0x51
      [<ffffffff8123404b>] ? blk_throtl_exit+0x3b/0xa0
      [<ffffffff8105b120>] ? cancel_delayed_work_sync+0xd/0xf
      [<ffffffff8123404b>] blk_throtl_exit+0x3b/0xa0
      [<ffffffff81229bc8>] blk_release_queue+0x21/0x65
      [<ffffffff8123bb06>] kobject_release+0x51/0x66
      [<ffffffff8123bab5>] ? kobject_release+0x0/0x66
      [<ffffffff8123ce1e>] kref_put+0x43/0x4d
      [<ffffffff8123ba27>] kobject_put+0x47/0x4b
      [<ffffffff8122717c>] blk_cleanup_queue+0x56/0x5b
      [<ffffffffa01c3824>] loop_exit+0x68/0x844 [loop]
      [<ffffffff8107cccc>] sys_delete_module+0x1e8/0x25b
      [<ffffffff814864c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [<ffffffff81002112>] system_call_fastpath+0x16/0x1b
    
    because of an attempt to acquire NULL queue_lock.
    I added the same lines as in blk_queue_make_request -
    `fall back to embedded per-queue lock'.
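
    The fallback can be sketched as below.  The demo_queue layout and
    the integer "lock" are hypothetical simplifications; the point is
    only that queue_lock is never left NULL once setup has run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock; a real spinlock is replaced by a flag here. */
struct demo_lock {
	int locked;
};

/* Hypothetical queue with an embedded lock and a pointer to the active one. */
struct demo_queue {
	struct demo_lock embedded_lock;
	struct demo_lock *queue_lock;
};

/* Set up the queue; fall back to the embedded per-queue lock. */
static void demo_queue_init(struct demo_queue *q, struct demo_lock *driver_lock)
{
	assert(q != NULL);
	q->embedded_lock.locked = 0;
	q->queue_lock = driver_lock;
	if (!q->queue_lock)
		q->queue_lock = &q->embedded_lock;
}

/* Teardown path that takes and releases the lock. */
static void demo_queue_exit(struct demo_queue *q)
{
	assert(q != NULL && q->queue_lock != NULL);
	q->queue_lock->locked = 1; /* take */
	q->queue_lock->locked = 0; /* release */
}
```

    A driver that passes no lock of its own ends up using the embedded
    one, so the teardown path always has something valid to acquire.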
    
    Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit 04de96c9c6981c5957aa5db39bbdc4d958d07efa
Author: Tracey Dent <tdent48227@gmail.com>
Date:   Wed Jan 19 08:25:02 2011 -0700

    drivers/block/Makefile: replace the use of <module>-objs with <module>-y
    
    Change Makefile to use <module>-y instead of <module>-objs because -objs
    is deprecated and should now be switched, according to
    Documentation/kbuild/makefiles.txt.
    
    Signed-off-by: Tracey Dent <tdent48227@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

commit 490da40d82b31c0562d3f5edb37810f492ca1c34
Author: Tao Ma <boyu.mt@taobao.com>
Date:   Wed Jan 19 10:51:44 2011 +0800

    blktrace: Don't output messages if NOTIFY isn't set.
    
    Currently, if we enable blktrace, cfq outputs too many messages to the
    trace buffer. That is fine if we don't specify any action mask.
    But if I run this:
    blktrace /dev/sdb -a issue -a complete -o - | blkparse -i -
    I only want to see 'D' and 'C', while with the following command
    dd if=/mnt/ocfs2/test of=/dev/null bs=4k count=1 iflag=direct
    
    I will get (with a 2.6.37 vanilla kernel):
      8,16   0        0     0.000000000     0  m   N cfq3805 alloced
      8,16   0        0     0.000004126     0  m   N cfq3805 insert_request
      8,16   0        0     0.000004884     0  m   N cfq3805 add_to_rr
      8,16   0        0     0.000008417     0  m   N cfq workload slice:300
      8,16   0        0     0.000009557     0  m   N cfq3805 set_active wl_prio:0 wl_type:2
      8,16   0        0     0.000010640     0  m   N cfq3805 fifo=          (null)
      8,16   0        0     0.000011193     0  m   N cfq3805 dispatch_insert
      8,16   0        0     0.000012221     0  m   N cfq3805 dispatched a request
      8,16   0        0     0.000012802     0  m   N cfq3805 activate rq, drv=1
      8,16   0        1     0.000013181  3805  D   R 114759 + 8 [dd]
      8,16   0        2     0.000164244     0  C   R 114759 + 8 [0]
      8,16   0        0     0.000167997     0  m   N cfq3805 complete rqnoidle 0
      8,16   0        0     0.000168782     0  m   N cfq3805 set_slice=100
      8,16   0        0     0.000169874     0  m   N cfq3805 arm_idle: 8 group_idle: 0
      8,16   0        0     0.000170189     0  m   N cfq schedule dispatch
      8,16   0        0     0.000397938     0  m   N cfq3805 slice expired t=0
      8,16   0        0     0.000399763     0  m   N cfq3805 sl_used=1 disp=1 charge=1 iops=0 sect=8
      8,16   0        0     0.000400227     0  m   N cfq3805 del_from_rr
      8,16   0        0     0.000400882     0  m   N cfq3805 put_queue
    
    See, there are 19 lines while I only need 2. I don't think it is
    appropriate for a user.
    
    So this patch disables all messages if BLK_TC_NOTIFY isn't set.
    Now the output for the same command will look like:
      8,16   0        1     0.000000000  4908  D   R 114759 + 8 [dd]
      8,16   0        2     0.000146827     0  C   R 114759 + 8 [0]
    
    Yes, it is what I want to see.
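
    The filter itself is a single mask test.  A rough sketch follows,
    with hypothetical DEMO_TC_* bits standing in for the blktrace action
    categories and a counter in place of the trace buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action bits; the real BLK_TC_* values differ. */
enum {
	DEMO_TC_ISSUE    = 1 << 0,
	DEMO_TC_COMPLETE = 1 << 1,
	DEMO_TC_NOTIFY   = 1 << 2
};

/* Emit one message unless the trace's action mask excludes notifications. */
static void demo_emit_message(unsigned int act_mask, unsigned int *emitted)
{
	assert(emitted != NULL);
	if (!(act_mask & DEMO_TC_NOTIFY))
		return; /* user did not ask for notify messages */
	(*emitted)++;
}
```

    A mask of issue and complete only suppresses the message, while a
    mask with every bit set (the case where no action mask is given)
    still lets it through.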
    
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Tao Ma <boyu.mt@taobao.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>