====== spidernet-ipfrag-nfs.diff ======
Subject: spidernet: Fix problem sending IP fragments
From: Norbert Eicker

I found out that the spidernet driver is unable to send fragmented IP frames.

Let me just recall the basic structure of a "normal" UDP/IP/Ethernet frame (which actually works):
- It starts with the Ethernet header (dest MAC, src MAC, etc.)
- The next part is occupied by the IP header (version info, length of packet, id=0, fragment offset=0, checksum, from/to address, etc.)
- Then comes the UDP header (src/dest port, length, checksum)
- The actual payload
- The Ethernet checksum

Now what's different for an IP fragment:
- The IP header has id set to some value (the same for all fragments), the offset is set appropriately (i.e. 0 for the first fragment, following ones according to the size of the other fragments), and the size is the length of the frame.
- The UDP header is unchanged, i.e. its length field covers the full UDP datagram, not just the part within the actual frame. But this is only true within the first frame: all following frames don't have a valid UDP header at all.

The spidernet silicon seems to be quite intelligent: it is able to compute (IP/UDP/Ethernet) checksums on the fly and tests whether frames conform to RFC -- at least whether complete frames do. But IP fragments are different, as explained above: for IP fragments containing part of a UDP datagram, it sees incompatible lengths in the IP and UDP headers of the first frame and thus skips that frame -- but the content *is* correct for an IP fragment. For all following frames it (most probably) finds no valid UDP header at all -- but this, too, *is* correct for IP fragments.

The Linux IP stack is clever on this point. It expects the spidernet to calculate the checksum (since the module claims to be able to do so) and marks the skbs for "normal" frames accordingly (ip_summed set to CHECKSUM_HW). But for IP fragments it does not expect the driver to be capable of handling the frames appropriately.
Thus all checksums are already computed, which is also flagged within the skb (ip_summed set to CHECKSUM_NONE). Unfortunately the spidernet driver ignores these hints and tries to send the IP fragments of UDP datagrams as normal UDP/IP frames. Since they have a different structure, the silicon detects them to be not "well-formed" and skips them.

The following one-liner against 2.6.21-rc2 changes this behavior: if the IP stack claims to have done the checksumming already, the driver should not try to checksum (and analyze) the frame but send it as is.

Signed-off-by: Norbert Eicker
Signed-off-by: Linas Vepstas
Signed-off-by: Arnd Bergmann
---
diffstat:
 spider_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== ipmi_si-check-devicetree-3.diff ======
Subject: ipmi: add support for powerpc of_platform_driver
From: Christian Krafft

This patch adds of_platform_driver support to the ipmi_si module. When the module is loaded, the driver is registered with of_platform. The driver is probed for all devices of type "ipmi". It supports devices with the compatible settings "ipmi-kcs", "ipmi-smic" and "ipmi-bt"; only ipmi-kcs could be tested.

Signed-off-by: Christian Krafft
Acked-by: Heiko J Schick
Signed-off-by: Corey Minyard
Signed-off-by: Arnd Bergmann
---
diffstat:
 ipmi_si_intf.c | 108 +++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

====== ipmi_add_module_device_table.diff ======
Subject: ipmi: add module_device_table to ipmi_si
From: Christian Krafft

This patch adds MODULE_DEVICE_TABLE to ipmi_si, so that the module can be autoloaded by the kernel when a matching device is found.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 ipmi_si_intf.c | 2 ++
 1 file changed, 2 insertions(+)

====== spu-sched-tick-workqueue-is-rearming.diff ======
Subject: use cancel_rearming_delayed_workqueue when stopping spu contexts
From: Christoph Hellwig

The scheduler workqueue may rearm itself and deadlock when we try to stop it.
Put a flag in place to skip the work if we're tearing down the context.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 sched.c | 23 +++++++++++++++++++++--
 spufs.h | 2 +-
 2 files changed, 22 insertions(+), 3 deletions(-)

====== cbe_thermal-add-reg_to_temp.diff ======
Subject: cbe_thermal: clean up computation of temperature
From: Christian Krafft

This patch introduces a little function for transforming register values into a temperature.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 cbe_thermal.c | 26 +++++++++----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

====== cbe_thermal-throttling-attributes.diff ======
Subject: cbe_thermal: add throttling attributes to cpu and spu nodes
From: Christian Krafft

This patch adds some attributes to the cpu and spu nodes:
 /sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_begin
 /sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_end
 /sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_full_stop

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 cbe_thermal.c | 155 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 154 insertions(+), 1 deletion(-)

====== cell-add-node-to-cpu-4.diff ======
Subject: cell: add cbe_node_to_cpu function
From: Christian Krafft

This patch adds code to deal with the conversion of logical cpus to cbe nodes. It removes code that assumed there were two logical CPUs per CBE.
Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/oprofile/op_model_cell.c  |  1
 arch/powerpc/platforms/cell/cbe_regs.c | 53 +++++++++----
 arch/powerpc/platforms/cell/cbe_regs.h |  5 +
 include/asm-powerpc/cell-pmu.h         |  5 -
 4 files changed, 45 insertions(+), 19 deletions(-)

====== cbe_cpufreq-use-pmi-3.diff ======
Subject: cell: use pmi in cpufreq driver
From: Christian Krafft

The new PMI driver was added in order to support cpufreq on blades that require the frequency to be controlled by the service processor, so use it on those.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 cbe_cpufreq.c | 81 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

====== powerpc-add-of_remap.diff ======
Subject: powerpc: add of_iomap function
From: Christian Krafft

The of_iomap function maps memory for a given device_node and returns a pointer to that memory. This pattern is used in several places, so it makes sense to turn it into a separate function.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/sysdev/pmi.c  | 19 ++-----------------
 include/asm-powerpc/prom.h | 11 +++++++++++
 2 files changed, 13 insertions(+), 17 deletions(-)

====== cell-support-new-device-tree-layout.diff ======
Subject: cell: add support for proper device-tree
From: Christian Krafft

This patch adds support for a proper device tree. A proper device tree on cell contains "be" nodes for each CBE, containing nodes for its SPEs and all the other special devices on it. Of course the old-style device tree is still supported.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 cbe_regs.c | 119 ++++++++++++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 31 deletions(-)

====== spufs-fix-ctx-lifetimes.diff ======
Subject: Clear mapping pointers after last close
From: Christoph Hellwig

Make sure the pointers to the various mappings are cleared once the last user has stopped using them.
This avoids accessing freed memory when tearing down the gang directory, as well as allowing pte invalidations to be optimized away when no one uses these mappings.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 context.c |   1
 file.c    | 146 +++++++++++++++++++++++++++++++++++++++---
 inode.c   |   1
 spufs.h   |  12 ++-
 4 files changed, 147 insertions(+), 13 deletions(-)

====== spufs-ensure-preempted-threads-are-on-the-runqueue.diff ======
Subject: spu sched: ensure preempted threads are put back on the runqueue
From: Christoph Hellwig

To not lose an spu thread, we need to make sure it always gets put back on the runqueue.

Signed-off-by: Christoph Hellwig
Acked-by: Jeremy Kerr
Signed-off-by: Arnd Bergmann
---
diffstat:
 sched.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

====== spufs-add-missing-wakeup-in-find_victim.diff ======
Subject: spu sched: ensure preempted threads are put back on the runqueue, part2
From: Christoph Hellwig

To not lose an spu thread, we need to make sure it always gets put back on the runqueue -- in find_victim as well as in the scheduler tick, as done in the previous patch.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 sched.c | 6 ++++++
 1 file changed, 6 insertions(+)

====== spufs-use-barriers-for-set_bit.diff ======
Subject: spufs: add memory barriers after set_bit
From: Arnd Bergmann

set_bit does not guarantee ordering on powerpc, so using it for communication between threads requires explicit mb() calls.

Signed-off-by: Arnd Bergmann
---
diffstat:
 sched.c | 3 +++
 1 file changed, 3 insertions(+)

====== spusched-remove-from-runqueue-early.diff ======
Subject: remove woken threads from the runqueue early
From: Christoph Hellwig

A single context should only be woken once, and we should not have more wakeups for a given priority than the number of contexts on that runqueue position. Also add some asserts to trap future problems in this area more easily.
Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 context.c |  2 +
 sched.c   | 44 ++++++++++++++++--------------------
 2 files changed, 19 insertions(+), 27 deletions(-)

====== spufs-always-release-mapping-lock.diff ======
Subject: fix missing unlock in spufs_signal1_release
From: Christoph Hellwig

Add a missing spin_unlock in spufs_signal1_release.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 file.c | 1 +
 1 file changed, 1 insertion(+)

====== spu_base-move-spu-init-channel-out-of-spu-mutex.diff ======
Subject: spu_base: move spu_init_channels out of spu_mutex
From: Christoph Hellwig

There is no reason to execute spu_init_channels under spu_mutex -- after the spu has been taken off the freelist, it's ours.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 spu_base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

====== cell-rtas-ptcal.diff ======
Subject: cell: enable RTAS-based PTCAL for Cell XDR memory
From: Jeremy Kerr

Enable Periodic Recalibration (PTCAL) support for Cell XDR memory, using the new ibm,cbe-start-ptcal and ibm,cbe-stop-ptcal RTAS calls. Tested on QS20 and QS21 (by Thomas Huth). It seems that SLOF has problems disabling PTCAL, at least on QS20; this patch should only be used once these problems have been addressed.

Signed-off-by: Jeremy Kerr
Signed-off-by: Arnd Bergmann
--
Update: Updated with Michael's feedback, expanded comment.
---
diffstat:
 ras.c | 160 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 160 insertions(+)

====== axon-ram-2.diff ======
Subject: cell: driver for DDR2 memory on AXON
From: Maxim Shchetynin

Signed-off-by: Arnd Bergmann
---
diffstat:
 Kconfig          |   7
 sysdev/Makefile  |   1
 sysdev/axonram.c | 498 +++++++++++++++++++++++++++++++++++
 3 files changed, 506 insertions(+)

====== axon-ram-block-config-fix.diff ======
Subject: axonram bugfix
From: Jens Osterkamp

unlink_gendisk is already called from del_gendisk, so we don't need to call it here.
Signed-off-by: Jens Osterkamp
Signed-off-by: Arnd Bergmann
---
diffstat:
 axonram.c | 1 -
 1 file changed, 1 deletion(-)

====== spufs-export-expand_stack.diff ======
Subject: spufs: export expand_stack
From: Arnd Bergmann

An SPU can create page faults on the stack, which we need to handle from a loadable module, so export the expand_stack function used for this.

Signed-off-by: Arnd Bergmann
---
diffstat:
 mmap.c | 1 +
 1 file changed, 1 insertion(+)

====== spufs-pagefault-rework.diff ======
Subject: spufs: make spu page faults not block scheduling
From: Arnd Bergmann

Until now, we have always entered the spu page fault handler with a mutex for the spu context held. This has multiple bad side effects:
- it becomes impossible to suspend the context during page faults
- if an spu program attempts to access its own mmio areas through DMA, we get an immediate livelock when the nopage function tries to acquire the same mutex

This patch makes the page fault logic operate on a struct spu_context instead of a struct spu, and moves it from spu_base.c to a new file fault.c inside of spufs. We now also need to copy the dar and dsisr contents of the last fault into the saved context, to have them accessible in case we schedule out the context before activating the page fault handler.

Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c          | 103 -----
 arch/powerpc/platforms/cell/spufs/Makefile      |   1
 arch/powerpc/platforms/cell/spufs/backing_ops.c |   6
 arch/powerpc/platforms/cell/spufs/fault.c       | 192 ++++++++++
 arch/powerpc/platforms/cell/spufs/hw_ops.c      |   9
 arch/powerpc/platforms/cell/spufs/run.c         |  28 -
 arch/powerpc/platforms/cell/spufs/spufs.h       |   4
 arch/powerpc/platforms/cell/spufs/switch.c      |   8
 include/asm-powerpc/mmu.h                       |   1
 include/asm-powerpc/spu_csa.h                   |   1
 10 files changed, 224 insertions(+), 129 deletions(-)

====== export-force-siginfo.diff ======
Subject: export force_sig_info
From: Jeremy Kerr

Export force_sig_info for use by modules.
This is required to allow spufs to provide siginfo data for SPE-generated signals.

Signed-off-by: Jeremy Kerr
---
diffstat:
 signal.c | 1 +
 1 file changed, 1 insertion(+)

====== spufs-provide-siginfo-for-SPE-faults.diff ======
Subject: spufs: provide siginfo for SPE faults
From: Jeremy Kerr

This change populates a siginfo struct for SPE application exceptions (i.e. invalid DMAs and illegal instructions). Tested on an IBM Cell Blade.

Signed-off-by: Jeremy Kerr
Signed-off-by: Arnd Bergmann
---
diffstat:
 fault.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

====== cell-be_info-2.diff ======
Subject: cell: add per BE structure with info about its SPUs
From: Andre Detsch

Add a spufs-global "be_info" array. Each entry contains information about one BE node, namely:
 * the list of spus (both free and busy spus are on this list);
 * the list of free spus (replacing the static spu_list from spu_base.c);
 * the number of spus;
 * the number of reserved (non-schedulable) spus.

The SPE affinity implementation actually requires only access to one spu per BE node (since it implements its own pointer to walk through the other spus of the ring) and the number of schedulable spus (n_spus - non_sched_spus). However, having this more general structure can be useful for other functionality, concentrating per-BE statistics and data.

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c    | 26 ++++++----
 arch/powerpc/platforms/cell/spufs/sched.c |  4 +
 include/asm-powerpc/spu.h                 | 10 +++
 3 files changed, 31 insertions(+), 9 deletions(-)

====== cell-spu_indexing-2.diff ======
Subject: cell: add vicinity information on spus
From: Andre Detsch

This patch adds affinity data to each spu instance. A doubly linked list is created, meant to connect the spus in the physical order in which they are placed in the BE. SPUs near to memory should be marked as having memory affinity.
Adjustment of the fields according to firmware properties is done in separate patches, one for CPBW and one for Malta (the Malta patch is still under testing).

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c | 2 ++
 include/asm-powerpc/spu.h              | 3 +++
 2 files changed, 5 insertions(+)

====== cell-spu_indexing_QS20-2.diff ======
Subject: cell: add hardcoded spu vicinity information for QS20
From: Andre Detsch

This patch allows the use of spu affinity on QS20, whose original firmware does not provide affinity information. This is done through two hardcoded arrays, and by reading the reg property from each spu.

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 spu_base.c | 55 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

====== spufs-affinity_create-4.diff ======
Subject: spufs: extension of spu_create to support affinity definition
From: Andre Detsch

This patch adds support for additional flags to spu_create, which relate to establishing affinity between contexts and between contexts and memory. A fourth, optional parameter is supported. This parameter represents an affinity neighbor of the context being created, and is used when defining SPU-SPU affinity. Affinity is represented as a doubly linked list of spu_contexts.
Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/platforms/cell/spu_syscalls.c   |  17 +
 arch/powerpc/platforms/cell/spufs/context.c  |   1
 arch/powerpc/platforms/cell/spufs/gang.c     |   4
 arch/powerpc/platforms/cell/spufs/inode.c    | 132 +++++++++-
 arch/powerpc/platforms/cell/spufs/spufs.h    |  16 +
 arch/powerpc/platforms/cell/spufs/syscalls.c |  32 ++
 include/asm-powerpc/spu.h                    |   8
 include/linux/syscalls.h                     |   2
 8 files changed, 195 insertions(+), 17 deletions(-)

====== spufs-affinity_placement-3.diff ======
Subject: cell: add placement computation for scheduling of affinity contexts
From: Andre Detsch

This patch provides the spu affinity placement logic for the spufs scheduler. Each time a gang is going to be scheduled, the placement of a reference context is determined. The placement of all other contexts with affinity in the gang is derived from this reference context's location and from a precomputed displacement offset.

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 gang.c  |   4 -
 sched.c | 134 ++++++++++++++++++++++++++++++++++++++++
 spufs.h |   6 +
 3 files changed, 143 insertions(+), 1 deletion(-)

====== spufs-affinity_schedulling-2.diff ======
Subject: spufs: integration of SPE affinity with the scheduler
From: Andre Detsch

This patch makes the scheduler honor the affinity information of each context being scheduled. If a context has no affinity information, behaviour is unchanged. If there is affinity information, the context is scheduled to run on the exact spu recommended by the affinity placement algorithm.
Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 spu_base.c    | 19 +++++++++++++++++++
 spufs/sched.c |  4 ++++
 2 files changed, 23 insertions(+)

====== cell-spu_indexing_FW_vicinity-1.diff ======
Subject: cell: indexing of SPUs based on firmware vicinity properties
From: Andre Detsch

This patch links spus according to their physical position, using information provided by the firmware through a special "vicinity" device-tree property. This property is present in the current version of the Malta firmware.

Example of vicinity properties for a node on Malta:

 Node:       vicinity property contains phandles of:
 spe@0       [ spe@100000, mic-tm@50a000 ]
 spe@100000  [ spe@0,      spe@200000 ]
 spe@200000  [ spe@100000, spe@300000 ]
 spe@300000  [ spe@200000, bif0@512000 ]
 spe@80000   [ spe@180000, mic-tm@50a000 ]
 spe@180000  [ spe@80000,  spe@280000 ]
 spe@280000  [ spe@180000, spe@380000 ]
 spe@380000  [ spe@280000, bif0@512000 ]

Only the spe@* nodes have a vicinity property (e.g. bif0@512000 and mic-tm@50a000 do not have it).

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 spu_base.c | 90 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 89 insertions(+), 1 deletion(-)

====== spufs-affinity-respecting-numa-properties.diff ======
Subject: spufs: affinity now respecting numa properties
From: Andre Detsch

Contexts with affinity are now placed honoring numa properties, the same way contexts without affinity already are.

Signed-off-by: Andre Detsch
Signed-off-by: Arnd Bergmann
---
diffstat:
 sched.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

====== re-cell-oprofile-spu-profiling-updated-patch-3.diff ======
Subject: Add support to OProfile for profiling Cell BE SPUs
From: Maynard Johnson

This patch updates the existing arch/powerpc/oprofile/op_model_cell.c to add the SPU profiling capabilities. In addition, a 'cell' subdirectory was added to arch/powerpc/oprofile to hold the Cell-specific SPU profiling code.
Signed-off-by: Carl Love
Signed-off-by: Maynard Johnson
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/configs/cell_defconfig         |    3
 arch/powerpc/kernel/time.c                  |    1
 arch/powerpc/oprofile/Kconfig               |    7
 arch/powerpc/oprofile/Makefile              |    3
 arch/powerpc/oprofile/cell/pr_util.h        |   90 +
 arch/powerpc/oprofile/cell/spu_profiler.c   |  220 ++++
 arch/powerpc/oprofile/cell/spu_task_sync.c  |  487 +++++++++
 arch/powerpc/oprofile/cell/vma_map.c        |  279 +++++
 arch/powerpc/oprofile/common.c              |   49
 arch/powerpc/oprofile/op_model_cell.c       |  505 +++++++++-
 arch/powerpc/oprofile/op_model_power4.c     |   11
 arch/powerpc/oprofile/op_model_rs64.c       |   10
 arch/powerpc/platforms/cell/spufs/context.c |   20
 arch/powerpc/platforms/cell/spufs/sched.c   |    8
 arch/powerpc/platforms/cell/spufs/spufs.h   |    4
 drivers/oprofile/buffer_sync.c              |    1
 drivers/oprofile/event_buffer.h             |   20
 drivers/oprofile/oprof.c                    |   26
 include/asm-powerpc/oprofile_impl.h         |   10
 include/asm-powerpc/spu.h                   |   15
 include/linux/dcookies.h                    |    1
 include/linux/elf-em.h                      |    3
 include/linux/oprofile.h                    |   38
 kernel/hrtimer.c                            |    1
 24 files changed, 1723 insertions(+), 89 deletions(-)

====== oprofile-spu-cleanup.diff ======
Subject: cleanup spu oprofile code
From: Arnd Bergmann

This cleans up some of the new oprofile code. The changes are mostly cosmetic, like the way multi-line comments are formatted. The most significant change is a simplification of the context-switch record format. It does mean the oprofile report tool needs to be adapted, but I'm sure that it pays off in the end.
Signed-off-by: Arnd Bergmann
---
diffstat:
 cell/spu_task_sync.c |  89 ++++---------
 op_model_cell.c      | 204 +++++++++++++++++--------------
 2 files changed, 147 insertions(+), 146 deletions(-)

====== miscellaneous-fixes-for-spu-profiling-code.diff ======
Subject: Miscellaneous fixes for SPU profiling code
From: Maynard Johnson

After applying the "cleanup spu oprofile code" patch posted by Arnd Bergmann on Feb 26, 2007, I found a few issues that required fixing up:
 - Bug fix: initialize retval in spu_task_sync.c, line 95; otherwise this function returns non-zero and OProfile fails.
 - Remove unused code in include/linux/oprofile.h.
 - Compile warnings: initialize offset and spu_cookie at lines 283 and 284 in spu_task_sync.c.

Additionally, in a separate email, Arnd pointed out a bug in spu_task_sync.c:process_context_switch, where we were ignoring invalid values in the dcookies returned from get_exec_dcookie_and_offset. This is fixed in this patch, so that we now fail with ENOENT if either cookie is invalid.

Signed-off-by: Maynard Johnson
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/oprofile/cell/spu_task_sync.c | 10 +++-
 include/linux/oprofile.h                   | 23 ++++------
 2 files changed, 17 insertions(+), 16 deletions(-)

====== re-add-support-to-oprofile-for-profiling-cell-be-spus-update-2.diff ======
Subject: Enable SPU switch notification to detect currently active SPU tasks.
From: Maynard Johnson

This patch extends spu_switch_event_register so that the caller is also notified of currently active SPU tasks. It also exports spu_switch_event_register and spu_switch_event_unregister.
Signed-off-by: Maynard Johnson
Signed-off-by: Carl Love
Signed-off-by: Arnd Bergmann
---
diffstat:
 run.c   | 16 ++++++++++++++--
 sched.c | 30 ++++++++++++++++++++++++++++--
 spufs.h |  2 ++
 3 files changed, 44 insertions(+), 4 deletions(-)

====== cell-oprofile-compile-fix.diff ======
Subject: Fix oprofile compilation
From: Christoph Hellwig

Fix compilation for CONFIG_OPROFILE=y, CONFIG_OPROFILE_CELL=n on cell.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
diffstat:
 common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== this-is-a-hack-to-get_unmapped_area-to-make-the-spe-64k-code-work.diff ======
Subject: This is a hack to get_unmapped_area to make the SPE 64K code work.
From: Benjamin Herrenschmidt

(Though it might prove to not have nasty side effects ...)

The basic idea is that if the filesystem's get_unmapped_area was used, we skip the hugepage check. That assumes that the only filesystems that provide a g_u_a callback are either hugetlbfs itself, or filesystems that have arch-specific code that "knows" already not to collide with hugetlbfs.

A proper fix will be done later, basically by removing the hugetlbfs hacks completely from get_unmapped_area and calling down to the mm and/or the filesystem g_u_a implementations for MAP_FIXED as well.
(Note that this still relies on the fact that filesystems providing a g_u_a "know" how to return areas that don't collide with hugetlbfs, so the base assumption is the same as in this hack.)

Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Arnd Bergmann
---
diffstat:
 mm/mmap.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

====== powerpc-introduce-address-space-slices.diff ======
Subject: powerpc: Introduce address space "slices"
From: Benjamin Herrenschmidt

The basic issue is to be able to do what hugetlbfs does, but with different page sizes for some other special filesystems. More specifically, my needs are:

 - Huge pages

 - SPE local store mappings using 64K pages on a 4K base page size kernel on Cell

 - Some special 4K segments in 64K-page kernels for mapping a dodgy species of powerpc-specific infiniband hardware that requires 4K MMU mappings for various reasons I won't explain here.

The main issues are:

 - To maintain/keep track of the page size per "segment" (as we can only have one page size per segment on powerpc, which are 256MB divisions of the address space).

 - To make sure special mappings stay within their allotted "segments" (including MAP_FIXED crap).

 - To make sure everybody else doesn't mmap/brk/grow_stack into a "segment" that is used for a special mapping.

Some of the necessary mechanisms to handle that were present in the hugetlbfs code, but mostly in ways not suitable for anything else. The patch addresses these in various ways described quickly below, hijacking some of the existing hugetlbfs callbacks.

The ideal solution requires some changes to the generic get_unmapped_area(), among others, to get rid of the hugetlbfs hacks in there, and instead make sure that the fs and mm get_unmapped_area are also called for MAP_FIXED. We might also need to add an mm callback to validate a mapping. I intend to do those changes separately and then adapt this work to use them.
So what is a slice? Well, I reused the mechanism formerly used by our hugetlbfs implementation, which divides the address space into "meta-segments", which I called "slices". The division is done using 256MB slices below 4G, and 1T slices above. Thus the address space is currently divided into 16 "low" slices and 16 "high" slices. (Special case: high slice 0 is the area between 4G and 1T.)

Doing so significantly simplifies the tracking of segments and avoids having to keep track of all the 256MB segments in the address space. While I used the "concepts" of hugetlbfs, I mostly re-implemented everything in a more generic way and "ported" hugetlbfs to it.

Slices can have an associated page size, which is encoded in the mmu context and used by the SLB miss handler to set the segment sizes. The hash code currently doesn't care; it has a specific check for hugepages, though I might add a mechanism to provide per-slice hash mapping functions in the future.

The slice code provides a pair of "generic" get_unmapped_area() functions (bottom-up and top-down) that should work with any slice size. There is some trickiness here, so I would appreciate people having a look at the implementation of these and letting me know if I got something wrong.
Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/Kconfig                   |   5
 arch/powerpc/kernel/asm-offsets.c      |  16
 arch/powerpc/mm/Makefile               |   1
 arch/powerpc/mm/hash_utils_64.c        | 124 +++---
 arch/powerpc/mm/hugetlbpage.c          | 528 ---------------------------
 arch/powerpc/mm/mmu_context_64.c       |  10
 arch/powerpc/mm/slb.c                  |  11
 arch/powerpc/mm/slb_low.S              |  54 +-
 arch/powerpc/mm/slice.c                | 630 +++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/cell/spu_base.c |   9
 include/asm-powerpc/mmu.h              |  12
 include/asm-powerpc/paca.h             |   2
 include/asm-powerpc/page_64.h          |  87 ++--
 13 files changed, 827 insertions(+), 662 deletions(-)

====== powerpc-add-ability-to-4k-kernel-to-hash-in-64k-pages.diff ======
Subject: powerpc: Add ability to 4K kernel to hash in 64K pages
From: Benjamin Herrenschmidt

This patch adds the ability for a kernel compiled with a 4K page size to have special slices containing 64K pages, and to hash the right type of hash PTEs.
Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/Kconfig              |  6 ++++++
 arch/powerpc/mm/hash_low_64.S     |  5 ++++-
 arch/powerpc/mm/hash_utils_64.c   | 36 +++++++++++++++++++-------------
 arch/powerpc/mm/tlb_64.c          | 12 +++++++++---
 include/asm-powerpc/pgtable-4k.h  |  6 +++++-
 include/asm-powerpc/pgtable-64k.h |  7 ++++++-
 6 files changed, 53 insertions(+), 19 deletions(-)

====== powerpc-spufs-support-for-64k-ls-mappings-on-4k-kernels.diff ======
Subject: powerpc: spufs support for 64K LS mappings on 4K kernels
From: Benjamin Herrenschmidt

This patch adds an option to spufs, when the kernel is configured for 4K pages, to give it the ability to use 64K pages for SPE local store mappings. Currently, we are optimistic and try order-4 allocations when creating contexts. If that fails, the code will fall back to 4K automatically.
Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Arnd Bergmann
---
diffstat:
 arch/powerpc/platforms/cell/Kconfig             |  15
 arch/powerpc/platforms/cell/spufs/Makefile      |   2
 arch/powerpc/platforms/cell/spufs/context.c     |   4
 arch/powerpc/platforms/cell/spufs/file.c        |  79 +++-
 arch/powerpc/platforms/cell/spufs/lscsa_alloc.c | 181 ++++++++++
 arch/powerpc/platforms/cell/spufs/switch.c      |  28 -
 include/asm-powerpc/spu_csa.h                   |  10
 7 files changed, 283 insertions(+), 36 deletions(-)

====== allow-spufs-to-build-as-a-module-with-slices-enabled.diff ======
Subject: Allow spufs to build as a module with slices enabled
From: Michael Ellerman

The slice code is missing some exports needed to allow spufs to build as a module. Add them.

Signed-off-by: Michael Ellerman
Signed-off-by: Arnd Bergmann
---
 MODPOST 209 modules
 WARNING: ".get_slice_psize" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!
 WARNING: ".slice_get_unmapped_area" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!
---
diffstat:
 slice.c | 3 +++
 1 file changed, 3 insertions(+)

====== 64k-ls-mappings-fix.diff ======
Subject: fix ls store access with 64k mappings
From: Benjamin Herrenschmidt

This is also part of the latest patch posted to cbe-oss-dev.
Signed-off-by: Arnd Bergmann
---
diffstat:
 hash_utils_64.c | 9 +++++++++
 1 file changed, 9 insertions(+)

====== ipv6-round-robin-stub.diff ======
Subject: Patch ported from LTC bugzilla report 31558
From: sri@us.ibm.com

We saw similar stack traces with RHEL5 on zSeries (bug #28338) and iSeries (bug #29263). There definitely seem to be some bugs in the ipv6 fib insertion code. Redhat decided to work around this bug by disabling round-robin routing for ipv6 until the fib management code is fixed. The following patch does this.

Acked-by: Linas Vepstas
Signed-off-by: Arnd Bergmann
---
diffstat:
 net/ipv6/route.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

====== mm-fix-alloc_bootmem-on-nodes-without-mem.diff ======
Subject: mm: enables booting a NUMA system where some nodes have no memory
From: Christian Krafft

When booting a NUMA system with nodes that have no memory (eg by limiting memory), alloc_bootmem_core tried to find pages in an uninitialized bootmem_map, causing a null pointer access. This fix adds a check so that NULL is returned instead. That enables the caller (alloc_bootmem_nopanic) to allocate memory on other nodes without a panic.

Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 bootmem.c | 4 ++++
 1 file changed, 4 insertions(+)

====== mm-fix-alloc_bootmem-call-after-bootmem-freed-2.diff ======
Subject: mm: fix call to alloc_bootmem after bootmem has been freed
From: Christian Krafft

In some cases it might happen that alloc_bootmem is called after the bootmem pages have already been freed. This is because the system state is still SYSTEM_BOOTING after bootmem has been freed.
Signed-off-by: Christian Krafft
Signed-off-by: Arnd Bergmann
---
diffstat:
 page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== early_pfn_in_nid-workaround-2.diff ======
Subject: early_pfn_in_nid() called when not early
From: Arnd Bergmann

After a lot of debugging in spufs, I found that a crash we encountered on Cell was actually caused by a change in the memory management. The patch that caused it is archived in http://lkml.org/lkml/2006/11/1/43, and it has been discussed back and forth, but I fear that the current version may be broken for all setups that do memory hotplug with sparsemem and NUMA, at least on powerpc.

What happens exactly is that the spufs code tries to register the memory area owned by the SPU as hotplug memory in order to get page structs (we probably shouldn't do it that way, but that's a separate discussion). memmap_init_zone now calls early_pfn_valid() and early_pfn_in_nid() in order to determine whether the page struct should be initialized. This is wrong for two reasons:

 - early_pfn_in_nid checks the early_node_map variable to determine which node the hot-plugged memory belongs to. However, the new memory never was part of the early_node_map to start with, so it incorrectly returns node zero and then fails to initialize the page struct if we were trying to add it to a nonzero node. This is probably not a problem for pseries, but it is for cell.

 - both early_pfn_{in,to}_nid and early_node_map are in the __init section and may already have been freed at the time we are calling memmap_init_zone().

The patch below is not a suggested fix that I want to get into mainline (checking slab_is_available is wrong here), but it is a quick fix that you should apply if you want to run a recent (post-2.6.18) kernel on the IBM QS20 blade. I'm sorry for not having reported this earlier, but we were always trying to find the problem in my own code...
Signed-off-by: Arnd Bergmann
---
diffstat:
 page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)