GIT 5765a57714b0a7bb52ea3787ff74d60a9e0e9952 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6.git#test commit d50f5c5ca0c3426669fbe11ad4d5708d333eb9fb Author: Andreas Schwab Date: Fri Jan 13 23:46:38 2006 +0100 [IA64] build broken for ia64 simserial.c TTY layer buffering revamp broke ia64 in commit 33f0f88f1c51ae5c2d593d26960c760ea154c2e2 CC arch/ia64/hp/sim/simserial.o arch/ia64/hp/sim/simserial.c: In function `receive_chars': arch/ia64/hp/sim/simserial.c:170: error: structure has no member named `flip' ... and so on ... make[1]: *** [arch/ia64/hp/sim/simserial.o] Error 1 Patch from Andreas Schwab. Signed-off-by: Tony Luck commit d3ef1f5aafcf7a4129eb2078c70bc9e577bc3af1 Author: Zhang Yanmin Date: Fri Jan 13 14:45:21 2006 -0800 [IA64] prevent accidental modification of args in jprobe handler When jprobe is hit, the function parameters of the original function should be saved before jprobe handler is executed, and restored it after jprobe handler is executed, because jprobe handler might change the register values due to tail call optimization by the gcc. Signed-off-by: Zhang Yanmin Signed-off-by: Anil S Keshavamurthy Signed-off-by: Tony Luck commit e026cca0f2c09c4c28c902db6384fd8a412671d6 Author: Keith Owens Date: Fri Jan 6 10:36:06 2006 +1100 [IA64] Add hotplug cpu to salinfo.c, replace semaphore with mutex Add hotplug cpu support to salinfo.c. The cpu_event field is a cpumask so use the cpu_* macros consistently, replacing the existing mixture of cpu_* and *_bit macros. Instead of counting the number of outstanding events in a semaphore and trying to track that count over user space context, interrupt context, non-maskable interrupt context and cpu hotplug, replace the semaphore with a test for "any bits set" combined with a mutex. Modify the locking to make the test for "work to do" an atomic operation. Signed-off-by: Keith Owens Signed-off-by: Tony Luck commit 15029285dc977a392e74eacb7625984b71d4f605 Author: Jason Uhlenkott Date: Fri Dec 30 02:27:01 2005 -0800 [IA64] Handle debug traps in fsys mode We need to handle debug traps in fsys mode non-fatally. They can happen now that we have fsyscalls which contain probe instructions. Signed-off-by: Jason Uhlenkott Signed-off-by: Tony Luck commit 6d6e420005f3753392b608a614eee8475bdc16f7 Author: Prarit Bhargava Date: Fri Dec 23 13:33:25 2005 -0500 [IA64-SGI] Fix sn_flush_device_kernel & spinlock initialization This patch separates the sn_flush_device_list struct into kernel and common (both kernel and PROM accessible) structures. As it was, if the size of a spinlock_t changed (due to additional CONFIG options, etc.) the sal call which populated the sn_flush_device_list structs would erroneously write data (and cause memory corruption and/or a panic). This patch does the following: 1. Removes sn_flush_device_list and adds sn_flush_device_common and sn_flush_device_kernel. 2. Adds a new SAL call to populate a sn_flush_device_common struct per device, not per widget as previously done. 3. Correctly initializes each device's sn_flush_device_kernel spinlock_t struct (before it was only doing each widget's first device). Signed-off-by: Prarit Bhargava Signed-off-by: Tony Luck commit cfbb1426bd76c4ba6ec4491c8df2a5dd3d984750 Author: Jack Steiner Date: Thu Dec 22 13:45:41 2005 -0600 [IA64] Hole in IA64 TLB flushing from system threads I originally thought this was an bug only in the SN code, but I think I also see a hole in the generic IA64 tlb code. (Separate patch was sent for the SN problem). It looks like there is a bug in the TLB flushing code. During context switch, kernel threads (kswapd, for example) inherit the mm of the task that was previously running on the cpu. Normally, this is ok because the previous context is still loaded into the RR registers. However, if the owner of the mm migrates to another cpu, changes it's context number, and references a page before kswapd issues a tlb_purge for that same page, the purge will be done with a stale context number (& RR registers). Signed-off-by: Tony Luck commit 17e8ce0e9417eee1f57f9b3d4aad168425e043c3 Author: Russ Anderson Date: Fri Dec 16 17:19:01 2005 -0600 [IA64-SGI] Altix BTE error handling fixes Altix (shub2) pushes the BTE clean-up into SAL. This patch correctly interfaces with the now implemented SAL call. It also fixes a bug when delaying clean-up to allow busy BTEs to complete (or error out). Signed-off-by: Russ Anderson Signed-off-by: Tony Luck commit 8a4b7b6f187f2967bff222e8c3758ab47efdb14f Author: Francois Wellenrieter Date: Fri Jan 13 14:01:01 2006 -0800 [IA64] Fix conversion of pal_min_state physical address On return from INIT handler we must convert the address of the minstate area from a kernel virtual uncached address (0xC...) to physical uncached (0x8...). A typo (or thinko?) in the code converted to physical cached. Signed-off-by: Tony Luck commit 9335d48e10d2d07eacaddf889ec1efb8a5a5082e Author: Dean Nelson Date: Tue Jan 10 11:12:32 2006 -0600 [IA64-SGI] move xpc.h to include/asm-ia64/sn (cleanup) Cleanup a few items after moving xpc.h from arch/ia64/sn/kernel to include/asm-ia64/sn. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 87a149d6bba5949fbc53b8a21189b54748ac9e2a Author: Dean Nelson Date: Tue Jan 10 11:09:48 2006 -0600 [IA64-SGI] move xpc.h to include/asm-ia64/sn Move xpc.h from arch/ia64/sn/kernel to include/asm-ia64/sn without change. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit d6ad033a88b42420ddb6c62c95e42f88d862b246 Author: Dean Nelson Date: Tue Jan 10 11:08:55 2006 -0600 [IA64-SGI] move xpc_system_reboot() Move xpc_system_reboot() to be closer to the file it calls for readability reasons (which are indeed subjective). Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 1f4674b2d5f63bac4c393ac4de1d6c1b6b72c09c Author: Dean Nelson Date: Tue Jan 10 11:08:00 2006 -0600 [IA64-SGI] ignoring loss of heartbeat while XPC is in kdebug Allow for the loss of heartbeat while in kdebug to be ignored by remote partitions. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 0752c670d83362609c7f3f59ffa0e180709c60c2 Author: Dean Nelson Date: Tue Jan 10 11:07:19 2006 -0600 [IA64-SGI] XPC and unregistering from notifier lists Only unregister from notifier lists if XPC is unloading. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 1ecaded80f94f2779160529aef7d6f37a22c2f60 Author: Dean Nelson Date: Fri Jan 6 09:48:21 2006 -0600 [IA64-SGI] cleanup XPC disengage related messages Cleanup the XPC disengage related messages that are printed to the log. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 246c7e33d51afe99890b2caab7ad482c0296d5ba Author: Dean Nelson Date: Thu Dec 22 14:32:56 2005 -0600 [IA64-SGI] ensure XPC disengage request is processed This patch fixes a problem in XPC disengage processing whereby it was not seeing the request to disengage from a remote partition, so the disengage wasn't happening. The disengagement is suppose to transpire during the time a XPC channel is disconnecting, and should be completed before the channel is declared to be disconnected. Signed-off-by: Dean Nelson Signed-off-by: Tony Luck commit 7ae69d2aa4ed3ee8cef18a072346366f019d6a4a Author: Tony Luck Date: Fri Jan 13 10:03:58 2006 -0800 [IA64] Add stub entry to fsys.S for sys_migrate_pages When this new syscall was added to ia64 in commit 39743889aaf76725152f16aa90ca3c45f6d52da3 fsys.S was forgotten. Add a ".data8 0" there to keep it in step. [Reported by Stephane Eranian] Signed-off-by: Tony Luck commit ff741906ad3cf4b8ca1a958acb013a97a6381ca2 Author: Ashok Raj Date: Fri Nov 11 14:32:40 2005 -0800 [IA64] support for cpu0 removal here is the BSP removal support for IA64. Its pretty much the same thing that was released a while back, but has your feedback incorporated. - Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param - Fixed compile issue with sn2/zx1 due to a undefined fix_b0_for_bsp - some formatting nits (whitespace etc) This has been tested on tiger and long back by alex on hp systems as well. Signed-off-by: Ashok Raj Signed-off-by: Tony Luck --- diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 199eeaf..5e0f58e 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -272,6 +272,25 @@ config SCHED_SMT Intel IA64 chips with MultiThreading at a cost of slightly increased overhead in some places. If unsure say N here. +config PERMIT_BSP_REMOVE + bool "Support removal of Bootstrap Processor" + depends on HOTPLUG_CPU + default n + ---help--- + Say Y here if your platform SAL will support removal of BSP with HOTPLUG_CPU + support. + +config FORCE_CPEI_RETARGET + bool "Force assumption that CPEI can be re-targetted" + depends on PERMIT_BSP_REMOVE + default n + ---help--- + Say Y if you need to force the assumption that CPEI can be re-targetted to + any cpu in the system. This hint is available via ACPI 3.0 specifications. + Tiger4 systems are capable of re-directing CPEI to any CPU other than BSP. + This option it useful to enable this feature on older BIOS's as well. + You can also enable this by using boot command line option force_cpei=1. + config PREEMPT bool "Preemptible Kernel" help diff --git a/arch/ia64/configs/tiger_defconfig b/arch/ia64/configs/tiger_defconfig index b1e8f09..aed034d 100644 --- a/arch/ia64/configs/tiger_defconfig +++ b/arch/ia64/configs/tiger_defconfig @@ -114,6 +114,8 @@ CONFIG_FORCE_MAX_ZONEORDER=17 CONFIG_SMP=y CONFIG_NR_CPUS=4 CONFIG_HOTPLUG_CPU=y +CONFIG_PERMIT_BSP_REMOVE=y +CONFIG_FORCE_CPEI_RETARGET=y # CONFIG_SCHED_SMT is not set # CONFIG_PREEMPT is not set CONFIG_SELECT_MEMORY_MODEL=y diff --git a/arch/ia64/hp/sim/simserial.c b/arch/ia64/hp/sim/simserial.c index a346e18..27f23fa 100644 --- a/arch/ia64/hp/sim/simserial.c +++ b/arch/ia64/hp/sim/simserial.c @@ -167,15 +167,9 @@ static void receive_chars(struct tty_st } } seen_esc = 0; - if (tty->flip.count >= TTY_FLIPBUF_SIZE) break; - *tty->flip.char_buf_ptr = ch; - - *tty->flip.flag_buf_ptr = 0; - - tty->flip.flag_buf_ptr++; - tty->flip.char_buf_ptr++; - tty->flip.count++; + if (tty_insert_flip_char(tty, ch, TTY_NORMAL) == 0) + break; } tty_flip_buffer_push(tty); } diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index 9ad94dd..fe1d90b 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -287,16 +287,20 @@ acpi_parse_plat_int_src(acpi_table_entry unsigned int can_cpei_retarget(void) { extern int cpe_vector; + extern unsigned int force_cpei_retarget; /* * Only if CPEI is supported and the override flag * is present, otherwise return that its re-targettable * if we are in polling mode. */ - if (cpe_vector > 0 && !acpi_cpei_override) - return 0; - else - return 1; + if (cpe_vector > 0) { + if (acpi_cpei_override || force_cpei_retarget) + return 1; + else + return 0; + } + return 1; } unsigned int is_cpu_cpei_target(unsigned int cpu) diff --git a/arch/ia64/kernel/fsys.S b/arch/ia64/kernel/fsys.S index 2ddbac6..ce42391 100644 --- a/arch/ia64/kernel/fsys.S +++ b/arch/ia64/kernel/fsys.S @@ -903,5 +903,6 @@ fsyscall_table: data8 0 data8 0 data8 0 + data8 0 // 1280 .org fsyscall_table + 8*NR_syscalls // guard against failures to increase NR_syscalls diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c index 574084f..37ac742 100644 --- a/arch/ia64/kernel/iosapic.c +++ b/arch/ia64/kernel/iosapic.c @@ -631,6 +631,7 @@ get_target_cpu (unsigned int gsi, int ve { #ifdef CONFIG_SMP static int cpu = -1; + extern int cpe_vector; /* * In case of vector shared by multiple RTEs, all RTEs that @@ -653,6 +654,11 @@ get_target_cpu (unsigned int gsi, int ve if (!cpu_online(smp_processor_id())) return cpu_physical_id(smp_processor_id()); +#ifdef CONFIG_ACPI + if (cpe_vector > 0 && vector == IA64_CPEP_VECTOR) + return get_cpei_target_cpu(); +#endif + #ifdef CONFIG_NUMA { int num_cpus, cpu_index, iosapic_index, numa_cpu, i = 0; diff --git a/arch/ia64/kernel/irq.c b/arch/ia64/kernel/irq.c index d33244c..5ce908e 100644 --- a/arch/ia64/kernel/irq.c +++ b/arch/ia64/kernel/irq.c @@ -163,8 +163,19 @@ void fixup_irqs(void) { unsigned int irq; extern void ia64_process_pending_intr(void); + extern void ia64_disable_timer(void); + extern volatile int time_keeper_id; + + ia64_disable_timer(); + + /* + * Find a new timesync master + */ + if (smp_processor_id() == time_keeper_id) { + time_keeper_id = first_cpu(cpu_online_map); + printk ("CPU %d is now promoted to time-keeper master\n", time_keeper_id); + } - ia64_set_itv(1<<16); /* * Phase 1: Locate irq's bound to this cpu and * relocate them for cpu removal. diff --git a/arch/ia64/kernel/jprobes.S b/arch/ia64/kernel/jprobes.S index 2323377..5cd6226 100644 --- a/arch/ia64/kernel/jprobes.S +++ b/arch/ia64/kernel/jprobes.S @@ -60,3 +60,30 @@ END(jprobe_break) GLOBAL_ENTRY(jprobe_inst_return) br.call.sptk.many b0=jprobe_break END(jprobe_inst_return) + +GLOBAL_ENTRY(invalidate_stacked_regs) + movl r16=invalidate_restore_cfm + ;; + mov b6=r16 + ;; + br.ret.sptk.many b6 + ;; +invalidate_restore_cfm: + mov r16=ar.rsc + ;; + mov ar.rsc=r0 + ;; + loadrs + ;; + mov ar.rsc=r16 + ;; + br.cond.sptk.many rp +END(invalidate_stacked_regs) + +GLOBAL_ENTRY(flush_register_stack) + // flush dirty regs to backing store (must be first in insn group) + flushrs + ;; + br.ret.sptk.many rp +END(flush_register_stack) + diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c index 346fedf..50ae8c7 100644 --- a/arch/ia64/kernel/kprobes.c +++ b/arch/ia64/kernel/kprobes.c @@ -766,11 +766,56 @@ int __kprobes kprobe_exceptions_notify(s return ret; } +struct param_bsp_cfm { + unsigned long ip; + unsigned long *bsp; + unsigned long cfm; +}; + +static void ia64_get_bsp_cfm(struct unw_frame_info *info, void *arg) +{ + unsigned long ip; + struct param_bsp_cfm *lp = arg; + + do { + unw_get_ip(info, &ip); + if (ip == 0) + break; + if (ip == lp->ip) { + unw_get_bsp(info, (unsigned long*)&lp->bsp); + unw_get_cfm(info, (unsigned long*)&lp->cfm); + return; + } + } while (unw_unwind(info) >= 0); + lp->bsp = 0; + lp->cfm = 0; + return; +} + int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs) { struct jprobe *jp = container_of(p, struct jprobe, kp); unsigned long addr = ((struct fnptr *)(jp->entry))->ip; struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); + struct param_bsp_cfm pa; + int bytes; + + /* + * Callee owns the argument space and could overwrite it, eg + * tail call optimization. So to be absolutely safe + * we save the argument space before transfering the control + * to instrumented jprobe function which runs in + * the process context + */ + pa.ip = regs->cr_iip; + unw_init_running(ia64_get_bsp_cfm, &pa); + bytes = (char *)ia64_rse_skip_regs(pa.bsp, pa.cfm & 0x3f) + - (char *)pa.bsp; + memcpy( kcb->jprobes_saved_stacked_regs, + pa.bsp, + bytes ); + kcb->bsp = pa.bsp; + kcb->cfm = pa.cfm; /* save architectural state */ kcb->jprobe_saved_regs = *regs; @@ -792,8 +837,20 @@ int __kprobes setjmp_pre_handler(struct int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) { struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); + int bytes; + /* restoring architectural state */ *regs = kcb->jprobe_saved_regs; + + /* restoring the original argument space */ + flush_register_stack(); + bytes = (char *)ia64_rse_skip_regs(kcb->bsp, kcb->cfm & 0x3f) + - (char *)kcb->bsp; + memcpy( kcb->bsp, + kcb->jprobes_saved_stacked_regs, + bytes ); + invalidate_stacked_regs(); + preempt_enable_no_resched(); return 1; } diff --git a/arch/ia64/kernel/mca.c b/arch/ia64/kernel/mca.c index ee7eec9..87fb7ce 100644 --- a/arch/ia64/kernel/mca.c +++ b/arch/ia64/kernel/mca.c @@ -289,6 +289,7 @@ ia64_mca_log_sal_error_record(int sal_in #ifdef CONFIG_ACPI int cpe_vector = -1; +int ia64_cpe_irq = -1; static irqreturn_t ia64_mca_cpe_int_handler (int cpe_irq, void *arg, struct pt_regs *ptregs) @@ -1444,11 +1445,13 @@ void __devinit ia64_mca_cpu_init(void *cpu_data) { void *pal_vaddr; + static int first_time = 1; - if (smp_processor_id() == 0) { + if (first_time) { void *mca_data; int cpu; + first_time = 0; mca_data = alloc_bootmem(sizeof(struct ia64_mca_cpu) * NR_CPUS + KERNEL_STACK_SIZE); mca_data = (void *)(((unsigned long)mca_data + @@ -1704,6 +1707,7 @@ ia64_mca_late_init(void) desc = irq_descp(irq); desc->status |= IRQ_PER_CPU; setup_irq(irq, &mca_cpe_irqaction); + ia64_cpe_irq = irq; } ia64_mca_register_cpev(cpe_vector); IA64_MCA_DEBUG("%s: CPEI/P setup and enabled.\n", __FUNCTION__); diff --git a/arch/ia64/kernel/mca_asm.S b/arch/ia64/kernel/mca_asm.S index db32fc1..403a80a 100644 --- a/arch/ia64/kernel/mca_asm.S +++ b/arch/ia64/kernel/mca_asm.S @@ -847,7 +847,7 @@ ia64_state_restore: ;; mov cr.iim=temp3 mov cr.iha=temp4 - dep r22=0,r22,62,2 // pal_min_state, physical, uncached + dep r22=0,r22,62,1 // pal_min_state, physical, uncached mov IA64_KR(CURRENT)=r21 ld8 r8=[temp1] // os_status ld8 r10=[temp2] // context diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c index bd87cb6..eb072fa 100644 --- a/arch/ia64/kernel/perfmon.c +++ b/arch/ia64/kernel/perfmon.c @@ -6719,6 +6719,7 @@ __initcall(pfm_init); void pfm_init_percpu (void) { + static int first_time=1; /* * make sure no measurement is active * (may inherit programmed PMCs from EFI). @@ -6731,8 +6732,10 @@ pfm_init_percpu (void) */ pfm_unfreeze_pmu(); - if (smp_processor_id() == 0) + if (first_time) { register_percpu_irq(IA64_PERFMON_VECTOR, &perfmon_irqaction); + first_time=0; + } ia64_setreg(_IA64_REG_CR_PMV, IA64_PERFMON_VECTOR); ia64_srlz_d(); diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c index a87a162..9d5a823 100644 --- a/arch/ia64/kernel/salinfo.c +++ b/arch/ia64/kernel/salinfo.c @@ -3,7 +3,7 @@ * * Creates entries in /proc/sal for various system features. * - * Copyright (c) 2003 Silicon Graphics, Inc. All rights reserved. + * Copyright (c) 2003, 2006 Silicon Graphics, Inc. All rights reserved. * Copyright (c) 2003 Hewlett-Packard Co * Bjorn Helgaas * @@ -27,9 +27,17 @@ * mca.c may not pass a buffer, a NULL buffer just indicates that a new * record is available in SAL. * Replace some NR_CPUS by cpus_online, for hotplug cpu. + * + * Jan 5 2006 kaos@sgi.com + * Handle hotplug cpus coming online. + * Handle hotplug cpus going offline while they still have outstanding records. + * Use the cpu_* macros consistently. + * Replace the counting semaphore with a mutex and a test if the cpumask is non-empty. + * Modify the locking to make the test for "work to do" an atomic operation. */ #include +#include #include #include #include @@ -132,8 +140,8 @@ enum salinfo_state { }; struct salinfo_data { - volatile cpumask_t cpu_event; /* which cpus have outstanding events */ - struct semaphore sem; /* count of cpus with outstanding events (bits set in cpu_event) */ + cpumask_t cpu_event; /* which cpus have outstanding events */ + struct semaphore mutex; u8 *log_buffer; u64 log_size; u8 *oemdata; /* decoded oem data */ @@ -174,6 +182,21 @@ struct salinfo_platform_oemdata_parms { int ret; }; +/* Kick the mutex that tells user space that there is work to do. Instead of + * trying to track the state of the mutex across multiple cpus, in user + * context, interrupt context, non-maskable interrupt context and hotplug cpu, + * it is far easier just to grab the mutex if it is free then release it. + * + * This routine must be called with data_saved_lock held, to make the down/up + * operation atomic. + */ +static void +salinfo_work_to_do(struct salinfo_data *data) +{ + down_trylock(&data->mutex); + up(&data->mutex); +} + static void salinfo_platform_oemdata_cpu(void *context) { @@ -212,9 +235,9 @@ salinfo_log_wakeup(int type, u8 *buffer, BUG_ON(type >= ARRAY_SIZE(salinfo_log_name)); + if (irqsafe) + spin_lock_irqsave(&data_saved_lock, flags); if (buffer) { - if (irqsafe) - spin_lock_irqsave(&data_saved_lock, flags); for (i = 0, data_saved = data->data_saved; i < saved_size; ++i, ++data_saved) { if (!data_saved->buffer) break; @@ -232,13 +255,11 @@ salinfo_log_wakeup(int type, u8 *buffer, data_saved->size = size; data_saved->buffer = buffer; } - if (irqsafe) - spin_unlock_irqrestore(&data_saved_lock, flags); } - - if (!test_and_set_bit(smp_processor_id(), &data->cpu_event)) { - if (irqsafe) - up(&data->sem); + cpu_set(smp_processor_id(), data->cpu_event); + if (irqsafe) { + salinfo_work_to_do(data); + spin_unlock_irqrestore(&data_saved_lock, flags); } } @@ -249,20 +270,17 @@ static struct timer_list salinfo_timer; static void salinfo_timeout_check(struct salinfo_data *data) { - int i; + unsigned long flags; if (!data->open) return; - for_each_online_cpu(i) { - if (test_bit(i, &data->cpu_event)) { - /* double up() is not a problem, user space will see no - * records for the additional "events". - */ - up(&data->sem); - } + if (!cpus_empty(data->cpu_event)) { + spin_lock_irqsave(&data_saved_lock, flags); + salinfo_work_to_do(data); + spin_unlock_irqrestore(&data_saved_lock, flags); } } -static void +static void salinfo_timeout (unsigned long arg) { salinfo_timeout_check(salinfo_data + SAL_INFO_TYPE_MCA); @@ -290,16 +308,20 @@ salinfo_event_read(struct file *file, ch int i, n, cpu = -1; retry: - if (down_trylock(&data->sem)) { + if (cpus_empty(data->cpu_event) && down_trylock(&data->mutex)) { if (file->f_flags & O_NONBLOCK) return -EAGAIN; - if (down_interruptible(&data->sem)) + if (down_interruptible(&data->mutex)) return -EINTR; } n = data->cpu_check; for (i = 0; i < NR_CPUS; i++) { - if (test_bit(n, &data->cpu_event) && cpu_online(n)) { + if (cpu_isset(n, data->cpu_event)) { + if (!cpu_online(n)) { + cpu_clear(n, data->cpu_event); + continue; + } cpu = n; break; } @@ -310,9 +332,6 @@ retry: if (cpu == -1) goto retry; - /* events are sticky until the user says "clear" */ - up(&data->sem); - /* for next read, start checking at next CPU */ data->cpu_check = cpu; if (++data->cpu_check == NR_CPUS) @@ -381,10 +400,8 @@ salinfo_log_release(struct inode *inode, static void call_on_cpu(int cpu, void (*fn)(void *), void *arg) { - cpumask_t save_cpus_allowed, new_cpus_allowed; - memcpy(&save_cpus_allowed, ¤t->cpus_allowed, sizeof(save_cpus_allowed)); - memset(&new_cpus_allowed, 0, sizeof(new_cpus_allowed)); - set_bit(cpu, &new_cpus_allowed); + cpumask_t save_cpus_allowed = current->cpus_allowed; + cpumask_t new_cpus_allowed = cpumask_of_cpu(cpu); set_cpus_allowed(current, new_cpus_allowed); (*fn)(arg); set_cpus_allowed(current, save_cpus_allowed); @@ -433,10 +450,10 @@ retry: if (!data->saved_num) call_on_cpu(cpu, salinfo_log_read_cpu, data); if (!data->log_size) { - data->state = STATE_NO_DATA; - clear_bit(cpu, &data->cpu_event); + data->state = STATE_NO_DATA; + cpu_clear(cpu, data->cpu_event); } else { - data->state = STATE_LOG_RECORD; + data->state = STATE_LOG_RECORD; } } @@ -473,27 +490,31 @@ static int salinfo_log_clear(struct salinfo_data *data, int cpu) { sal_log_record_header_t *rh; + unsigned long flags; + spin_lock_irqsave(&data_saved_lock, flags); data->state = STATE_NO_DATA; - if (!test_bit(cpu, &data->cpu_event)) + if (!cpu_isset(cpu, data->cpu_event)) { + spin_unlock_irqrestore(&data_saved_lock, flags); return 0; - down(&data->sem); - clear_bit(cpu, &data->cpu_event); + } + cpu_clear(cpu, data->cpu_event); if (data->saved_num) { - unsigned long flags; - spin_lock_irqsave(&data_saved_lock, flags); - shift1_data_saved(data, data->saved_num - 1 ); + shift1_data_saved(data, data->saved_num - 1); data->saved_num = 0; - spin_unlock_irqrestore(&data_saved_lock, flags); } + spin_unlock_irqrestore(&data_saved_lock, flags); rh = (sal_log_record_header_t *)(data->log_buffer); /* Corrected errors have already been cleared from SAL */ if (rh->severity != sal_log_severity_corrected) call_on_cpu(cpu, salinfo_log_clear_cpu, data); /* clearing a record may make a new record visible */ salinfo_log_new_read(cpu, data); - if (data->state == STATE_LOG_RECORD && - !test_and_set_bit(cpu, &data->cpu_event)) - up(&data->sem); + if (data->state == STATE_LOG_RECORD) { + spin_lock_irqsave(&data_saved_lock, flags); + cpu_set(cpu, data->cpu_event); + salinfo_work_to_do(data); + spin_unlock_irqrestore(&data_saved_lock, flags); + } return 0; } @@ -550,6 +571,53 @@ static struct file_operations salinfo_da .write = salinfo_log_write, }; +#ifdef CONFIG_HOTPLUG_CPU +static int __devinit +salinfo_cpu_callback(struct notifier_block *nb, unsigned long action, void *hcpu) +{ + unsigned int i, cpu = (unsigned long)hcpu; + unsigned long flags; + struct salinfo_data *data; + switch (action) { + case CPU_ONLINE: + spin_lock_irqsave(&data_saved_lock, flags); + for (i = 0, data = salinfo_data; + i < ARRAY_SIZE(salinfo_data); + ++i, ++data) { + cpu_set(cpu, data->cpu_event); + salinfo_work_to_do(data); + } + spin_unlock_irqrestore(&data_saved_lock, flags); + break; + case CPU_DEAD: + spin_lock_irqsave(&data_saved_lock, flags); + for (i = 0, data = salinfo_data; + i < ARRAY_SIZE(salinfo_data); + ++i, ++data) { + struct salinfo_data_saved *data_saved; + int j; + for (j = ARRAY_SIZE(data->data_saved) - 1, data_saved = data->data_saved + j; + j >= 0; + --j, --data_saved) { + if (data_saved->buffer && data_saved->cpu == cpu) { + shift1_data_saved(data, j); + } + } + cpu_clear(cpu, data->cpu_event); + } + spin_unlock_irqrestore(&data_saved_lock, flags); + break; + } + return NOTIFY_OK; +} + +static struct notifier_block salinfo_cpu_notifier = +{ + .notifier_call = salinfo_cpu_callback, + .priority = 0, +}; +#endif /* CONFIG_HOTPLUG_CPU */ + static int __init salinfo_init(void) { @@ -557,7 +625,7 @@ salinfo_init(void) struct proc_dir_entry **sdir = salinfo_proc_entries; /* keeps track of every entry */ struct proc_dir_entry *dir, *entry; struct salinfo_data *data; - int i, j, online; + int i, j; salinfo_dir = proc_mkdir("sal", NULL); if (!salinfo_dir) @@ -572,7 +640,7 @@ salinfo_init(void) for (i = 0; i < ARRAY_SIZE(salinfo_log_name); i++) { data = salinfo_data + i; data->type = i; - sema_init(&data->sem, 0); + init_MUTEX(&data->mutex); dir = proc_mkdir(salinfo_log_name[i], salinfo_dir); if (!dir) continue; @@ -592,12 +660,8 @@ salinfo_init(void) *sdir++ = entry; /* we missed any events before now */ - online = 0; - for_each_online_cpu(j) { - set_bit(j, &data->cpu_event); - ++online; - } - sema_init(&data->sem, online); + for_each_online_cpu(j) + cpu_set(j, data->cpu_event); *sdir++ = dir; } @@ -609,6 +673,10 @@ salinfo_init(void) salinfo_timer.function = &salinfo_timeout; add_timer(&salinfo_timer); +#ifdef CONFIG_HOTPLUG_CPU + register_cpu_notifier(&salinfo_cpu_notifier); +#endif + return 0; } diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c index 8f44e7d..e9d37bf 100644 --- a/arch/ia64/kernel/smpboot.c +++ b/arch/ia64/kernel/smpboot.c @@ -70,6 +70,12 @@ #endif #ifdef CONFIG_HOTPLUG_CPU +#ifdef CONFIG_PERMIT_BSP_REMOVE +#define bsp_remove_ok 1 +#else +#define bsp_remove_ok 0 +#endif + /* * Store all idle threads, this can be reused instead of creating * a new thread. Also avoids complicated thread destroy functionality @@ -104,7 +110,7 @@ struct sal_to_os_boot *sal_state_for_boo /* * ITC synchronization related stuff: */ -#define MASTER 0 +#define MASTER (0) #define SLAVE (SMP_CACHE_BYTES/8) #define NUM_ROUNDS 64 /* magic value */ @@ -151,6 +157,27 @@ char __initdata no_int_routing; unsigned char smp_int_redirect; /* are INT and IPI redirectable by the chipset? */ +#ifdef CONFIG_FORCE_CPEI_RETARGET +#define CPEI_OVERRIDE_DEFAULT (1) +#else +#define CPEI_OVERRIDE_DEFAULT (0) +#endif + +unsigned int force_cpei_retarget = CPEI_OVERRIDE_DEFAULT; + +static int __init +cmdl_force_cpei(char *str) +{ + int value=0; + + get_option (&str, &value); + force_cpei_retarget = value; + + return 1; +} + +__setup("force_cpei=", cmdl_force_cpei); + static int __init nointroute (char *str) { @@ -161,6 +188,27 @@ nointroute (char *str) __setup("nointroute", nointroute); +static void fix_b0_for_bsp(void) +{ +#ifdef CONFIG_HOTPLUG_CPU + int cpuid; + static int fix_bsp_b0 = 1; + + cpuid = smp_processor_id(); + + /* + * Cache the b0 value on the first AP that comes up + */ + if (!(fix_bsp_b0 && cpuid)) + return; + + sal_boot_rendez_state[0].br[0] = sal_boot_rendez_state[cpuid].br[0]; + printk ("Fixed BSP b0 value from CPU %d\n", cpuid); + + fix_bsp_b0 = 0; +#endif +} + void sync_master (void *arg) { @@ -327,8 +375,9 @@ smp_setup_percpu_timer (void) static void __devinit smp_callin (void) { - int cpuid, phys_id; + int cpuid, phys_id, itc_master; extern void ia64_init_itm(void); + extern volatile int time_keeper_id; #ifdef CONFIG_PERFMON extern void pfm_init_percpu(void); @@ -336,6 +385,7 @@ smp_callin (void) cpuid = smp_processor_id(); phys_id = hard_smp_processor_id(); + itc_master = time_keeper_id; if (cpu_online(cpuid)) { printk(KERN_ERR "huh, phys CPU#0x%x, CPU#0x%x already present??\n", @@ -343,6 +393,8 @@ smp_callin (void) BUG(); } + fix_b0_for_bsp(); + lock_ipi_calllock(); cpu_set(cpuid, cpu_online_map); unlock_ipi_calllock(); @@ -365,8 +417,8 @@ smp_callin (void) * calls spin_unlock_bh(), which calls spin_unlock_bh(), which calls * local_bh_enable(), which bugs out if irqs are not enabled... */ - Dprintk("Going to syncup ITC with BP.\n"); - ia64_sync_itc(0); + Dprintk("Going to syncup ITC with ITC Master.\n"); + ia64_sync_itc(itc_master); } /* @@ -638,6 +690,47 @@ remove_siblinginfo(int cpu) } extern void fixup_irqs(void); + +int migrate_platform_irqs(unsigned int cpu) +{ + int new_cpei_cpu; + irq_desc_t *desc = NULL; + cpumask_t mask; + int retval = 0; + + /* + * dont permit CPEI target to removed. + */ + if (cpe_vector > 0 && is_cpu_cpei_target(cpu)) { + printk ("CPU (%d) is CPEI Target\n", cpu); + if (can_cpei_retarget()) { + /* + * Now re-target the CPEI to a different processor + */ + new_cpei_cpu = any_online_cpu(cpu_online_map); + mask = cpumask_of_cpu(new_cpei_cpu); + set_cpei_target_cpu(new_cpei_cpu); + desc = irq_descp(ia64_cpe_irq); + /* + * Switch for now, immediatly, we need to do fake intr + * as other interrupts, but need to study CPEI behaviour with + * polling before making changes. + */ + if (desc) { + desc->handler->disable(ia64_cpe_irq); + desc->handler->set_affinity(ia64_cpe_irq, mask); + desc->handler->enable(ia64_cpe_irq); + printk ("Re-targetting CPEI to cpu %d\n", new_cpei_cpu); + } + } + if (!desc) { + printk ("Unable to retarget CPEI, offline cpu [%d] failed\n", cpu); + retval = -EBUSY; + } + } + return retval; +} + /* must be called with cpucontrol mutex held */ int __cpu_disable(void) { @@ -646,8 +739,17 @@ int __cpu_disable(void) /* * dont permit boot processor for now */ - if (cpu == 0) - return -EBUSY; + if (cpu == 0 && !bsp_remove_ok) { + printk ("Your platform does not support removal of BSP\n"); + return (-EBUSY); + } + + cpu_clear(cpu, cpu_online_map); + + if (migrate_platform_irqs(cpu)) { + cpu_set(cpu, cpu_online_map); + return (-EBUSY); + } remove_siblinginfo(cpu); cpu_clear(cpu, cpu_online_map); diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c index 028a2b9..1ca130a 100644 --- a/arch/ia64/kernel/time.c +++ b/arch/ia64/kernel/time.c @@ -32,7 +32,7 @@ extern unsigned long wall_jiffies; -#define TIME_KEEPER_ID 0 /* smp_processor_id() of time-keeper */ +volatile int time_keeper_id = 0; /* smp_processor_id() of time-keeper */ #ifdef CONFIG_IA64_DEBUG_IRQ @@ -71,7 +71,7 @@ timer_interrupt (int irq, void *dev_id, new_itm += local_cpu_data->itm_delta; - if (smp_processor_id() == TIME_KEEPER_ID) { + if (smp_processor_id() == time_keeper_id) { /* * Here we are in the timer irq handler. We have irqs locally * disabled, but we don't know if the timer_bh is running on @@ -236,6 +236,11 @@ static struct irqaction timer_irqaction .name = "timer" }; +void __devinit ia64_disable_timer(void) +{ + ia64_set_itv(1 << 16); +} + void __init time_init (void) { diff --git a/arch/ia64/kernel/traps.c b/arch/ia64/kernel/traps.c index d3e0ecb..5539190 100644 --- a/arch/ia64/kernel/traps.c +++ b/arch/ia64/kernel/traps.c @@ -530,12 +530,15 @@ ia64_fault (unsigned long vector, unsign if (fsys_mode(current, ®s)) { extern char __kernel_syscall_via_break[]; /* - * Got a trap in fsys-mode: Taken Branch Trap and Single Step trap - * need special handling; Debug trap is not supposed to happen. + * Got a trap in fsys-mode: Taken Branch Trap + * and Single Step trap need special handling; + * Debug trap is ignored (we disable it here + * and re-enable it in the lower-privilege trap). */ if (unlikely(vector == 29)) { - die("Got debug trap in fsys-mode---not supposed to happen!", - ®s, 0); + set_thread_flag(TIF_DB_DISABLED); + ia64_psr(®s)->db = 0; + ia64_psr(®s)->lp = 1; return; } /* re-do the system call via break 0x100000: */ @@ -589,10 +592,19 @@ ia64_fault (unsigned long vector, unsign case 34: if (isr & 0x2) { /* Lower-Privilege Transfer Trap */ + + /* If we disabled debug traps during an fsyscall, + * re-enable them here. + */ + if (test_thread_flag(TIF_DB_DISABLED)) { + clear_thread_flag(TIF_DB_DISABLED); + ia64_psr(®s)->db = 1; + } + /* - * Just clear PSR.lp and then return immediately: all the - * interesting work (e.g., signal delivery is done in the kernel - * exit path). + * Just clear PSR.lp and then return immediately: + * all the interesting work (e.g., signal delivery) + * is done in the kernel exit path. */ ia64_psr(®s)->lp = 0; return; diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c index acaaec4..9855ba3 100644 --- a/arch/ia64/mm/contig.c +++ b/arch/ia64/mm/contig.c @@ -181,13 +181,15 @@ per_cpu_init (void) { void *cpu_data; int cpu; + static int first_time=1; /* * get_free_pages() cannot be used before cpu_init() done. BSP * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls * get_zeroed_page(). */ - if (smp_processor_id() == 0) { + if (first_time) { + first_time=0; cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS, PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS)); for (cpu = 0; cpu < NR_CPUS; cpu++) { diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c index c87d6d1..573d5cc 100644 --- a/arch/ia64/mm/discontig.c +++ b/arch/ia64/mm/discontig.c @@ -528,12 +528,17 @@ void __init find_memory(void) void *per_cpu_init(void) { int cpu; + static int first_time = 1; + if (smp_processor_id() != 0) return __per_cpu_start + __per_cpu_offset[smp_processor_id()]; - for (cpu = 0; cpu < NR_CPUS; cpu++) - per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu]; + if (first_time) { + first_time = 0; + for (cpu = 0; cpu < NR_CPUS; cpu++) + per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu]; + } return __per_cpu_start + __per_cpu_offset[smp_processor_id()]; } diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c index 41105d4..6a4eec9 100644 --- a/arch/ia64/mm/tlb.c +++ b/arch/ia64/mm/tlb.c @@ -90,7 +90,7 @@ ia64_global_tlb_purge (struct mm_struct { static DEFINE_SPINLOCK(ptcg_lock); - if (mm != current->active_mm) { + if (mm != current->active_mm || !current->mm) { flush_tlb_all(); return; } diff --git a/arch/ia64/sn/include/xtalk/hubdev.h b/arch/ia64/sn/include/xtalk/hubdev.h index 71c2b27..4d417c3 100644 --- a/arch/ia64/sn/include/xtalk/hubdev.h +++ b/arch/ia64/sn/include/xtalk/hubdev.h @@ -26,11 +26,14 @@ #define IIO_NUM_ITTES 7 #define HUB_NUM_BIG_WINDOW (IIO_NUM_ITTES - 1) -struct sn_flush_device_list { +/* This struct is shared between the PROM and the kernel. + * Changes to this struct will require corresponding changes to the kernel. + */ +struct sn_flush_device_common { int sfdl_bus; int sfdl_slot; int sfdl_pin; - struct bar_list { + struct common_bar_list { unsigned long start; unsigned long end; } sfdl_bar_list[6]; @@ -40,14 +43,19 @@ struct sn_flush_device_list { uint32_t sfdl_persistent_busnum; uint32_t sfdl_persistent_segment; struct pcibus_info *sfdl_pcibus_info; +}; + +/* This struct is kernel only and is not used by the PROM */ +struct sn_flush_device_kernel { spinlock_t sfdl_flush_lock; + struct sn_flush_device_common *common; }; /* - * **widget_p - Used as an array[wid_num][device] of sn_flush_device_list. + * **widget_p - Used as an array[wid_num][device] of sn_flush_device_kernel. */ struct sn_flush_nasid_entry { - struct sn_flush_device_list **widget_p; /* Used as a array of wid_num */ + struct sn_flush_device_kernel **widget_p; // Used as an array of wid_num uint64_t iio_itte[8]; }; diff --git a/arch/ia64/sn/kernel/bte_error.c b/arch/ia64/sn/kernel/bte_error.c index fcbc748..f1ec137 100644 --- a/arch/ia64/sn/kernel/bte_error.c +++ b/arch/ia64/sn/kernel/bte_error.c @@ -33,7 +33,7 @@ void bte_error_handler(unsigned long); * Wait until all BTE related CRBs are completed * and then reset the interfaces. */ -void shub1_bte_error_handler(unsigned long _nodepda) +int shub1_bte_error_handler(unsigned long _nodepda) { struct nodepda_s *err_nodepda = (struct nodepda_s *)_nodepda; struct timer_list *recovery_timer = &err_nodepda->bte_recovery_timer; @@ -53,7 +53,7 @@ void shub1_bte_error_handler(unsigned lo (err_nodepda->bte_if[1].bh_error == BTE_SUCCESS)) { BTE_PRINTK(("eh:%p:%d Nothing to do.\n", err_nodepda, smp_processor_id())); - return; + return 1; } /* Determine information about our hub */ @@ -81,7 +81,7 @@ void shub1_bte_error_handler(unsigned lo mod_timer(recovery_timer, HZ * 5); BTE_PRINTK(("eh:%p:%d Marked Giving up\n", err_nodepda, smp_processor_id())); - return; + return 1; } if (icmr.ii_icmr_fld_s.i_crb_vld != 0) { @@ -99,7 +99,7 @@ void shub1_bte_error_handler(unsigned lo BTE_PRINTK(("eh:%p:%d Valid %d, Giving up\n", err_nodepda, smp_processor_id(), i)); - return; + return 1; } } } @@ -124,6 +124,42 @@ void shub1_bte_error_handler(unsigned lo REMOTE_HUB_S(nasid, IIO_IBCR, ibcr.ii_ibcr_regval); del_timer(recovery_timer); + return 0; +} + +/* + * Wait until all BTE related CRBs are completed + * and then reset the interfaces. + */ +int shub2_bte_error_handler(unsigned long _nodepda) +{ + struct nodepda_s *err_nodepda = (struct nodepda_s *)_nodepda; + struct timer_list *recovery_timer = &err_nodepda->bte_recovery_timer; + struct bteinfo_s *bte; + nasid_t nasid; + u64 status; + int i; + + nasid = cnodeid_to_nasid(err_nodepda->bte_if[0].bte_cnode); + + /* + * Verify that all the BTEs are complete + */ + for (i = 0; i < BTES_PER_NODE; i++) { + bte = &err_nodepda->bte_if[i]; + status = BTE_LNSTAT_LOAD(bte); + if ((status & IBLS_ERROR) || !(status & IBLS_BUSY)) + continue; + mod_timer(recovery_timer, HZ * 5); + BTE_PRINTK(("eh:%p:%d Marked Giving up\n", err_nodepda, + smp_processor_id())); + return 1; + } + if (ia64_sn_bte_recovery(nasid)) + panic("bte_error_handler(): Fatal BTE Error"); + + del_timer(recovery_timer); + return 0; } /* @@ -135,7 +171,6 @@ void bte_error_handler(unsigned long _no struct nodepda_s *err_nodepda = (struct nodepda_s *)_nodepda; spinlock_t *recovery_lock = &err_nodepda->bte_recovery_lock; int i; - nasid_t nasid; unsigned long irq_flags; volatile u64 *notify; bte_result_t bh_error; @@ -160,12 +195,15 @@ void bte_error_handler(unsigned long _no } if (is_shub1()) { - shub1_bte_error_handler(_nodepda); + if (shub1_bte_error_handler(_nodepda)) { + spin_unlock_irqrestore(recovery_lock, irq_flags); + return; + } } else { - nasid = cnodeid_to_nasid(err_nodepda->bte_if[0].bte_cnode); - - if (ia64_sn_bte_recovery(nasid)) - panic("bte_error_handler(): Fatal BTE Error"); + if (shub2_bte_error_handler(_nodepda)) { + spin_unlock_irqrestore(recovery_lock, irq_flags); + return; + } } for (i = 0; i < BTES_PER_NODE; i++) { diff --git a/arch/ia64/sn/kernel/huberror.c b/arch/ia64/sn/kernel/huberror.c index 5c5eb01..56ab6ba 100644 --- a/arch/ia64/sn/kernel/huberror.c +++ b/arch/ia64/sn/kernel/huberror.c @@ -32,13 +32,14 @@ static irqreturn_t hub_eint_handler(int ret_stuff.v0 = 0; hubdev_info = (struct hubdev_info *)arg; nasid = hubdev_info->hdi_nasid; - SAL_CALL_NOLOCK(ret_stuff, SN_SAL_HUB_ERROR_INTERRUPT, + + if (is_shub1()) { + SAL_CALL_NOLOCK(ret_stuff, SN_SAL_HUB_ERROR_INTERRUPT, (u64) nasid, 0, 0, 0, 0, 0, 0); - if ((int)ret_stuff.v0) - panic("hubii_eint_handler(): Fatal TIO Error"); + if ((int)ret_stuff.v0) + panic("hubii_eint_handler(): Fatal TIO Error"); - if (is_shub1()) { if (!(nasid & 1)) /* Not a TIO, handle CRB errors */ (void)hubiio_crb_error_handler(hubdev_info); } else diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c index 318087e..258d9d7 100644 --- a/arch/ia64/sn/kernel/io_init.c +++ b/arch/ia64/sn/kernel/io_init.c @@ -76,11 +76,12 @@ static struct sn_pcibus_provider sn_pci_ }; /* - * Retrieve the DMA Flush List given nasid. This list is needed - * to implement the WAR - Flush DMA data on PIO Reads. + * Retrieve the DMA Flush List given nasid, widget, and device. + * This list is needed to implement the WAR - Flush DMA data on PIO Reads. */ -static inline uint64_t -sal_get_widget_dmaflush_list(u64 nasid, u64 widget_num, u64 address) +static inline u64 +sal_get_device_dmaflush_list(u64 nasid, u64 widget_num, u64 device_num, + u64 address) { struct ia64_sal_retval ret_stuff; @@ -88,17 +89,17 @@ sal_get_widget_dmaflush_list(u64 nasid, ret_stuff.v0 = 0; SAL_CALL_NOLOCK(ret_stuff, - (u64) SN_SAL_IOIF_GET_WIDGET_DMAFLUSH_LIST, - (u64) nasid, (u64) widget_num, (u64) address, 0, 0, 0, - 0); - return ret_stuff.v0; + (u64) SN_SAL_IOIF_GET_DEVICE_DMAFLUSH_LIST, + (u64) nasid, (u64) widget_num, + (u64) device_num, (u64) address, 0, 0, 0); + return ret_stuff.status; } /* * Retrieve the hub device info structure for the given nasid. */ -static inline uint64_t sal_get_hubdev_info(u64 handle, u64 address) +static inline u64 sal_get_hubdev_info(u64 handle, u64 address) { struct ia64_sal_retval ret_stuff; @@ -114,7 +115,7 @@ static inline uint64_t sal_get_hubdev_in /* * Retrieve the pci bus information given the bus number. */ -static inline uint64_t sal_get_pcibus_info(u64 segment, u64 busnum, u64 address) +static inline u64 sal_get_pcibus_info(u64 segment, u64 busnum, u64 address) { struct ia64_sal_retval ret_stuff; @@ -130,7 +131,7 @@ static inline uint64_t sal_get_pcibus_in /* * Retrieve the pci device information given the bus and device|function number. */ -static inline uint64_t +static inline u64 sal_get_pcidev_info(u64 segment, u64 bus_number, u64 devfn, u64 pci_dev, u64 sn_irq_info) { @@ -170,12 +171,12 @@ sn_pcidev_info_get(struct pci_dev *dev) */ static void sn_fixup_ionodes(void) { - - struct sn_flush_device_list *sn_flush_device_list; + struct sn_flush_device_kernel *sn_flush_device_kernel; + struct sn_flush_device_kernel *dev_entry; struct hubdev_info *hubdev; - uint64_t status; - uint64_t nasid; - int i, widget; + u64 status; + u64 nasid; + int i, widget, device; /* * Get SGI Specific HUB chipset information. @@ -186,7 +187,7 @@ static void sn_fixup_ionodes(void) nasid = cnodeid_to_nasid(i); hubdev->max_segment_number = 0xffffffff; hubdev->max_pcibus_number = 0xff; - status = sal_get_hubdev_info(nasid, (uint64_t) __pa(hubdev)); + status = sal_get_hubdev_info(nasid, (u64) __pa(hubdev)); if (status) continue; @@ -213,38 +214,49 @@ static void sn_fixup_ionodes(void) hubdev->hdi_flush_nasid_list.widget_p = kmalloc((HUB_WIDGET_ID_MAX + 1) * - sizeof(struct sn_flush_device_list *), GFP_KERNEL); - + sizeof(struct sn_flush_device_kernel *), + GFP_KERNEL); memset(hubdev->hdi_flush_nasid_list.widget_p, 0x0, (HUB_WIDGET_ID_MAX + 1) * - sizeof(struct sn_flush_device_list *)); + sizeof(struct sn_flush_device_kernel *)); for (widget = 0; widget <= HUB_WIDGET_ID_MAX; widget++) { - sn_flush_device_list = kmalloc(DEV_PER_WIDGET * - sizeof(struct - sn_flush_device_list), - GFP_KERNEL); - memset(sn_flush_device_list, 0x0, + sn_flush_device_kernel = kmalloc(DEV_PER_WIDGET * + sizeof(struct + sn_flush_device_kernel), + GFP_KERNEL); + if (!sn_flush_device_kernel) + BUG(); + memset(sn_flush_device_kernel, 0x0, DEV_PER_WIDGET * - sizeof(struct sn_flush_device_list)); + sizeof(struct sn_flush_device_kernel)); - status = - sal_get_widget_dmaflush_list(nasid, widget, - (uint64_t) - __pa - (sn_flush_device_list)); - if (status) { - kfree(sn_flush_device_list); - continue; - } + dev_entry = sn_flush_device_kernel; + for (device = 0; device < DEV_PER_WIDGET; + device++,dev_entry++) { + dev_entry->common = kmalloc(sizeof(struct + sn_flush_device_common), + GFP_KERNEL); + if (!dev_entry->common) + BUG(); + memset(dev_entry->common, 0x0, sizeof(struct + sn_flush_device_common)); + + status = sal_get_device_dmaflush_list(nasid, + widget, + device, + (u64)(dev_entry->common)); + if (status) + BUG(); - spin_lock_init(&sn_flush_device_list->sfdl_flush_lock); - hubdev->hdi_flush_nasid_list.widget_p[widget] = - sn_flush_device_list; - } + spin_lock_init(&dev_entry->sfdl_flush_lock); + } + if (sn_flush_device_kernel) + hubdev->hdi_flush_nasid_list.widget_p[widget] = + sn_flush_device_kernel; + } } - } /* diff --git a/arch/ia64/sn/kernel/xpc.h b/arch/ia64/sn/kernel/xpc.h deleted file mode 100644 index 5483a9f..0000000 --- a/arch/ia64/sn/kernel/xpc.h +++ /dev/null @@ -1,1273 +0,0 @@ -/* - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file "COPYING" in the main directory of this archive - * for more details. - * - * Copyright (c) 2004-2005 Silicon Graphics, Inc. All Rights Reserved. - */ - - -/* - * Cross Partition Communication (XPC) structures and macros. - */ - -#ifndef _IA64_SN_KERNEL_XPC_H -#define _IA64_SN_KERNEL_XPC_H - - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - - -/* - * XPC Version numbers consist of a major and minor number. XPC can always - * talk to versions with same major #, and never talk to versions with a - * different major #. - */ -#define _XPC_VERSION(_maj, _min) (((_maj) << 4) | ((_min) & 0xf)) -#define XPC_VERSION_MAJOR(_v) ((_v) >> 4) -#define XPC_VERSION_MINOR(_v) ((_v) & 0xf) - - -/* - * The next macros define word or bit representations for given - * C-brick nasid in either the SAL provided bit array representing - * nasids in the partition/machine or the AMO_t array used for - * inter-partition initiation communications. - * - * For SN2 machines, C-Bricks are alway even numbered NASIDs. As - * such, some space will be saved by insisting that nasid information - * passed from SAL always be packed for C-Bricks and the - * cross-partition interrupts use the same packing scheme. - */ -#define XPC_NASID_W_INDEX(_n) (((_n) / 64) / 2) -#define XPC_NASID_B_INDEX(_n) (((_n) / 2) & (64 - 1)) -#define XPC_NASID_IN_ARRAY(_n, _p) ((_p)[XPC_NASID_W_INDEX(_n)] & \ - (1UL << XPC_NASID_B_INDEX(_n))) -#define XPC_NASID_FROM_W_B(_w, _b) (((_w) * 64 + (_b)) * 2) - -#define XPC_HB_DEFAULT_INTERVAL 5 /* incr HB every x secs */ -#define XPC_HB_CHECK_DEFAULT_INTERVAL 20 /* check HB every x secs */ - -/* define the process name of HB checker and the CPU it is pinned to */ -#define XPC_HB_CHECK_THREAD_NAME "xpc_hb" -#define XPC_HB_CHECK_CPU 0 - -/* define the process name of the discovery thread */ -#define XPC_DISCOVERY_THREAD_NAME "xpc_discovery" - - -/* - * the reserved page - * - * SAL reserves one page of memory per partition for XPC. Though a full page - * in length (16384 bytes), its starting address is not page aligned, but it - * is cacheline aligned. The reserved page consists of the following: - * - * reserved page header - * - * The first cacheline of the reserved page contains the header - * (struct xpc_rsvd_page). Before SAL initialization has completed, - * SAL has set up the following fields of the reserved page header: - * SAL_signature, SAL_version, partid, and nasids_size. The other - * fields are set up by XPC. (xpc_rsvd_page points to the local - * partition's reserved page.) - * - * part_nasids mask - * mach_nasids mask - * - * SAL also sets up two bitmaps (or masks), one that reflects the actual - * nasids in this partition (part_nasids), and the other that reflects - * the actual nasids in the entire machine (mach_nasids). We're only - * interested in the even numbered nasids (which contain the processors - * and/or memory), so we only need half as many bits to represent the - * nasids. The part_nasids mask is located starting at the first cacheline - * following the reserved page header. The mach_nasids mask follows right - * after the part_nasids mask. The size in bytes of each mask is reflected - * by the reserved page header field 'nasids_size'. (Local partition's - * mask pointers are xpc_part_nasids and xpc_mach_nasids.) - * - * vars - * vars part - * - * Immediately following the mach_nasids mask are the XPC variables - * required by other partitions. First are those that are generic to all - * partitions (vars), followed on the next available cacheline by those - * which are partition specific (vars part). These are setup by XPC. - * (Local partition's vars pointers are xpc_vars and xpc_vars_part.) - * - * Note: Until vars_pa is set, the partition XPC code has not been initialized. - */ -struct xpc_rsvd_page { - u64 SAL_signature; /* SAL: unique signature */ - u64 SAL_version; /* SAL: version */ - u8 partid; /* SAL: partition ID */ - u8 version; - u8 pad1[6]; /* align to next u64 in cacheline */ - volatile u64 vars_pa; - struct timespec stamp; /* time when reserved page was setup by XPC */ - u64 pad2[9]; /* align to last u64 in cacheline */ - u64 nasids_size; /* SAL: size of each nasid mask in bytes */ -}; - -#define XPC_RP_VERSION _XPC_VERSION(1,1) /* version 1.1 of the reserved page */ - -#define XPC_SUPPORTS_RP_STAMP(_version) \ - (_version >= _XPC_VERSION(1,1)) - -/* - * compare stamps - the return value is: - * - * < 0, if stamp1 < stamp2 - * = 0, if stamp1 == stamp2 - * > 0, if stamp1 > stamp2 - */ -static inline int -xpc_compare_stamps(struct timespec *stamp1, struct timespec *stamp2) -{ - int ret; - - - if ((ret = stamp1->tv_sec - stamp2->tv_sec) == 0) { - ret = stamp1->tv_nsec - stamp2->tv_nsec; - } - return ret; -} - - -/* - * Define the structures by which XPC variables can be exported to other - * partitions. (There are two: struct xpc_vars and struct xpc_vars_part) - */ - -/* - * The following structure describes the partition generic variables - * needed by other partitions in order to properly initialize. - * - * struct xpc_vars version number also applies to struct xpc_vars_part. - * Changes to either structure and/or related functionality should be - * reflected by incrementing either the major or minor version numbers - * of struct xpc_vars. - */ -struct xpc_vars { - u8 version; - u64 heartbeat; - u64 heartbeating_to_mask; - u64 heartbeat_offline; /* if 0, heartbeat should be changing */ - int act_nasid; - int act_phys_cpuid; - u64 vars_part_pa; - u64 amos_page_pa; /* paddr of page of AMOs from MSPEC driver */ - AMO_t *amos_page; /* vaddr of page of AMOs from MSPEC driver */ -}; - -#define XPC_V_VERSION _XPC_VERSION(3,1) /* version 3.1 of the cross vars */ - -#define XPC_SUPPORTS_DISENGAGE_REQUEST(_version) \ - (_version >= _XPC_VERSION(3,1)) - - -static inline int -xpc_hb_allowed(partid_t partid, struct xpc_vars *vars) -{ - return ((vars->heartbeating_to_mask & (1UL << partid)) != 0); -} - -static inline void -xpc_allow_hb(partid_t partid, struct xpc_vars *vars) -{ - u64 old_mask, new_mask; - - do { - old_mask = vars->heartbeating_to_mask; - new_mask = (old_mask | (1UL << partid)); - } while (cmpxchg(&vars->heartbeating_to_mask, old_mask, new_mask) != - old_mask); -} - -static inline void -xpc_disallow_hb(partid_t partid, struct xpc_vars *vars) -{ - u64 old_mask, new_mask; - - do { - old_mask = vars->heartbeating_to_mask; - new_mask = (old_mask & ~(1UL << partid)); - } while (cmpxchg(&vars->heartbeating_to_mask, old_mask, new_mask) != - old_mask); -} - - -/* - * The AMOs page consists of a number of AMO variables which are divided into - * four groups, The first two groups are used to identify an IRQ's sender. - * These two groups consist of 64 and 128 AMO variables respectively. The last - * two groups, consisting of just one AMO variable each, are used to identify - * the remote partitions that are currently engaged (from the viewpoint of - * the XPC running on the remote partition). - */ -#define XPC_NOTIFY_IRQ_AMOS 0 -#define XPC_ACTIVATE_IRQ_AMOS (XPC_NOTIFY_IRQ_AMOS + XP_MAX_PARTITIONS) -#define XPC_ENGAGED_PARTITIONS_AMO (XPC_ACTIVATE_IRQ_AMOS + XP_NASID_MASK_WORDS) -#define XPC_DISENGAGE_REQUEST_AMO (XPC_ENGAGED_PARTITIONS_AMO + 1) - - -/* - * The following structure describes the per partition specific variables. - * - * An array of these structures, one per partition, will be defined. As a - * partition becomes active XPC will copy the array entry corresponding to - * itself from that partition. It is desirable that the size of this - * structure evenly divide into a cacheline, such that none of the entries - * in this array crosses a cacheline boundary. As it is now, each entry - * occupies half a cacheline. - */ -struct xpc_vars_part { - volatile u64 magic; - - u64 openclose_args_pa; /* physical address of open and close args */ - u64 GPs_pa; /* physical address of Get/Put values */ - - u64 IPI_amo_pa; /* physical address of IPI AMO_t structure */ - int IPI_nasid; /* nasid of where to send IPIs */ - int IPI_phys_cpuid; /* physical CPU ID of where to send IPIs */ - - u8 nchannels; /* #of defined channels supported */ - - u8 reserved[23]; /* pad to a full 64 bytes */ -}; - -/* - * The vars_part MAGIC numbers play a part in the first contact protocol. - * - * MAGIC1 indicates that the per partition specific variables for a remote - * partition have been initialized by this partition. - * - * MAGIC2 indicates that this partition has pulled the remote partititions - * per partition variables that pertain to this partition. - */ -#define XPC_VP_MAGIC1 0x0053524156435058L /* 'XPCVARS\0'L (little endian) */ -#define XPC_VP_MAGIC2 0x0073726176435058L /* 'XPCvars\0'L (little endian) */ - - -/* the reserved page sizes and offsets */ - -#define XPC_RP_HEADER_SIZE L1_CACHE_ALIGN(sizeof(struct xpc_rsvd_page)) -#define XPC_RP_VARS_SIZE L1_CACHE_ALIGN(sizeof(struct xpc_vars)) - -#define XPC_RP_PART_NASIDS(_rp) (u64 *) ((u8 *) _rp + XPC_RP_HEADER_SIZE) -#define XPC_RP_MACH_NASIDS(_rp) (XPC_RP_PART_NASIDS(_rp) + xp_nasid_mask_words) -#define XPC_RP_VARS(_rp) ((struct xpc_vars *) XPC_RP_MACH_NASIDS(_rp) + xp_nasid_mask_words) -#define XPC_RP_VARS_PART(_rp) (struct xpc_vars_part *) ((u8 *) XPC_RP_VARS(rp) + XPC_RP_VARS_SIZE) - - -/* - * Functions registered by add_timer() or called by kernel_thread() only - * allow for a single 64-bit argument. The following macros can be used to - * pack and unpack two (32-bit, 16-bit or 8-bit) arguments into or out from - * the passed argument. - */ -#define XPC_PACK_ARGS(_arg1, _arg2) \ - ((((u64) _arg1) & 0xffffffff) | \ - ((((u64) _arg2) & 0xffffffff) << 32)) - -#define XPC_UNPACK_ARG1(_args) (((u64) _args) & 0xffffffff) -#define XPC_UNPACK_ARG2(_args) ((((u64) _args) >> 32) & 0xffffffff) - - - -/* - * Define a Get/Put value pair (pointers) used with a message queue. - */ -struct xpc_gp { - volatile s64 get; /* Get value */ - volatile s64 put; /* Put value */ -}; - -#define XPC_GP_SIZE \ - L1_CACHE_ALIGN(sizeof(struct xpc_gp) * XPC_NCHANNELS) - - - -/* - * Define a structure that contains arguments associated with opening and - * closing a channel. - */ -struct xpc_openclose_args { - u16 reason; /* reason why channel is closing */ - u16 msg_size; /* sizeof each message entry */ - u16 remote_nentries; /* #of message entries in remote msg queue */ - u16 local_nentries; /* #of message entries in local msg queue */ - u64 local_msgqueue_pa; /* physical address of local message queue */ -}; - -#define XPC_OPENCLOSE_ARGS_SIZE \ - L1_CACHE_ALIGN(sizeof(struct xpc_openclose_args) * XPC_NCHANNELS) - - - -/* struct xpc_msg flags */ - -#define XPC_M_DONE 0x01 /* msg has been received/consumed */ -#define XPC_M_READY 0x02 /* msg is ready to be sent */ -#define XPC_M_INTERRUPT 0x04 /* send interrupt when msg consumed */ - - -#define XPC_MSG_ADDRESS(_payload) \ - ((struct xpc_msg *)((u8 *)(_payload) - XPC_MSG_PAYLOAD_OFFSET)) - - - -/* - * Defines notify entry. - * - * This is used to notify a message's sender that their message was received - * and consumed by the intended recipient. - */ -struct xpc_notify { - struct semaphore sema; /* notify semaphore */ - volatile u8 type; /* type of notification */ - - /* the following two fields are only used if type == XPC_N_CALL */ - xpc_notify_func func; /* user's notify function */ - void *key; /* pointer to user's key */ -}; - -/* struct xpc_notify type of notification */ - -#define XPC_N_CALL 0x01 /* notify function provided by user */ - - - -/* - * Define the structure that manages all the stuff required by a channel. In - * particular, they are used to manage the messages sent across the channel. - * - * This structure is private to a partition, and is NOT shared across the - * partition boundary. - * - * There is an array of these structures for each remote partition. It is - * allocated at the time a partition becomes active. The array contains one - * of these structures for each potential channel connection to that partition. - * - * Each of these structures manages two message queues (circular buffers). - * They are allocated at the time a channel connection is made. One of - * these message queues (local_msgqueue) holds the locally created messages - * that are destined for the remote partition. The other of these message - * queues (remote_msgqueue) is a locally cached copy of the remote partition's - * own local_msgqueue. - * - * The following is a description of the Get/Put pointers used to manage these - * two message queues. Consider the local_msgqueue to be on one partition - * and the remote_msgqueue to be its cached copy on another partition. A - * description of what each of the lettered areas contains is included. - * - * - * local_msgqueue remote_msgqueue - * - * |/////////| |/////////| - * w_remote_GP.get --> +---------+ |/////////| - * | F | |/////////| - * remote_GP.get --> +---------+ +---------+ <-- local_GP->get - * | | | | - * | | | E | - * | | | | - * | | +---------+ <-- w_local_GP.get - * | B | |/////////| - * | | |////D////| - * | | |/////////| - * | | +---------+ <-- w_remote_GP.put - * | | |////C////| - * local_GP->put --> +---------+ +---------+ <-- remote_GP.put - * | | |/////////| - * | A | |/////////| - * | | |/////////| - * w_local_GP.put --> +---------+ |/////////| - * |/////////| |/////////| - * - * - * ( remote_GP.[get|put] are cached copies of the remote - * partition's local_GP->[get|put], and thus their values can - * lag behind their counterparts on the remote partition. ) - * - * - * A - Messages that have been allocated, but have not yet been sent to the - * remote partition. - * - * B - Messages that have been sent, but have not yet been acknowledged by the - * remote partition as having been received. - * - * C - Area that needs to be prepared for the copying of sent messages, by - * the clearing of the message flags of any previously received messages. - * - * D - Area into which sent messages are to be copied from the remote - * partition's local_msgqueue and then delivered to their intended - * recipients. [ To allow for a multi-message copy, another pointer - * (next_msg_to_pull) has been added to keep track of the next message - * number needing to be copied (pulled). It chases after w_remote_GP.put. - * Any messages lying between w_local_GP.get and next_msg_to_pull have - * been copied and are ready to be delivered. ] - * - * E - Messages that have been copied and delivered, but have not yet been - * acknowledged by the recipient as having been received. - * - * F - Messages that have been acknowledged, but XPC has not yet notified the - * sender that the message was received by its intended recipient. - * This is also an area that needs to be prepared for the allocating of - * new messages, by the clearing of the message flags of the acknowledged - * messages. - */ -struct xpc_channel { - partid_t partid; /* ID of remote partition connected */ - spinlock_t lock; /* lock for updating this structure */ - u32 flags; /* general flags */ - - enum xpc_retval reason; /* reason why channel is disconnect'g */ - int reason_line; /* line# disconnect initiated from */ - - u16 number; /* channel # */ - - u16 msg_size; /* sizeof each msg entry */ - u16 local_nentries; /* #of msg entries in local msg queue */ - u16 remote_nentries; /* #of msg entries in remote msg queue*/ - - void *local_msgqueue_base; /* base address of kmalloc'd space */ - struct xpc_msg *local_msgqueue; /* local message queue */ - void *remote_msgqueue_base; /* base address of kmalloc'd space */ - struct xpc_msg *remote_msgqueue;/* cached copy of remote partition's */ - /* local message queue */ - u64 remote_msgqueue_pa; /* phys addr of remote partition's */ - /* local message queue */ - - atomic_t references; /* #of external references to queues */ - - atomic_t n_on_msg_allocate_wq; /* #on msg allocation wait queue */ - wait_queue_head_t msg_allocate_wq; /* msg allocation wait queue */ - - u8 delayed_IPI_flags; /* IPI flags received, but delayed */ - /* action until channel disconnected */ - - /* queue of msg senders who want to be notified when msg received */ - - atomic_t n_to_notify; /* #of msg senders to notify */ - struct xpc_notify *notify_queue;/* notify queue for messages sent */ - - xpc_channel_func func; /* user's channel function */ - void *key; /* pointer to user's key */ - - struct semaphore msg_to_pull_sema; /* next msg to pull serialization */ - struct semaphore wdisconnect_sema; /* wait for channel disconnect */ - - struct xpc_openclose_args *local_openclose_args; /* args passed on */ - /* opening or closing of channel */ - - /* various flavors of local and remote Get/Put values */ - - struct xpc_gp *local_GP; /* local Get/Put values */ - struct xpc_gp remote_GP; /* remote Get/Put values */ - struct xpc_gp w_local_GP; /* working local Get/Put values */ - struct xpc_gp w_remote_GP; /* working remote Get/Put values */ - s64 next_msg_to_pull; /* Put value of next msg to pull */ - - /* kthread management related fields */ - -// >>> rethink having kthreads_assigned_limit and kthreads_idle_limit; perhaps -// >>> allow the assigned limit be unbounded and let the idle limit be dynamic -// >>> dependent on activity over the last interval of time - atomic_t kthreads_assigned; /* #of kthreads assigned to channel */ - u32 kthreads_assigned_limit; /* limit on #of kthreads assigned */ - atomic_t kthreads_idle; /* #of kthreads idle waiting for work */ - u32 kthreads_idle_limit; /* limit on #of kthreads idle */ - atomic_t kthreads_active; /* #of kthreads actively working */ - // >>> following field is temporary - u32 kthreads_created; /* total #of kthreads created */ - - wait_queue_head_t idle_wq; /* idle kthread wait queue */ - -} ____cacheline_aligned; - - -/* struct xpc_channel flags */ - -#define XPC_C_WASCONNECTED 0x00000001 /* channel was connected */ - -#define XPC_C_ROPENREPLY 0x00000002 /* remote open channel reply */ -#define XPC_C_OPENREPLY 0x00000004 /* local open channel reply */ -#define XPC_C_ROPENREQUEST 0x00000008 /* remote open channel request */ -#define XPC_C_OPENREQUEST 0x00000010 /* local open channel request */ - -#define XPC_C_SETUP 0x00000020 /* channel's msgqueues are alloc'd */ -#define XPC_C_CONNECTCALLOUT 0x00000040 /* channel connected callout made */ -#define XPC_C_CONNECTED 0x00000080 /* local channel is connected */ -#define XPC_C_CONNECTING 0x00000100 /* channel is being connected */ - -#define XPC_C_RCLOSEREPLY 0x00000200 /* remote close channel reply */ -#define XPC_C_CLOSEREPLY 0x00000400 /* local close channel reply */ -#define XPC_C_RCLOSEREQUEST 0x00000800 /* remote close channel request */ -#define XPC_C_CLOSEREQUEST 0x00001000 /* local close channel request */ - -#define XPC_C_DISCONNECTED 0x00002000 /* channel is disconnected */ -#define XPC_C_DISCONNECTING 0x00004000 /* channel is being disconnected */ -#define XPC_C_DISCONNECTCALLOUT 0x00008000 /* chan disconnected callout made */ -#define XPC_C_WDISCONNECT 0x00010000 /* waiting for channel disconnect */ - - - -/* - * Manages channels on a partition basis. There is one of these structures - * for each partition (a partition will never utilize the structure that - * represents itself). - */ -struct xpc_partition { - - /* XPC HB infrastructure */ - - u8 remote_rp_version; /* version# of partition's rsvd pg */ - struct timespec remote_rp_stamp;/* time when rsvd pg was initialized */ - u64 remote_rp_pa; /* phys addr of partition's rsvd pg */ - u64 remote_vars_pa; /* phys addr of partition's vars */ - u64 remote_vars_part_pa; /* phys addr of partition's vars part */ - u64 last_heartbeat; /* HB at last read */ - u64 remote_amos_page_pa; /* phys addr of partition's amos page */ - int remote_act_nasid; /* active part's act/deact nasid */ - int remote_act_phys_cpuid; /* active part's act/deact phys cpuid */ - u32 act_IRQ_rcvd; /* IRQs since activation */ - spinlock_t act_lock; /* protect updating of act_state */ - u8 act_state; /* from XPC HB viewpoint */ - u8 remote_vars_version; /* version# of partition's vars */ - enum xpc_retval reason; /* reason partition is deactivating */ - int reason_line; /* line# deactivation initiated from */ - int reactivate_nasid; /* nasid in partition to reactivate */ - - unsigned long disengage_request_timeout; /* timeout in jiffies */ - struct timer_list disengage_request_timer; - - - /* XPC infrastructure referencing and teardown control */ - - volatile u8 setup_state; /* infrastructure setup state */ - wait_queue_head_t teardown_wq; /* kthread waiting to teardown infra */ - atomic_t references; /* #of references to infrastructure */ - - - /* - * NONE OF THE PRECEDING FIELDS OF THIS STRUCTURE WILL BE CLEARED WHEN - * XPC SETS UP THE NECESSARY INFRASTRUCTURE TO SUPPORT CROSS PARTITION - * COMMUNICATION. ALL OF THE FOLLOWING FIELDS WILL BE CLEARED. (THE - * 'nchannels' FIELD MUST BE THE FIRST OF THE FIELDS TO BE CLEARED.) - */ - - - u8 nchannels; /* #of defined channels supported */ - atomic_t nchannels_active; /* #of channels that are not DISCONNECTED */ - atomic_t nchannels_engaged;/* #of channels engaged with remote part */ - struct xpc_channel *channels;/* array of channel structures */ - - void *local_GPs_base; /* base address of kmalloc'd space */ - struct xpc_gp *local_GPs; /* local Get/Put values */ - void *remote_GPs_base; /* base address of kmalloc'd space */ - struct xpc_gp *remote_GPs;/* copy of remote partition's local Get/Put */ - /* values */ - u64 remote_GPs_pa; /* phys address of remote partition's local */ - /* Get/Put values */ - - - /* fields used to pass args when opening or closing a channel */ - - void *local_openclose_args_base; /* base address of kmalloc'd space */ - struct xpc_openclose_args *local_openclose_args; /* local's args */ - void *remote_openclose_args_base; /* base address of kmalloc'd space */ - struct xpc_openclose_args *remote_openclose_args; /* copy of remote's */ - /* args */ - u64 remote_openclose_args_pa; /* phys addr of remote's args */ - - - /* IPI sending, receiving and handling related fields */ - - int remote_IPI_nasid; /* nasid of where to send IPIs */ - int remote_IPI_phys_cpuid; /* phys CPU ID of where to send IPIs */ - AMO_t *remote_IPI_amo_va; /* address of remote IPI AMO_t structure */ - - AMO_t *local_IPI_amo_va; /* address of IPI AMO_t structure */ - u64 local_IPI_amo; /* IPI amo flags yet to be handled */ - char IPI_owner[8]; /* IPI owner's name */ - struct timer_list dropped_IPI_timer; /* dropped IPI timer */ - - spinlock_t IPI_lock; /* IPI handler lock */ - - - /* channel manager related fields */ - - atomic_t channel_mgr_requests; /* #of requests to activate chan mgr */ - wait_queue_head_t channel_mgr_wq; /* channel mgr's wait queue */ - -} ____cacheline_aligned; - - -/* struct xpc_partition act_state values (for XPC HB) */ - -#define XPC_P_INACTIVE 0x00 /* partition is not active */ -#define XPC_P_ACTIVATION_REQ 0x01 /* created thread to activate */ -#define XPC_P_ACTIVATING 0x02 /* activation thread started */ -#define XPC_P_ACTIVE 0x03 /* xpc_partition_up() was called */ -#define XPC_P_DEACTIVATING 0x04 /* partition deactivation initiated */ - - -#define XPC_DEACTIVATE_PARTITION(_p, _reason) \ - xpc_deactivate_partition(__LINE__, (_p), (_reason)) - - -/* struct xpc_partition setup_state values */ - -#define XPC_P_UNSET 0x00 /* infrastructure was never setup */ -#define XPC_P_SETUP 0x01 /* infrastructure is setup */ -#define XPC_P_WTEARDOWN 0x02 /* waiting to teardown infrastructure */ -#define XPC_P_TORNDOWN 0x03 /* infrastructure is torndown */ - - - -/* - * struct xpc_partition IPI_timer #of seconds to wait before checking for - * dropped IPIs. These occur whenever an IPI amo write doesn't complete until - * after the IPI was received. - */ -#define XPC_P_DROPPED_IPI_WAIT (0.25 * HZ) - - -/* number of seconds to wait for other partitions to disengage */ -#define XPC_DISENGAGE_REQUEST_DEFAULT_TIMELIMIT 90 - -/* interval in seconds to print 'waiting disengagement' messages */ -#define XPC_DISENGAGE_PRINTMSG_INTERVAL 10 - - -#define XPC_PARTID(_p) ((partid_t) ((_p) - &xpc_partitions[0])) - - - -/* found in xp_main.c */ -extern struct xpc_registration xpc_registrations[]; - - -/* found in xpc_main.c */ -extern struct device *xpc_part; -extern struct device *xpc_chan; -extern int xpc_disengage_request_timelimit; -extern irqreturn_t xpc_notify_IRQ_handler(int, void *, struct pt_regs *); -extern void xpc_dropped_IPI_check(struct xpc_partition *); -extern void xpc_activate_partition(struct xpc_partition *); -extern void xpc_activate_kthreads(struct xpc_channel *, int); -extern void xpc_create_kthreads(struct xpc_channel *, int); -extern void xpc_disconnect_wait(int); - - -/* found in xpc_partition.c */ -extern int xpc_exiting; -extern struct xpc_vars *xpc_vars; -extern struct xpc_rsvd_page *xpc_rsvd_page; -extern struct xpc_vars_part *xpc_vars_part; -extern struct xpc_partition xpc_partitions[XP_MAX_PARTITIONS + 1]; -extern char xpc_remote_copy_buffer[]; -extern struct xpc_rsvd_page *xpc_rsvd_page_init(void); -extern void xpc_allow_IPI_ops(void); -extern void xpc_restrict_IPI_ops(void); -extern int xpc_identify_act_IRQ_sender(void); -extern int xpc_partition_disengaged(struct xpc_partition *); -extern enum xpc_retval xpc_mark_partition_active(struct xpc_partition *); -extern void xpc_mark_partition_inactive(struct xpc_partition *); -extern void xpc_discovery(void); -extern void xpc_check_remote_hb(void); -extern void xpc_deactivate_partition(const int, struct xpc_partition *, - enum xpc_retval); -extern enum xpc_retval xpc_initiate_partid_to_nasids(partid_t, void *); - - -/* found in xpc_channel.c */ -extern void xpc_initiate_connect(int); -extern void xpc_initiate_disconnect(int); -extern enum xpc_retval xpc_initiate_allocate(partid_t, int, u32, void **); -extern enum xpc_retval xpc_initiate_send(partid_t, int, void *); -extern enum xpc_retval xpc_initiate_send_notify(partid_t, int, void *, - xpc_notify_func, void *); -extern void xpc_initiate_received(partid_t, int, void *); -extern enum xpc_retval xpc_setup_infrastructure(struct xpc_partition *); -extern enum xpc_retval xpc_pull_remote_vars_part(struct xpc_partition *); -extern void xpc_process_channel_activity(struct xpc_partition *); -extern void xpc_connected_callout(struct xpc_channel *); -extern void xpc_deliver_msg(struct xpc_channel *); -extern void xpc_disconnect_channel(const int, struct xpc_channel *, - enum xpc_retval, unsigned long *); -extern void xpc_disconnecting_callout(struct xpc_channel *); -extern void xpc_partition_going_down(struct xpc_partition *, enum xpc_retval); -extern void xpc_teardown_infrastructure(struct xpc_partition *); - - - -static inline void -xpc_wakeup_channel_mgr(struct xpc_partition *part) -{ - if (atomic_inc_return(&part->channel_mgr_requests) == 1) { - wake_up(&part->channel_mgr_wq); - } -} - - - -/* - * These next two inlines are used to keep us from tearing down a channel's - * msg queues while a thread may be referencing them. - */ -static inline void -xpc_msgqueue_ref(struct xpc_channel *ch) -{ - atomic_inc(&ch->references); -} - -static inline void -xpc_msgqueue_deref(struct xpc_channel *ch) -{ - s32 refs = atomic_dec_return(&ch->references); - - DBUG_ON(refs < 0); - if (refs == 0) { - xpc_wakeup_channel_mgr(&xpc_partitions[ch->partid]); - } -} - - - -#define XPC_DISCONNECT_CHANNEL(_ch, _reason, _irqflgs) \ - xpc_disconnect_channel(__LINE__, _ch, _reason, _irqflgs) - - -/* - * These two inlines are used to keep us from tearing down a partition's - * setup infrastructure while a thread may be referencing it. - */ -static inline void -xpc_part_deref(struct xpc_partition *part) -{ - s32 refs = atomic_dec_return(&part->references); - - - DBUG_ON(refs < 0); - if (refs == 0 && part->setup_state == XPC_P_WTEARDOWN) { - wake_up(&part->teardown_wq); - } -} - -static inline int -xpc_part_ref(struct xpc_partition *part) -{ - int setup; - - - atomic_inc(&part->references); - setup = (part->setup_state == XPC_P_SETUP); - if (!setup) { - xpc_part_deref(part); - } - return setup; -} - - - -/* - * The following macro is to be used for the setting of the reason and - * reason_line fields in both the struct xpc_channel and struct xpc_partition - * structures. - */ -#define XPC_SET_REASON(_p, _reason, _line) \ - { \ - (_p)->reason = _reason; \ - (_p)->reason_line = _line; \ - } - - - -/* - * This next set of inlines are used to keep track of when a partition is - * potentially engaged in accessing memory belonging to another partition. - */ - -static inline void -xpc_mark_partition_engaged(struct xpc_partition *part) -{ - unsigned long irq_flags; - AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + - (XPC_ENGAGED_PARTITIONS_AMO * sizeof(AMO_t))); - - - local_irq_save(irq_flags); - - /* set bit corresponding to our partid in remote partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, - (1UL << sn_partition_id)); - /* - * We must always use the nofault function regardless of whether we - * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we - * didn't, we'd never know that the other partition is down and would - * keep sending IPIs and AMOs to it until the heartbeat times out. - */ - (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> - variable), xp_nofault_PIOR_target)); - - local_irq_restore(irq_flags); -} - -static inline void -xpc_mark_partition_disengaged(struct xpc_partition *part) -{ - unsigned long irq_flags; - AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + - (XPC_ENGAGED_PARTITIONS_AMO * sizeof(AMO_t))); - - - local_irq_save(irq_flags); - - /* clear bit corresponding to our partid in remote partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, - ~(1UL << sn_partition_id)); - /* - * We must always use the nofault function regardless of whether we - * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we - * didn't, we'd never know that the other partition is down and would - * keep sending IPIs and AMOs to it until the heartbeat times out. - */ - (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> - variable), xp_nofault_PIOR_target)); - - local_irq_restore(irq_flags); -} - -static inline void -xpc_request_partition_disengage(struct xpc_partition *part) -{ - unsigned long irq_flags; - AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + - (XPC_DISENGAGE_REQUEST_AMO * sizeof(AMO_t))); - - - local_irq_save(irq_flags); - - /* set bit corresponding to our partid in remote partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, - (1UL << sn_partition_id)); - /* - * We must always use the nofault function regardless of whether we - * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we - * didn't, we'd never know that the other partition is down and would - * keep sending IPIs and AMOs to it until the heartbeat times out. - */ - (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> - variable), xp_nofault_PIOR_target)); - - local_irq_restore(irq_flags); -} - -static inline void -xpc_cancel_partition_disengage_request(struct xpc_partition *part) -{ - unsigned long irq_flags; - AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + - (XPC_DISENGAGE_REQUEST_AMO * sizeof(AMO_t))); - - - local_irq_save(irq_flags); - - /* clear bit corresponding to our partid in remote partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, - ~(1UL << sn_partition_id)); - /* - * We must always use the nofault function regardless of whether we - * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we - * didn't, we'd never know that the other partition is down and would - * keep sending IPIs and AMOs to it until the heartbeat times out. - */ - (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> - variable), xp_nofault_PIOR_target)); - - local_irq_restore(irq_flags); -} - -static inline u64 -xpc_partition_engaged(u64 partid_mask) -{ - AMO_t *amo = xpc_vars->amos_page + XPC_ENGAGED_PARTITIONS_AMO; - - - /* return our partition's AMO variable ANDed with partid_mask */ - return (FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_LOAD) & - partid_mask); -} - -static inline u64 -xpc_partition_disengage_requested(u64 partid_mask) -{ - AMO_t *amo = xpc_vars->amos_page + XPC_DISENGAGE_REQUEST_AMO; - - - /* return our partition's AMO variable ANDed with partid_mask */ - return (FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_LOAD) & - partid_mask); -} - -static inline void -xpc_clear_partition_engaged(u64 partid_mask) -{ - AMO_t *amo = xpc_vars->amos_page + XPC_ENGAGED_PARTITIONS_AMO; - - - /* clear bit(s) based on partid_mask in our partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, - ~partid_mask); -} - -static inline void -xpc_clear_partition_disengage_request(u64 partid_mask) -{ - AMO_t *amo = xpc_vars->amos_page + XPC_DISENGAGE_REQUEST_AMO; - - - /* clear bit(s) based on partid_mask in our partition's AMO */ - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, - ~partid_mask); -} - - - -/* - * The following set of macros and inlines are used for the sending and - * receiving of IPIs (also known as IRQs). There are two flavors of IPIs, - * one that is associated with partition activity (SGI_XPC_ACTIVATE) and - * the other that is associated with channel activity (SGI_XPC_NOTIFY). - */ - -static inline u64 -xpc_IPI_receive(AMO_t *amo) -{ - return FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_CLEAR); -} - - -static inline enum xpc_retval -xpc_IPI_send(AMO_t *amo, u64 flag, int nasid, int phys_cpuid, int vector) -{ - int ret = 0; - unsigned long irq_flags; - - - local_irq_save(irq_flags); - - FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, flag); - sn_send_IPI_phys(nasid, phys_cpuid, vector, 0); - - /* - * We must always use the nofault function regardless of whether we - * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we - * didn't, we'd never know that the other partition is down and would - * keep sending IPIs and AMOs to it until the heartbeat times out. - */ - ret = xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo->variable), - xp_nofault_PIOR_target)); - - local_irq_restore(irq_flags); - - return ((ret == 0) ? xpcSuccess : xpcPioReadError); -} - - -/* - * IPIs associated with SGI_XPC_ACTIVATE IRQ. - */ - -/* - * Flag the appropriate AMO variable and send an IPI to the specified node. - */ -static inline void -xpc_activate_IRQ_send(u64 amos_page_pa, int from_nasid, int to_nasid, - int to_phys_cpuid) -{ - int w_index = XPC_NASID_W_INDEX(from_nasid); - int b_index = XPC_NASID_B_INDEX(from_nasid); - AMO_t *amos = (AMO_t *) __va(amos_page_pa + - (XPC_ACTIVATE_IRQ_AMOS * sizeof(AMO_t))); - - - (void) xpc_IPI_send(&amos[w_index], (1UL << b_index), to_nasid, - to_phys_cpuid, SGI_XPC_ACTIVATE); -} - -static inline void -xpc_IPI_send_activate(struct xpc_vars *vars) -{ - xpc_activate_IRQ_send(vars->amos_page_pa, cnodeid_to_nasid(0), - vars->act_nasid, vars->act_phys_cpuid); -} - -static inline void -xpc_IPI_send_activated(struct xpc_partition *part) -{ - xpc_activate_IRQ_send(part->remote_amos_page_pa, cnodeid_to_nasid(0), - part->remote_act_nasid, part->remote_act_phys_cpuid); -} - -static inline void -xpc_IPI_send_reactivate(struct xpc_partition *part) -{ - xpc_activate_IRQ_send(xpc_vars->amos_page_pa, part->reactivate_nasid, - xpc_vars->act_nasid, xpc_vars->act_phys_cpuid); -} - -static inline void -xpc_IPI_send_disengage(struct xpc_partition *part) -{ - xpc_activate_IRQ_send(part->remote_amos_page_pa, cnodeid_to_nasid(0), - part->remote_act_nasid, part->remote_act_phys_cpuid); -} - - -/* - * IPIs associated with SGI_XPC_NOTIFY IRQ. - */ - -/* - * Send an IPI to the remote partition that is associated with the - * specified channel. - */ -#define XPC_NOTIFY_IRQ_SEND(_ch, _ipi_f, _irq_f) \ - xpc_notify_IRQ_send(_ch, _ipi_f, #_ipi_f, _irq_f) - -static inline void -xpc_notify_IRQ_send(struct xpc_channel *ch, u8 ipi_flag, char *ipi_flag_string, - unsigned long *irq_flags) -{ - struct xpc_partition *part = &xpc_partitions[ch->partid]; - enum xpc_retval ret; - - - if (likely(part->act_state != XPC_P_DEACTIVATING)) { - ret = xpc_IPI_send(part->remote_IPI_amo_va, - (u64) ipi_flag << (ch->number * 8), - part->remote_IPI_nasid, - part->remote_IPI_phys_cpuid, - SGI_XPC_NOTIFY); - dev_dbg(xpc_chan, "%s sent to partid=%d, channel=%d, ret=%d\n", - ipi_flag_string, ch->partid, ch->number, ret); - if (unlikely(ret != xpcSuccess)) { - if (irq_flags != NULL) { - spin_unlock_irqrestore(&ch->lock, *irq_flags); - } - XPC_DEACTIVATE_PARTITION(part, ret); - if (irq_flags != NULL) { - spin_lock_irqsave(&ch->lock, *irq_flags); - } - } - } -} - - -/* - * Make it look like the remote partition, which is associated with the - * specified channel, sent us an IPI. This faked IPI will be handled - * by xpc_dropped_IPI_check(). - */ -#define XPC_NOTIFY_IRQ_SEND_LOCAL(_ch, _ipi_f) \ - xpc_notify_IRQ_send_local(_ch, _ipi_f, #_ipi_f) - -static inline void -xpc_notify_IRQ_send_local(struct xpc_channel *ch, u8 ipi_flag, - char *ipi_flag_string) -{ - struct xpc_partition *part = &xpc_partitions[ch->partid]; - - - FETCHOP_STORE_OP(TO_AMO((u64) &part->local_IPI_amo_va->variable), - FETCHOP_OR, ((u64) ipi_flag << (ch->number * 8))); - dev_dbg(xpc_chan, "%s sent local from partid=%d, channel=%d\n", - ipi_flag_string, ch->partid, ch->number); -} - - -/* - * The sending and receiving of IPIs includes the setting of an AMO variable - * to indicate the reason the IPI was sent. The 64-bit variable is divided - * up into eight bytes, ordered from right to left. Byte zero pertains to - * channel 0, byte one to channel 1, and so on. Each byte is described by - * the following IPI flags. - */ - -#define XPC_IPI_CLOSEREQUEST 0x01 -#define XPC_IPI_CLOSEREPLY 0x02 -#define XPC_IPI_OPENREQUEST 0x04 -#define XPC_IPI_OPENREPLY 0x08 -#define XPC_IPI_MSGREQUEST 0x10 - - -/* given an AMO variable and a channel#, get its associated IPI flags */ -#define XPC_GET_IPI_FLAGS(_amo, _c) ((u8) (((_amo) >> ((_c) * 8)) & 0xff)) -#define XPC_SET_IPI_FLAGS(_amo, _c, _f) (_amo) |= ((u64) (_f) << ((_c) * 8)) - -#define XPC_ANY_OPENCLOSE_IPI_FLAGS_SET(_amo) ((_amo) & 0x0f0f0f0f0f0f0f0f) -#define XPC_ANY_MSG_IPI_FLAGS_SET(_amo) ((_amo) & 0x1010101010101010) - - -static inline void -xpc_IPI_send_closerequest(struct xpc_channel *ch, unsigned long *irq_flags) -{ - struct xpc_openclose_args *args = ch->local_openclose_args; - - - args->reason = ch->reason; - - XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_CLOSEREQUEST, irq_flags); -} - -static inline void -xpc_IPI_send_closereply(struct xpc_channel *ch, unsigned long *irq_flags) -{ - XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_CLOSEREPLY, irq_flags); -} - -static inline void -xpc_IPI_send_openrequest(struct xpc_channel *ch, unsigned long *irq_flags) -{ - struct xpc_openclose_args *args = ch->local_openclose_args; - - - args->msg_size = ch->msg_size; - args->local_nentries = ch->local_nentries; - - XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_OPENREQUEST, irq_flags); -} - -static inline void -xpc_IPI_send_openreply(struct xpc_channel *ch, unsigned long *irq_flags) -{ - struct xpc_openclose_args *args = ch->local_openclose_args; - - - args->remote_nentries = ch->remote_nentries; - args->local_nentries = ch->local_nentries; - args->local_msgqueue_pa = __pa(ch->local_msgqueue); - - XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_OPENREPLY, irq_flags); -} - -static inline void -xpc_IPI_send_msgrequest(struct xpc_channel *ch) -{ - XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_MSGREQUEST, NULL); -} - -static inline void -xpc_IPI_send_local_msgrequest(struct xpc_channel *ch) -{ - XPC_NOTIFY_IRQ_SEND_LOCAL(ch, XPC_IPI_MSGREQUEST); -} - - -/* - * Memory for XPC's AMO variables is allocated by the MSPEC driver. These - * pages are located in the lowest granule. The lowest granule uses 4k pages - * for cached references and an alternate TLB handler to never provide a - * cacheable mapping for the entire region. This will prevent speculative - * reading of cached copies of our lines from being issued which will cause - * a PI FSB Protocol error to be generated by the SHUB. For XPC, we need 64 - * AMO variables (based on XP_MAX_PARTITIONS) for message notification and an - * additional 128 AMO variables (based on XP_NASID_MASK_WORDS) for partition - * activation and 2 AMO variables for partition deactivation. - */ -static inline AMO_t * -xpc_IPI_init(int index) -{ - AMO_t *amo = xpc_vars->amos_page + index; - - - (void) xpc_IPI_receive(amo); /* clear AMO variable */ - return amo; -} - - - -static inline enum xpc_retval -xpc_map_bte_errors(bte_result_t error) -{ - switch (error) { - case BTE_SUCCESS: return xpcSuccess; - case BTEFAIL_DIR: return xpcBteDirectoryError; - case BTEFAIL_POISON: return xpcBtePoisonError; - case BTEFAIL_WERR: return xpcBteWriteError; - case BTEFAIL_ACCESS: return xpcBteAccessError; - case BTEFAIL_PWERR: return xpcBtePWriteError; - case BTEFAIL_PRERR: return xpcBtePReadError; - case BTEFAIL_TOUT: return xpcBteTimeOutError; - case BTEFAIL_XTERR: return xpcBteXtalkError; - case BTEFAIL_NOTAVAIL: return xpcBteNotAvailable; - default: return xpcBteUnmappedError; - } -} - - - -static inline void * -xpc_kmalloc_cacheline_aligned(size_t size, gfp_t flags, void **base) -{ - /* see if kmalloc will give us cachline aligned memory by default */ - *base = kmalloc(size, flags); - if (*base == NULL) { - return NULL; - } - if ((u64) *base == L1_CACHE_ALIGN((u64) *base)) { - return *base; - } - kfree(*base); - - /* nope, we'll have to do it ourselves */ - *base = kmalloc(size + L1_CACHE_BYTES, flags); - if (*base == NULL) { - return NULL; - } - return (void *) L1_CACHE_ALIGN((u64) *base); -} - - -/* - * Check to see if there is any channel activity to/from the specified - * partition. - */ -static inline void -xpc_check_for_channel_activity(struct xpc_partition *part) -{ - u64 IPI_amo; - unsigned long irq_flags; - - - IPI_amo = xpc_IPI_receive(part->local_IPI_amo_va); - if (IPI_amo == 0) { - return; - } - - spin_lock_irqsave(&part->IPI_lock, irq_flags); - part->local_IPI_amo |= IPI_amo; - spin_unlock_irqrestore(&part->IPI_lock, irq_flags); - - dev_dbg(xpc_chan, "received IPI from partid=%d, IPI_amo=0x%lx\n", - XPC_PARTID(part), IPI_amo); - - xpc_wakeup_channel_mgr(part); -} - - -#endif /* _IA64_SN_KERNEL_XPC_H */ - diff --git a/arch/ia64/sn/kernel/xpc_channel.c b/arch/ia64/sn/kernel/xpc_channel.c index abf4fc2..0c0a689 100644 --- a/arch/ia64/sn/kernel/xpc_channel.c +++ b/arch/ia64/sn/kernel/xpc_channel.c @@ -3,7 +3,7 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (c) 2004-2005 Silicon Graphics, Inc. All Rights Reserved. + * Copyright (c) 2004-2006 Silicon Graphics, Inc. All Rights Reserved. */ @@ -24,7 +24,7 @@ #include #include #include -#include "xpc.h" +#include /* @@ -779,6 +779,12 @@ xpc_process_disconnect(struct xpc_channe /* both sides are disconnected now */ + if (ch->flags & XPC_C_CONNECTCALLOUT) { + spin_unlock_irqrestore(&ch->lock, *irq_flags); + xpc_disconnect_callout(ch, xpcDisconnected); + spin_lock_irqsave(&ch->lock, *irq_flags); + } + /* it's now safe to free the channel's message queues */ xpc_free_msgqueues(ch); @@ -1645,7 +1651,7 @@ xpc_disconnect_channel(const int line, s void -xpc_disconnecting_callout(struct xpc_channel *ch) +xpc_disconnect_callout(struct xpc_channel *ch, enum xpc_retval reason) { /* * Let the channel's registerer know that the channel is being @@ -1654,15 +1660,13 @@ xpc_disconnecting_callout(struct xpc_cha */ if (ch->func != NULL) { - dev_dbg(xpc_chan, "ch->func() called, reason=xpcDisconnecting," - " partid=%d, channel=%d\n", ch->partid, ch->number); + dev_dbg(xpc_chan, "ch->func() called, reason=%d, partid=%d, " + "channel=%d\n", reason, ch->partid, ch->number); - ch->func(xpcDisconnecting, ch->partid, ch->number, NULL, - ch->key); + ch->func(reason, ch->partid, ch->number, NULL, ch->key); - dev_dbg(xpc_chan, "ch->func() returned, reason=" - "xpcDisconnecting, partid=%d, channel=%d\n", - ch->partid, ch->number); + dev_dbg(xpc_chan, "ch->func() returned, reason=%d, partid=%d, " + "channel=%d\n", reason, ch->partid, ch->number); } } diff --git a/arch/ia64/sn/kernel/xpc_main.c b/arch/ia64/sn/kernel/xpc_main.c index b617236..8930586 100644 --- a/arch/ia64/sn/kernel/xpc_main.c +++ b/arch/ia64/sn/kernel/xpc_main.c @@ -3,7 +3,7 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (c) 2004-2005 Silicon Graphics, Inc. All Rights Reserved. + * Copyright (c) 2004-2006 Silicon Graphics, Inc. All Rights Reserved. */ @@ -59,7 +59,7 @@ #include #include #include -#include "xpc.h" +#include /* define two XPC debug device structures to be used with dev_dbg() et al */ @@ -82,6 +82,9 @@ struct device *xpc_part = &xpc_part_dbg_ struct device *xpc_chan = &xpc_chan_dbg_subname; +static int xpc_kdebug_ignore; + + /* systune related variables for /proc/sys directories */ static int xpc_hb_interval = XPC_HB_DEFAULT_INTERVAL; @@ -162,6 +165,8 @@ static ctl_table xpc_sys_dir[] = { }; static struct ctl_table_header *xpc_sysctl; +/* non-zero if any remote partition disengage request was timed out */ +int xpc_disengage_request_timedout; /* #of IRQs received */ static atomic_t xpc_act_IRQ_rcvd; @@ -773,7 +778,7 @@ xpc_daemonize_kthread(void *args) ch->flags |= XPC_C_DISCONNECTCALLOUT; spin_unlock_irqrestore(&ch->lock, irq_flags); - xpc_disconnecting_callout(ch); + xpc_disconnect_callout(ch, xpcDisconnecting); } else { spin_unlock_irqrestore(&ch->lock, irq_flags); } @@ -921,9 +926,9 @@ static void xpc_do_exit(enum xpc_retval reason) { partid_t partid; - int active_part_count; + int active_part_count, printed_waiting_msg = 0; struct xpc_partition *part; - unsigned long printmsg_time; + unsigned long printmsg_time, disengage_request_timeout = 0; /* a 'rmmod XPC' and a 'reboot' cannot both end up here together */ @@ -953,7 +958,8 @@ xpc_do_exit(enum xpc_retval reason) /* wait for all partitions to become inactive */ - printmsg_time = jiffies; + printmsg_time = jiffies + (XPC_DISENGAGE_PRINTMSG_INTERVAL * HZ); + xpc_disengage_request_timedout = 0; do { active_part_count = 0; @@ -969,20 +975,39 @@ xpc_do_exit(enum xpc_retval reason) active_part_count++; XPC_DEACTIVATE_PARTITION(part, reason); - } - if (active_part_count == 0) { - break; + if (part->disengage_request_timeout > + disengage_request_timeout) { + disengage_request_timeout = + part->disengage_request_timeout; + } } - if (jiffies >= printmsg_time) { - dev_info(xpc_part, "waiting for partitions to " - "deactivate/disengage, active count=%d, remote " - "engaged=0x%lx\n", active_part_count, - xpc_partition_engaged(1UL << partid)); - - printmsg_time = jiffies + + if (xpc_partition_engaged(-1UL)) { + if (time_after(jiffies, printmsg_time)) { + dev_info(xpc_part, "waiting for remote " + "partitions to disengage, timeout in " + "%ld seconds\n", + (disengage_request_timeout - jiffies) + / HZ); + printmsg_time = jiffies + (XPC_DISENGAGE_PRINTMSG_INTERVAL * HZ); + printed_waiting_msg = 1; + } + + } else if (active_part_count > 0) { + if (printed_waiting_msg) { + dev_info(xpc_part, "waiting for local partition" + " to disengage\n"); + printed_waiting_msg = 0; + } + + } else { + if (!xpc_disengage_request_timedout) { + dev_info(xpc_part, "all partitions have " + "disengaged\n"); + } + break; } /* sleep for a 1/3 of a second or so */ @@ -1000,11 +1025,13 @@ xpc_do_exit(enum xpc_retval reason) del_timer_sync(&xpc_hb_timer); DBUG_ON(xpc_vars->heartbeating_to_mask != 0); - /* take ourselves off of the reboot_notifier_list */ - (void) unregister_reboot_notifier(&xpc_reboot_notifier); + if (reason == xpcUnloading) { + /* take ourselves off of the reboot_notifier_list */ + (void) unregister_reboot_notifier(&xpc_reboot_notifier); - /* take ourselves off of the die_notifier list */ - (void) unregister_die_notifier(&xpc_die_notifier); + /* take ourselves off of the die_notifier list */ + (void) unregister_die_notifier(&xpc_die_notifier); + } /* close down protections for IPI operations */ xpc_restrict_IPI_ops(); @@ -1020,7 +1047,35 @@ xpc_do_exit(enum xpc_retval reason) /* - * Called when the system is about to be either restarted or halted. + * This function is called when the system is being rebooted. + */ +static int +xpc_system_reboot(struct notifier_block *nb, unsigned long event, void *unused) +{ + enum xpc_retval reason; + + + switch (event) { + case SYS_RESTART: + reason = xpcSystemReboot; + break; + case SYS_HALT: + reason = xpcSystemHalt; + break; + case SYS_POWER_OFF: + reason = xpcSystemPoweroff; + break; + default: + reason = xpcSystemGoingDown; + } + + xpc_do_exit(reason); + return NOTIFY_DONE; +} + + +/* + * Notify other partitions to disengage from all references to our memory. */ static void xpc_die_disengage(void) @@ -1028,7 +1083,7 @@ xpc_die_disengage(void) struct xpc_partition *part; partid_t partid; unsigned long engaged; - long time, print_time, disengage_request_timeout; + long time, printmsg_time, disengage_request_timeout; /* keep xpc_hb_checker thread from doing anything (just in case) */ @@ -1055,57 +1110,53 @@ xpc_die_disengage(void) } } - print_time = rtc_time(); - disengage_request_timeout = print_time + + time = rtc_time(); + printmsg_time = time + + (XPC_DISENGAGE_PRINTMSG_INTERVAL * sn_rtc_cycles_per_second); + disengage_request_timeout = time + (xpc_disengage_request_timelimit * sn_rtc_cycles_per_second); /* wait for all other partitions to disengage from us */ - while ((engaged = xpc_partition_engaged(-1UL)) && - (time = rtc_time()) < disengage_request_timeout) { + while (1) { + engaged = xpc_partition_engaged(-1UL); + if (!engaged) { + dev_info(xpc_part, "all partitions have disengaged\n"); + break; + } - if (time >= print_time) { + time = rtc_time(); + if (time >= disengage_request_timeout) { + for (partid = 1; partid < XP_MAX_PARTITIONS; partid++) { + if (engaged & (1UL << partid)) { + dev_info(xpc_part, "disengage from " + "remote partition %d timed " + "out\n", partid); + } + } + break; + } + + if (time >= printmsg_time) { dev_info(xpc_part, "waiting for remote partitions to " - "disengage, engaged=0x%lx\n", engaged); - print_time = time + (XPC_DISENGAGE_PRINTMSG_INTERVAL * + "disengage, timeout in %ld seconds\n", + (disengage_request_timeout - time) / + sn_rtc_cycles_per_second); + printmsg_time = time + + (XPC_DISENGAGE_PRINTMSG_INTERVAL * sn_rtc_cycles_per_second); } } - dev_info(xpc_part, "finished waiting for remote partitions to " - "disengage, engaged=0x%lx\n", engaged); -} - - -/* - * This function is called when the system is being rebooted. - */ -static int -xpc_system_reboot(struct notifier_block *nb, unsigned long event, void *unused) -{ - enum xpc_retval reason; - - - switch (event) { - case SYS_RESTART: - reason = xpcSystemReboot; - break; - case SYS_HALT: - reason = xpcSystemHalt; - break; - case SYS_POWER_OFF: - reason = xpcSystemPoweroff; - break; - default: - reason = xpcSystemGoingDown; - } - - xpc_do_exit(reason); - return NOTIFY_DONE; } /* - * This function is called when the system is being rebooted. + * This function is called when the system is being restarted or halted due + * to some sort of system failure. If this is the case we need to notify the + * other partitions to disengage from all references to our memory. + * This function can also be called when our heartbeater could be offlined + * for a time. In this case we need to notify other partitions to not worry + * about the lack of a heartbeat. */ static int xpc_system_die(struct notifier_block *nb, unsigned long event, void *unused) @@ -1115,11 +1166,25 @@ xpc_system_die(struct notifier_block *nb case DIE_MACHINE_HALT: xpc_die_disengage(); break; + + case DIE_KDEBUG_ENTER: + /* Should lack of heartbeat be ignored by other partitions? */ + if (!xpc_kdebug_ignore) { + break; + } + /* fall through */ case DIE_MCA_MONARCH_ENTER: case DIE_INIT_MONARCH_ENTER: xpc_vars->heartbeat++; xpc_vars->heartbeat_offline = 1; break; + + case DIE_KDEBUG_LEAVE: + /* Is lack of heartbeat being ignored by other partitions? */ + if (!xpc_kdebug_ignore) { + break; + } + /* fall through */ case DIE_MCA_MONARCH_LEAVE: case DIE_INIT_MONARCH_LEAVE: xpc_vars->heartbeat++; @@ -1344,3 +1409,7 @@ module_param(xpc_disengage_request_timel MODULE_PARM_DESC(xpc_disengage_request_timelimit, "Number of seconds to wait " "for disengage request to complete."); +module_param(xpc_kdebug_ignore, int, 0); +MODULE_PARM_DESC(xpc_kdebug_ignore, "Should lack of heartbeat be ignored by " + "other partitions when dropping into kdebug."); + diff --git a/arch/ia64/sn/kernel/xpc_partition.c b/arch/ia64/sn/kernel/xpc_partition.c index cdd6431..88a730e 100644 --- a/arch/ia64/sn/kernel/xpc_partition.c +++ b/arch/ia64/sn/kernel/xpc_partition.c @@ -3,7 +3,7 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (c) 2004-2005 Silicon Graphics, Inc. All Rights Reserved. + * Copyright (c) 2004-2006 Silicon Graphics, Inc. All Rights Reserved. */ @@ -28,7 +28,7 @@ #include #include #include -#include "xpc.h" +#include /* XPC is exiting flag */ @@ -771,7 +771,8 @@ xpc_identify_act_IRQ_req(int nasid) } } - if (!xpc_partition_disengaged(part)) { + if (part->disengage_request_timeout > 0 && + !xpc_partition_disengaged(part)) { /* still waiting on other side to disengage from us */ return; } @@ -873,6 +874,9 @@ xpc_partition_disengaged(struct xpc_part * request in a timely fashion, so assume it's dead. */ + dev_info(xpc_part, "disengage from remote partition %d " + "timed out\n", partid); + xpc_disengage_request_timedout = 1; xpc_clear_partition_engaged(1UL << partid); disengaged = 1; } diff --git a/arch/ia64/sn/pci/pcibr/pcibr_dma.c b/arch/ia64/sn/pci/pcibr/pcibr_dma.c index 3409347..e68332d 100644 --- a/arch/ia64/sn/pci/pcibr/pcibr_dma.c +++ b/arch/ia64/sn/pci/pcibr/pcibr_dma.c @@ -218,7 +218,9 @@ void sn_dma_flush(uint64_t addr) uint64_t flags; uint64_t itte; struct hubdev_info *hubinfo; - volatile struct sn_flush_device_list *p; + volatile struct sn_flush_device_kernel *p; + volatile struct sn_flush_device_common *common; + struct sn_flush_nasid_entry *flush_nasid_list; if (!sn_ioif_inited) @@ -268,17 +270,17 @@ void sn_dma_flush(uint64_t addr) p = &flush_nasid_list->widget_p[wid_num][0]; /* find a matching BAR */ - for (i = 0; i < DEV_PER_WIDGET; i++) { + for (i = 0; i < DEV_PER_WIDGET; i++,p++) { + common = p->common; for (j = 0; j < PCI_ROM_RESOURCE; j++) { - if (p->sfdl_bar_list[j].start == 0) + if (common->sfdl_bar_list[j].start == 0) break; - if (addr >= p->sfdl_bar_list[j].start - && addr <= p->sfdl_bar_list[j].end) + if (addr >= common->sfdl_bar_list[j].start + && addr <= common->sfdl_bar_list[j].end) break; } - if (j < PCI_ROM_RESOURCE && p->sfdl_bar_list[j].start != 0) + if (j < PCI_ROM_RESOURCE && common->sfdl_bar_list[j].start != 0) break; - p++; } /* if no matching BAR, return without doing anything. */ @@ -304,24 +306,24 @@ void sn_dma_flush(uint64_t addr) if ((1 << XWIDGET_PART_REV_NUM_REV(revnum)) & PV907516) { return; } else { - pcireg_wrb_flush_get(p->sfdl_pcibus_info, - (p->sfdl_slot - 1)); + pcireg_wrb_flush_get(common->sfdl_pcibus_info, + (common->sfdl_slot - 1)); } } else { - spin_lock_irqsave(&((struct sn_flush_device_list *)p)-> - sfdl_flush_lock, flags); - - *p->sfdl_flush_addr = 0; + spin_lock_irqsave((spinlock_t *)&p->sfdl_flush_lock, + flags); + *common->sfdl_flush_addr = 0; /* force an interrupt. */ - *(volatile uint32_t *)(p->sfdl_force_int_addr) = 1; + *(volatile uint32_t *)(common->sfdl_force_int_addr) = 1; /* wait for the interrupt to come back. */ - while (*(p->sfdl_flush_addr) != 0x10f) + while (*(common->sfdl_flush_addr) != 0x10f) cpu_relax(); /* okay, everything is synched up. */ - spin_unlock_irqrestore((spinlock_t *)&p->sfdl_flush_lock, flags); + spin_unlock_irqrestore((spinlock_t *)&p->sfdl_flush_lock, + flags); } return; } diff --git a/arch/ia64/sn/pci/pcibr/pcibr_provider.c b/arch/ia64/sn/pci/pcibr/pcibr_provider.c index 1f500c8..e328e94 100644 --- a/arch/ia64/sn/pci/pcibr/pcibr_provider.c +++ b/arch/ia64/sn/pci/pcibr/pcibr_provider.c @@ -92,7 +92,8 @@ pcibr_bus_fixup(struct pcibus_bussoft *p cnodeid_t near_cnode; struct hubdev_info *hubdev_info; struct pcibus_info *soft; - struct sn_flush_device_list *sn_flush_device_list; + struct sn_flush_device_kernel *sn_flush_device_kernel; + struct sn_flush_device_common *common; if (! IS_PCI_BRIDGE_ASIC(prom_bussoft->bs_asic_type)) { return NULL; @@ -137,20 +138,19 @@ pcibr_bus_fixup(struct pcibus_bussoft *p hubdev_info = (struct hubdev_info *)(NODEPDA(cnode)->pdinfo); if (hubdev_info->hdi_flush_nasid_list.widget_p) { - sn_flush_device_list = hubdev_info->hdi_flush_nasid_list. + sn_flush_device_kernel = hubdev_info->hdi_flush_nasid_list. widget_p[(int)soft->pbi_buscommon.bs_xid]; - if (sn_flush_device_list) { + if (sn_flush_device_kernel) { for (j = 0; j < DEV_PER_WIDGET; - j++, sn_flush_device_list++) { - if (sn_flush_device_list->sfdl_slot == -1) + j++, sn_flush_device_kernel++) { + common = sn_flush_device_kernel->common; + if (common->sfdl_slot == -1) continue; - if ((sn_flush_device_list-> - sfdl_persistent_segment == + if ((common->sfdl_persistent_segment == soft->pbi_buscommon.bs_persist_segment) && - (sn_flush_device_list-> - sfdl_persistent_busnum == + (common->sfdl_persistent_busnum == soft->pbi_buscommon.bs_persist_busnum)) - sn_flush_device_list->sfdl_pcibus_info = + common->sfdl_pcibus_info = soft; } } diff --git a/include/asm-ia64/kprobes.h b/include/asm-ia64/kprobes.h index a74b681..8c0fc22 100644 --- a/include/asm-ia64/kprobes.h +++ b/include/asm-ia64/kprobes.h @@ -68,10 +68,14 @@ struct prev_kprobe { unsigned long status; }; +#define MAX_PARAM_RSE_SIZE (0x60+0x60/0x3f) /* per-cpu kprobe control block */ struct kprobe_ctlblk { unsigned long kprobe_status; struct pt_regs jprobe_saved_regs; + unsigned long jprobes_saved_stacked_regs[MAX_PARAM_RSE_SIZE]; + unsigned long *bsp; + unsigned long cfm; struct prev_kprobe prev_kprobe; }; @@ -118,5 +122,7 @@ extern int kprobe_exceptions_notify(stru static inline void jprobe_return(void) { } +extern void invalidate_stacked_regs(void); +extern void flush_register_stack(void); #endif /* _ASM_KPROBES_H */ diff --git a/include/asm-ia64/mca.h b/include/asm-ia64/mca.h index c7d9c9e..bfbbb8d 100644 --- a/include/asm-ia64/mca.h +++ b/include/asm-ia64/mca.h @@ -131,6 +131,8 @@ struct ia64_mca_cpu { /* Array of physical addresses of each CPU's MCA area. */ extern unsigned long __per_cpu_mca[NR_CPUS]; +extern int cpe_vector; +extern int ia64_cpe_irq; extern void ia64_mca_init(void); extern void ia64_mca_cpu_init(void *); extern void ia64_os_mca_dispatch(void); diff --git a/include/asm-ia64/sn/sn_sal.h b/include/asm-ia64/sn/sn_sal.h index 2a8b0d9..8b9e10e 100644 --- a/include/asm-ia64/sn/sn_sal.h +++ b/include/asm-ia64/sn/sn_sal.h @@ -75,7 +75,8 @@ #define SN_SAL_IOIF_GET_HUBDEV_INFO 0x02000055 #define SN_SAL_IOIF_GET_PCIBUS_INFO 0x02000056 #define SN_SAL_IOIF_GET_PCIDEV_INFO 0x02000057 -#define SN_SAL_IOIF_GET_WIDGET_DMAFLUSH_LIST 0x02000058 +#define SN_SAL_IOIF_GET_WIDGET_DMAFLUSH_LIST 0x02000058 // deprecated +#define SN_SAL_IOIF_GET_DEVICE_DMAFLUSH_LIST 0x0200005a #define SN_SAL_HUB_ERROR_INTERRUPT 0x02000060 #define SN_SAL_BTE_RECOVER 0x02000061 @@ -1100,7 +1101,7 @@ ia64_sn_bte_recovery(nasid_t nasid) struct ia64_sal_retval rv; rv.status = 0; - SAL_CALL_NOLOCK(rv, SN_SAL_BTE_RECOVER, 0, 0, 0, 0, 0, 0, 0); + SAL_CALL_NOLOCK(rv, SN_SAL_BTE_RECOVER, (u64)nasid, 0, 0, 0, 0, 0, 0); if (rv.status == SALRET_NOT_IMPLEMENTED) return 0; return (int) rv.status; diff --git a/include/asm-ia64/sn/xp.h b/include/asm-ia64/sn/xp.h index 49faf8f..203945a 100644 --- a/include/asm-ia64/sn/xp.h +++ b/include/asm-ia64/sn/xp.h @@ -227,7 +227,9 @@ enum xpc_retval { xpcOpenCloseError, /* 50: channel open/close protocol error */ - xpcUnknownReason /* 51: unknown reason -- must be last in list */ + xpcDisconnected, /* 51: channel disconnected (closed) */ + + xpcUnknownReason /* 52: unknown reason -- must be last in list */ }; diff --git a/include/asm-ia64/sn/xpc.h b/include/asm-ia64/sn/xpc.h new file mode 100644 index 0000000..87e9cd5 --- /dev/null +++ b/include/asm-ia64/sn/xpc.h @@ -0,0 +1,1274 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (c) 2004-2006 Silicon Graphics, Inc. All Rights Reserved. + */ + + +/* + * Cross Partition Communication (XPC) structures and macros. + */ + +#ifndef _ASM_IA64_SN_XPC_H +#define _ASM_IA64_SN_XPC_H + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +/* + * XPC Version numbers consist of a major and minor number. XPC can always + * talk to versions with same major #, and never talk to versions with a + * different major #. + */ +#define _XPC_VERSION(_maj, _min) (((_maj) << 4) | ((_min) & 0xf)) +#define XPC_VERSION_MAJOR(_v) ((_v) >> 4) +#define XPC_VERSION_MINOR(_v) ((_v) & 0xf) + + +/* + * The next macros define word or bit representations for given + * C-brick nasid in either the SAL provided bit array representing + * nasids in the partition/machine or the AMO_t array used for + * inter-partition initiation communications. + * + * For SN2 machines, C-Bricks are alway even numbered NASIDs. As + * such, some space will be saved by insisting that nasid information + * passed from SAL always be packed for C-Bricks and the + * cross-partition interrupts use the same packing scheme. + */ +#define XPC_NASID_W_INDEX(_n) (((_n) / 64) / 2) +#define XPC_NASID_B_INDEX(_n) (((_n) / 2) & (64 - 1)) +#define XPC_NASID_IN_ARRAY(_n, _p) ((_p)[XPC_NASID_W_INDEX(_n)] & \ + (1UL << XPC_NASID_B_INDEX(_n))) +#define XPC_NASID_FROM_W_B(_w, _b) (((_w) * 64 + (_b)) * 2) + +#define XPC_HB_DEFAULT_INTERVAL 5 /* incr HB every x secs */ +#define XPC_HB_CHECK_DEFAULT_INTERVAL 20 /* check HB every x secs */ + +/* define the process name of HB checker and the CPU it is pinned to */ +#define XPC_HB_CHECK_THREAD_NAME "xpc_hb" +#define XPC_HB_CHECK_CPU 0 + +/* define the process name of the discovery thread */ +#define XPC_DISCOVERY_THREAD_NAME "xpc_discovery" + + +/* + * the reserved page + * + * SAL reserves one page of memory per partition for XPC. Though a full page + * in length (16384 bytes), its starting address is not page aligned, but it + * is cacheline aligned. The reserved page consists of the following: + * + * reserved page header + * + * The first cacheline of the reserved page contains the header + * (struct xpc_rsvd_page). Before SAL initialization has completed, + * SAL has set up the following fields of the reserved page header: + * SAL_signature, SAL_version, partid, and nasids_size. The other + * fields are set up by XPC. (xpc_rsvd_page points to the local + * partition's reserved page.) + * + * part_nasids mask + * mach_nasids mask + * + * SAL also sets up two bitmaps (or masks), one that reflects the actual + * nasids in this partition (part_nasids), and the other that reflects + * the actual nasids in the entire machine (mach_nasids). We're only + * interested in the even numbered nasids (which contain the processors + * and/or memory), so we only need half as many bits to represent the + * nasids. The part_nasids mask is located starting at the first cacheline + * following the reserved page header. The mach_nasids mask follows right + * after the part_nasids mask. The size in bytes of each mask is reflected + * by the reserved page header field 'nasids_size'. (Local partition's + * mask pointers are xpc_part_nasids and xpc_mach_nasids.) + * + * vars + * vars part + * + * Immediately following the mach_nasids mask are the XPC variables + * required by other partitions. First are those that are generic to all + * partitions (vars), followed on the next available cacheline by those + * which are partition specific (vars part). These are setup by XPC. + * (Local partition's vars pointers are xpc_vars and xpc_vars_part.) + * + * Note: Until vars_pa is set, the partition XPC code has not been initialized. + */ +struct xpc_rsvd_page { + u64 SAL_signature; /* SAL: unique signature */ + u64 SAL_version; /* SAL: version */ + u8 partid; /* SAL: partition ID */ + u8 version; + u8 pad1[6]; /* align to next u64 in cacheline */ + volatile u64 vars_pa; + struct timespec stamp; /* time when reserved page was setup by XPC */ + u64 pad2[9]; /* align to last u64 in cacheline */ + u64 nasids_size; /* SAL: size of each nasid mask in bytes */ +}; + +#define XPC_RP_VERSION _XPC_VERSION(1,1) /* version 1.1 of the reserved page */ + +#define XPC_SUPPORTS_RP_STAMP(_version) \ + (_version >= _XPC_VERSION(1,1)) + +/* + * compare stamps - the return value is: + * + * < 0, if stamp1 < stamp2 + * = 0, if stamp1 == stamp2 + * > 0, if stamp1 > stamp2 + */ +static inline int +xpc_compare_stamps(struct timespec *stamp1, struct timespec *stamp2) +{ + int ret; + + + if ((ret = stamp1->tv_sec - stamp2->tv_sec) == 0) { + ret = stamp1->tv_nsec - stamp2->tv_nsec; + } + return ret; +} + + +/* + * Define the structures by which XPC variables can be exported to other + * partitions. (There are two: struct xpc_vars and struct xpc_vars_part) + */ + +/* + * The following structure describes the partition generic variables + * needed by other partitions in order to properly initialize. + * + * struct xpc_vars version number also applies to struct xpc_vars_part. + * Changes to either structure and/or related functionality should be + * reflected by incrementing either the major or minor version numbers + * of struct xpc_vars. + */ +struct xpc_vars { + u8 version; + u64 heartbeat; + u64 heartbeating_to_mask; + u64 heartbeat_offline; /* if 0, heartbeat should be changing */ + int act_nasid; + int act_phys_cpuid; + u64 vars_part_pa; + u64 amos_page_pa; /* paddr of page of AMOs from MSPEC driver */ + AMO_t *amos_page; /* vaddr of page of AMOs from MSPEC driver */ +}; + +#define XPC_V_VERSION _XPC_VERSION(3,1) /* version 3.1 of the cross vars */ + +#define XPC_SUPPORTS_DISENGAGE_REQUEST(_version) \ + (_version >= _XPC_VERSION(3,1)) + + +static inline int +xpc_hb_allowed(partid_t partid, struct xpc_vars *vars) +{ + return ((vars->heartbeating_to_mask & (1UL << partid)) != 0); +} + +static inline void +xpc_allow_hb(partid_t partid, struct xpc_vars *vars) +{ + u64 old_mask, new_mask; + + do { + old_mask = vars->heartbeating_to_mask; + new_mask = (old_mask | (1UL << partid)); + } while (cmpxchg(&vars->heartbeating_to_mask, old_mask, new_mask) != + old_mask); +} + +static inline void +xpc_disallow_hb(partid_t partid, struct xpc_vars *vars) +{ + u64 old_mask, new_mask; + + do { + old_mask = vars->heartbeating_to_mask; + new_mask = (old_mask & ~(1UL << partid)); + } while (cmpxchg(&vars->heartbeating_to_mask, old_mask, new_mask) != + old_mask); +} + + +/* + * The AMOs page consists of a number of AMO variables which are divided into + * four groups, The first two groups are used to identify an IRQ's sender. + * These two groups consist of 64 and 128 AMO variables respectively. The last + * two groups, consisting of just one AMO variable each, are used to identify + * the remote partitions that are currently engaged (from the viewpoint of + * the XPC running on the remote partition). + */ +#define XPC_NOTIFY_IRQ_AMOS 0 +#define XPC_ACTIVATE_IRQ_AMOS (XPC_NOTIFY_IRQ_AMOS + XP_MAX_PARTITIONS) +#define XPC_ENGAGED_PARTITIONS_AMO (XPC_ACTIVATE_IRQ_AMOS + XP_NASID_MASK_WORDS) +#define XPC_DISENGAGE_REQUEST_AMO (XPC_ENGAGED_PARTITIONS_AMO + 1) + + +/* + * The following structure describes the per partition specific variables. + * + * An array of these structures, one per partition, will be defined. As a + * partition becomes active XPC will copy the array entry corresponding to + * itself from that partition. It is desirable that the size of this + * structure evenly divide into a cacheline, such that none of the entries + * in this array crosses a cacheline boundary. As it is now, each entry + * occupies half a cacheline. + */ +struct xpc_vars_part { + volatile u64 magic; + + u64 openclose_args_pa; /* physical address of open and close args */ + u64 GPs_pa; /* physical address of Get/Put values */ + + u64 IPI_amo_pa; /* physical address of IPI AMO_t structure */ + int IPI_nasid; /* nasid of where to send IPIs */ + int IPI_phys_cpuid; /* physical CPU ID of where to send IPIs */ + + u8 nchannels; /* #of defined channels supported */ + + u8 reserved[23]; /* pad to a full 64 bytes */ +}; + +/* + * The vars_part MAGIC numbers play a part in the first contact protocol. + * + * MAGIC1 indicates that the per partition specific variables for a remote + * partition have been initialized by this partition. + * + * MAGIC2 indicates that this partition has pulled the remote partititions + * per partition variables that pertain to this partition. + */ +#define XPC_VP_MAGIC1 0x0053524156435058L /* 'XPCVARS\0'L (little endian) */ +#define XPC_VP_MAGIC2 0x0073726176435058L /* 'XPCvars\0'L (little endian) */ + + +/* the reserved page sizes and offsets */ + +#define XPC_RP_HEADER_SIZE L1_CACHE_ALIGN(sizeof(struct xpc_rsvd_page)) +#define XPC_RP_VARS_SIZE L1_CACHE_ALIGN(sizeof(struct xpc_vars)) + +#define XPC_RP_PART_NASIDS(_rp) (u64 *) ((u8 *) _rp + XPC_RP_HEADER_SIZE) +#define XPC_RP_MACH_NASIDS(_rp) (XPC_RP_PART_NASIDS(_rp) + xp_nasid_mask_words) +#define XPC_RP_VARS(_rp) ((struct xpc_vars *) XPC_RP_MACH_NASIDS(_rp) + xp_nasid_mask_words) +#define XPC_RP_VARS_PART(_rp) (struct xpc_vars_part *) ((u8 *) XPC_RP_VARS(rp) + XPC_RP_VARS_SIZE) + + +/* + * Functions registered by add_timer() or called by kernel_thread() only + * allow for a single 64-bit argument. The following macros can be used to + * pack and unpack two (32-bit, 16-bit or 8-bit) arguments into or out from + * the passed argument. + */ +#define XPC_PACK_ARGS(_arg1, _arg2) \ + ((((u64) _arg1) & 0xffffffff) | \ + ((((u64) _arg2) & 0xffffffff) << 32)) + +#define XPC_UNPACK_ARG1(_args) (((u64) _args) & 0xffffffff) +#define XPC_UNPACK_ARG2(_args) ((((u64) _args) >> 32) & 0xffffffff) + + + +/* + * Define a Get/Put value pair (pointers) used with a message queue. + */ +struct xpc_gp { + volatile s64 get; /* Get value */ + volatile s64 put; /* Put value */ +}; + +#define XPC_GP_SIZE \ + L1_CACHE_ALIGN(sizeof(struct xpc_gp) * XPC_NCHANNELS) + + + +/* + * Define a structure that contains arguments associated with opening and + * closing a channel. + */ +struct xpc_openclose_args { + u16 reason; /* reason why channel is closing */ + u16 msg_size; /* sizeof each message entry */ + u16 remote_nentries; /* #of message entries in remote msg queue */ + u16 local_nentries; /* #of message entries in local msg queue */ + u64 local_msgqueue_pa; /* physical address of local message queue */ +}; + +#define XPC_OPENCLOSE_ARGS_SIZE \ + L1_CACHE_ALIGN(sizeof(struct xpc_openclose_args) * XPC_NCHANNELS) + + + +/* struct xpc_msg flags */ + +#define XPC_M_DONE 0x01 /* msg has been received/consumed */ +#define XPC_M_READY 0x02 /* msg is ready to be sent */ +#define XPC_M_INTERRUPT 0x04 /* send interrupt when msg consumed */ + + +#define XPC_MSG_ADDRESS(_payload) \ + ((struct xpc_msg *)((u8 *)(_payload) - XPC_MSG_PAYLOAD_OFFSET)) + + + +/* + * Defines notify entry. + * + * This is used to notify a message's sender that their message was received + * and consumed by the intended recipient. + */ +struct xpc_notify { + struct semaphore sema; /* notify semaphore */ + volatile u8 type; /* type of notification */ + + /* the following two fields are only used if type == XPC_N_CALL */ + xpc_notify_func func; /* user's notify function */ + void *key; /* pointer to user's key */ +}; + +/* struct xpc_notify type of notification */ + +#define XPC_N_CALL 0x01 /* notify function provided by user */ + + + +/* + * Define the structure that manages all the stuff required by a channel. In + * particular, they are used to manage the messages sent across the channel. + * + * This structure is private to a partition, and is NOT shared across the + * partition boundary. + * + * There is an array of these structures for each remote partition. It is + * allocated at the time a partition becomes active. The array contains one + * of these structures for each potential channel connection to that partition. + * + * Each of these structures manages two message queues (circular buffers). + * They are allocated at the time a channel connection is made. One of + * these message queues (local_msgqueue) holds the locally created messages + * that are destined for the remote partition. The other of these message + * queues (remote_msgqueue) is a locally cached copy of the remote partition's + * own local_msgqueue. + * + * The following is a description of the Get/Put pointers used to manage these + * two message queues. Consider the local_msgqueue to be on one partition + * and the remote_msgqueue to be its cached copy on another partition. A + * description of what each of the lettered areas contains is included. + * + * + * local_msgqueue remote_msgqueue + * + * |/////////| |/////////| + * w_remote_GP.get --> +---------+ |/////////| + * | F | |/////////| + * remote_GP.get --> +---------+ +---------+ <-- local_GP->get + * | | | | + * | | | E | + * | | | | + * | | +---------+ <-- w_local_GP.get + * | B | |/////////| + * | | |////D////| + * | | |/////////| + * | | +---------+ <-- w_remote_GP.put + * | | |////C////| + * local_GP->put --> +---------+ +---------+ <-- remote_GP.put + * | | |/////////| + * | A | |/////////| + * | | |/////////| + * w_local_GP.put --> +---------+ |/////////| + * |/////////| |/////////| + * + * + * ( remote_GP.[get|put] are cached copies of the remote + * partition's local_GP->[get|put], and thus their values can + * lag behind their counterparts on the remote partition. ) + * + * + * A - Messages that have been allocated, but have not yet been sent to the + * remote partition. + * + * B - Messages that have been sent, but have not yet been acknowledged by the + * remote partition as having been received. + * + * C - Area that needs to be prepared for the copying of sent messages, by + * the clearing of the message flags of any previously received messages. + * + * D - Area into which sent messages are to be copied from the remote + * partition's local_msgqueue and then delivered to their intended + * recipients. [ To allow for a multi-message copy, another pointer + * (next_msg_to_pull) has been added to keep track of the next message + * number needing to be copied (pulled). It chases after w_remote_GP.put. + * Any messages lying between w_local_GP.get and next_msg_to_pull have + * been copied and are ready to be delivered. ] + * + * E - Messages that have been copied and delivered, but have not yet been + * acknowledged by the recipient as having been received. + * + * F - Messages that have been acknowledged, but XPC has not yet notified the + * sender that the message was received by its intended recipient. + * This is also an area that needs to be prepared for the allocating of + * new messages, by the clearing of the message flags of the acknowledged + * messages. + */ +struct xpc_channel { + partid_t partid; /* ID of remote partition connected */ + spinlock_t lock; /* lock for updating this structure */ + u32 flags; /* general flags */ + + enum xpc_retval reason; /* reason why channel is disconnect'g */ + int reason_line; /* line# disconnect initiated from */ + + u16 number; /* channel # */ + + u16 msg_size; /* sizeof each msg entry */ + u16 local_nentries; /* #of msg entries in local msg queue */ + u16 remote_nentries; /* #of msg entries in remote msg queue*/ + + void *local_msgqueue_base; /* base address of kmalloc'd space */ + struct xpc_msg *local_msgqueue; /* local message queue */ + void *remote_msgqueue_base; /* base address of kmalloc'd space */ + struct xpc_msg *remote_msgqueue;/* cached copy of remote partition's */ + /* local message queue */ + u64 remote_msgqueue_pa; /* phys addr of remote partition's */ + /* local message queue */ + + atomic_t references; /* #of external references to queues */ + + atomic_t n_on_msg_allocate_wq; /* #on msg allocation wait queue */ + wait_queue_head_t msg_allocate_wq; /* msg allocation wait queue */ + + u8 delayed_IPI_flags; /* IPI flags received, but delayed */ + /* action until channel disconnected */ + + /* queue of msg senders who want to be notified when msg received */ + + atomic_t n_to_notify; /* #of msg senders to notify */ + struct xpc_notify *notify_queue;/* notify queue for messages sent */ + + xpc_channel_func func; /* user's channel function */ + void *key; /* pointer to user's key */ + + struct semaphore msg_to_pull_sema; /* next msg to pull serialization */ + struct semaphore wdisconnect_sema; /* wait for channel disconnect */ + + struct xpc_openclose_args *local_openclose_args; /* args passed on */ + /* opening or closing of channel */ + + /* various flavors of local and remote Get/Put values */ + + struct xpc_gp *local_GP; /* local Get/Put values */ + struct xpc_gp remote_GP; /* remote Get/Put values */ + struct xpc_gp w_local_GP; /* working local Get/Put values */ + struct xpc_gp w_remote_GP; /* working remote Get/Put values */ + s64 next_msg_to_pull; /* Put value of next msg to pull */ + + /* kthread management related fields */ + +// >>> rethink having kthreads_assigned_limit and kthreads_idle_limit; perhaps +// >>> allow the assigned limit be unbounded and let the idle limit be dynamic +// >>> dependent on activity over the last interval of time + atomic_t kthreads_assigned; /* #of kthreads assigned to channel */ + u32 kthreads_assigned_limit; /* limit on #of kthreads assigned */ + atomic_t kthreads_idle; /* #of kthreads idle waiting for work */ + u32 kthreads_idle_limit; /* limit on #of kthreads idle */ + atomic_t kthreads_active; /* #of kthreads actively working */ + // >>> following field is temporary + u32 kthreads_created; /* total #of kthreads created */ + + wait_queue_head_t idle_wq; /* idle kthread wait queue */ + +} ____cacheline_aligned; + + +/* struct xpc_channel flags */ + +#define XPC_C_WASCONNECTED 0x00000001 /* channel was connected */ + +#define XPC_C_ROPENREPLY 0x00000002 /* remote open channel reply */ +#define XPC_C_OPENREPLY 0x00000004 /* local open channel reply */ +#define XPC_C_ROPENREQUEST 0x00000008 /* remote open channel request */ +#define XPC_C_OPENREQUEST 0x00000010 /* local open channel request */ + +#define XPC_C_SETUP 0x00000020 /* channel's msgqueues are alloc'd */ +#define XPC_C_CONNECTCALLOUT 0x00000040 /* channel connected callout made */ +#define XPC_C_CONNECTED 0x00000080 /* local channel is connected */ +#define XPC_C_CONNECTING 0x00000100 /* channel is being connected */ + +#define XPC_C_RCLOSEREPLY 0x00000200 /* remote close channel reply */ +#define XPC_C_CLOSEREPLY 0x00000400 /* local close channel reply */ +#define XPC_C_RCLOSEREQUEST 0x00000800 /* remote close channel request */ +#define XPC_C_CLOSEREQUEST 0x00001000 /* local close channel request */ + +#define XPC_C_DISCONNECTED 0x00002000 /* channel is disconnected */ +#define XPC_C_DISCONNECTING 0x00004000 /* channel is being disconnected */ +#define XPC_C_DISCONNECTCALLOUT 0x00008000 /* chan disconnected callout made */ +#define XPC_C_WDISCONNECT 0x00010000 /* waiting for channel disconnect */ + + + +/* + * Manages channels on a partition basis. There is one of these structures + * for each partition (a partition will never utilize the structure that + * represents itself). + */ +struct xpc_partition { + + /* XPC HB infrastructure */ + + u8 remote_rp_version; /* version# of partition's rsvd pg */ + struct timespec remote_rp_stamp;/* time when rsvd pg was initialized */ + u64 remote_rp_pa; /* phys addr of partition's rsvd pg */ + u64 remote_vars_pa; /* phys addr of partition's vars */ + u64 remote_vars_part_pa; /* phys addr of partition's vars part */ + u64 last_heartbeat; /* HB at last read */ + u64 remote_amos_page_pa; /* phys addr of partition's amos page */ + int remote_act_nasid; /* active part's act/deact nasid */ + int remote_act_phys_cpuid; /* active part's act/deact phys cpuid */ + u32 act_IRQ_rcvd; /* IRQs since activation */ + spinlock_t act_lock; /* protect updating of act_state */ + u8 act_state; /* from XPC HB viewpoint */ + u8 remote_vars_version; /* version# of partition's vars */ + enum xpc_retval reason; /* reason partition is deactivating */ + int reason_line; /* line# deactivation initiated from */ + int reactivate_nasid; /* nasid in partition to reactivate */ + + unsigned long disengage_request_timeout; /* timeout in jiffies */ + struct timer_list disengage_request_timer; + + + /* XPC infrastructure referencing and teardown control */ + + volatile u8 setup_state; /* infrastructure setup state */ + wait_queue_head_t teardown_wq; /* kthread waiting to teardown infra */ + atomic_t references; /* #of references to infrastructure */ + + + /* + * NONE OF THE PRECEDING FIELDS OF THIS STRUCTURE WILL BE CLEARED WHEN + * XPC SETS UP THE NECESSARY INFRASTRUCTURE TO SUPPORT CROSS PARTITION + * COMMUNICATION. ALL OF THE FOLLOWING FIELDS WILL BE CLEARED. (THE + * 'nchannels' FIELD MUST BE THE FIRST OF THE FIELDS TO BE CLEARED.) + */ + + + u8 nchannels; /* #of defined channels supported */ + atomic_t nchannels_active; /* #of channels that are not DISCONNECTED */ + atomic_t nchannels_engaged;/* #of channels engaged with remote part */ + struct xpc_channel *channels;/* array of channel structures */ + + void *local_GPs_base; /* base address of kmalloc'd space */ + struct xpc_gp *local_GPs; /* local Get/Put values */ + void *remote_GPs_base; /* base address of kmalloc'd space */ + struct xpc_gp *remote_GPs;/* copy of remote partition's local Get/Put */ + /* values */ + u64 remote_GPs_pa; /* phys address of remote partition's local */ + /* Get/Put values */ + + + /* fields used to pass args when opening or closing a channel */ + + void *local_openclose_args_base; /* base address of kmalloc'd space */ + struct xpc_openclose_args *local_openclose_args; /* local's args */ + void *remote_openclose_args_base; /* base address of kmalloc'd space */ + struct xpc_openclose_args *remote_openclose_args; /* copy of remote's */ + /* args */ + u64 remote_openclose_args_pa; /* phys addr of remote's args */ + + + /* IPI sending, receiving and handling related fields */ + + int remote_IPI_nasid; /* nasid of where to send IPIs */ + int remote_IPI_phys_cpuid; /* phys CPU ID of where to send IPIs */ + AMO_t *remote_IPI_amo_va; /* address of remote IPI AMO_t structure */ + + AMO_t *local_IPI_amo_va; /* address of IPI AMO_t structure */ + u64 local_IPI_amo; /* IPI amo flags yet to be handled */ + char IPI_owner[8]; /* IPI owner's name */ + struct timer_list dropped_IPI_timer; /* dropped IPI timer */ + + spinlock_t IPI_lock; /* IPI handler lock */ + + + /* channel manager related fields */ + + atomic_t channel_mgr_requests; /* #of requests to activate chan mgr */ + wait_queue_head_t channel_mgr_wq; /* channel mgr's wait queue */ + +} ____cacheline_aligned; + + +/* struct xpc_partition act_state values (for XPC HB) */ + +#define XPC_P_INACTIVE 0x00 /* partition is not active */ +#define XPC_P_ACTIVATION_REQ 0x01 /* created thread to activate */ +#define XPC_P_ACTIVATING 0x02 /* activation thread started */ +#define XPC_P_ACTIVE 0x03 /* xpc_partition_up() was called */ +#define XPC_P_DEACTIVATING 0x04 /* partition deactivation initiated */ + + +#define XPC_DEACTIVATE_PARTITION(_p, _reason) \ + xpc_deactivate_partition(__LINE__, (_p), (_reason)) + + +/* struct xpc_partition setup_state values */ + +#define XPC_P_UNSET 0x00 /* infrastructure was never setup */ +#define XPC_P_SETUP 0x01 /* infrastructure is setup */ +#define XPC_P_WTEARDOWN 0x02 /* waiting to teardown infrastructure */ +#define XPC_P_TORNDOWN 0x03 /* infrastructure is torndown */ + + + +/* + * struct xpc_partition IPI_timer #of seconds to wait before checking for + * dropped IPIs. These occur whenever an IPI amo write doesn't complete until + * after the IPI was received. + */ +#define XPC_P_DROPPED_IPI_WAIT (0.25 * HZ) + + +/* number of seconds to wait for other partitions to disengage */ +#define XPC_DISENGAGE_REQUEST_DEFAULT_TIMELIMIT 90 + +/* interval in seconds to print 'waiting disengagement' messages */ +#define XPC_DISENGAGE_PRINTMSG_INTERVAL 10 + + +#define XPC_PARTID(_p) ((partid_t) ((_p) - &xpc_partitions[0])) + + + +/* found in xp_main.c */ +extern struct xpc_registration xpc_registrations[]; + + +/* found in xpc_main.c */ +extern struct device *xpc_part; +extern struct device *xpc_chan; +extern int xpc_disengage_request_timelimit; +extern int xpc_disengage_request_timedout; +extern irqreturn_t xpc_notify_IRQ_handler(int, void *, struct pt_regs *); +extern void xpc_dropped_IPI_check(struct xpc_partition *); +extern void xpc_activate_partition(struct xpc_partition *); +extern void xpc_activate_kthreads(struct xpc_channel *, int); +extern void xpc_create_kthreads(struct xpc_channel *, int); +extern void xpc_disconnect_wait(int); + + +/* found in xpc_partition.c */ +extern int xpc_exiting; +extern struct xpc_vars *xpc_vars; +extern struct xpc_rsvd_page *xpc_rsvd_page; +extern struct xpc_vars_part *xpc_vars_part; +extern struct xpc_partition xpc_partitions[XP_MAX_PARTITIONS + 1]; +extern char xpc_remote_copy_buffer[]; +extern struct xpc_rsvd_page *xpc_rsvd_page_init(void); +extern void xpc_allow_IPI_ops(void); +extern void xpc_restrict_IPI_ops(void); +extern int xpc_identify_act_IRQ_sender(void); +extern int xpc_partition_disengaged(struct xpc_partition *); +extern enum xpc_retval xpc_mark_partition_active(struct xpc_partition *); +extern void xpc_mark_partition_inactive(struct xpc_partition *); +extern void xpc_discovery(void); +extern void xpc_check_remote_hb(void); +extern void xpc_deactivate_partition(const int, struct xpc_partition *, + enum xpc_retval); +extern enum xpc_retval xpc_initiate_partid_to_nasids(partid_t, void *); + + +/* found in xpc_channel.c */ +extern void xpc_initiate_connect(int); +extern void xpc_initiate_disconnect(int); +extern enum xpc_retval xpc_initiate_allocate(partid_t, int, u32, void **); +extern enum xpc_retval xpc_initiate_send(partid_t, int, void *); +extern enum xpc_retval xpc_initiate_send_notify(partid_t, int, void *, + xpc_notify_func, void *); +extern void xpc_initiate_received(partid_t, int, void *); +extern enum xpc_retval xpc_setup_infrastructure(struct xpc_partition *); +extern enum xpc_retval xpc_pull_remote_vars_part(struct xpc_partition *); +extern void xpc_process_channel_activity(struct xpc_partition *); +extern void xpc_connected_callout(struct xpc_channel *); +extern void xpc_deliver_msg(struct xpc_channel *); +extern void xpc_disconnect_channel(const int, struct xpc_channel *, + enum xpc_retval, unsigned long *); +extern void xpc_disconnect_callout(struct xpc_channel *, enum xpc_retval); +extern void xpc_partition_going_down(struct xpc_partition *, enum xpc_retval); +extern void xpc_teardown_infrastructure(struct xpc_partition *); + + + +static inline void +xpc_wakeup_channel_mgr(struct xpc_partition *part) +{ + if (atomic_inc_return(&part->channel_mgr_requests) == 1) { + wake_up(&part->channel_mgr_wq); + } +} + + + +/* + * These next two inlines are used to keep us from tearing down a channel's + * msg queues while a thread may be referencing them. + */ +static inline void +xpc_msgqueue_ref(struct xpc_channel *ch) +{ + atomic_inc(&ch->references); +} + +static inline void +xpc_msgqueue_deref(struct xpc_channel *ch) +{ + s32 refs = atomic_dec_return(&ch->references); + + DBUG_ON(refs < 0); + if (refs == 0) { + xpc_wakeup_channel_mgr(&xpc_partitions[ch->partid]); + } +} + + + +#define XPC_DISCONNECT_CHANNEL(_ch, _reason, _irqflgs) \ + xpc_disconnect_channel(__LINE__, _ch, _reason, _irqflgs) + + +/* + * These two inlines are used to keep us from tearing down a partition's + * setup infrastructure while a thread may be referencing it. + */ +static inline void +xpc_part_deref(struct xpc_partition *part) +{ + s32 refs = atomic_dec_return(&part->references); + + + DBUG_ON(refs < 0); + if (refs == 0 && part->setup_state == XPC_P_WTEARDOWN) { + wake_up(&part->teardown_wq); + } +} + +static inline int +xpc_part_ref(struct xpc_partition *part) +{ + int setup; + + + atomic_inc(&part->references); + setup = (part->setup_state == XPC_P_SETUP); + if (!setup) { + xpc_part_deref(part); + } + return setup; +} + + + +/* + * The following macro is to be used for the setting of the reason and + * reason_line fields in both the struct xpc_channel and struct xpc_partition + * structures. + */ +#define XPC_SET_REASON(_p, _reason, _line) \ + { \ + (_p)->reason = _reason; \ + (_p)->reason_line = _line; \ + } + + + +/* + * This next set of inlines are used to keep track of when a partition is + * potentially engaged in accessing memory belonging to another partition. + */ + +static inline void +xpc_mark_partition_engaged(struct xpc_partition *part) +{ + unsigned long irq_flags; + AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + + (XPC_ENGAGED_PARTITIONS_AMO * sizeof(AMO_t))); + + + local_irq_save(irq_flags); + + /* set bit corresponding to our partid in remote partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, + (1UL << sn_partition_id)); + /* + * We must always use the nofault function regardless of whether we + * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we + * didn't, we'd never know that the other partition is down and would + * keep sending IPIs and AMOs to it until the heartbeat times out. + */ + (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> + variable), xp_nofault_PIOR_target)); + + local_irq_restore(irq_flags); +} + +static inline void +xpc_mark_partition_disengaged(struct xpc_partition *part) +{ + unsigned long irq_flags; + AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + + (XPC_ENGAGED_PARTITIONS_AMO * sizeof(AMO_t))); + + + local_irq_save(irq_flags); + + /* clear bit corresponding to our partid in remote partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, + ~(1UL << sn_partition_id)); + /* + * We must always use the nofault function regardless of whether we + * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we + * didn't, we'd never know that the other partition is down and would + * keep sending IPIs and AMOs to it until the heartbeat times out. + */ + (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> + variable), xp_nofault_PIOR_target)); + + local_irq_restore(irq_flags); +} + +static inline void +xpc_request_partition_disengage(struct xpc_partition *part) +{ + unsigned long irq_flags; + AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + + (XPC_DISENGAGE_REQUEST_AMO * sizeof(AMO_t))); + + + local_irq_save(irq_flags); + + /* set bit corresponding to our partid in remote partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, + (1UL << sn_partition_id)); + /* + * We must always use the nofault function regardless of whether we + * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we + * didn't, we'd never know that the other partition is down and would + * keep sending IPIs and AMOs to it until the heartbeat times out. + */ + (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> + variable), xp_nofault_PIOR_target)); + + local_irq_restore(irq_flags); +} + +static inline void +xpc_cancel_partition_disengage_request(struct xpc_partition *part) +{ + unsigned long irq_flags; + AMO_t *amo = (AMO_t *) __va(part->remote_amos_page_pa + + (XPC_DISENGAGE_REQUEST_AMO * sizeof(AMO_t))); + + + local_irq_save(irq_flags); + + /* clear bit corresponding to our partid in remote partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, + ~(1UL << sn_partition_id)); + /* + * We must always use the nofault function regardless of whether we + * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we + * didn't, we'd never know that the other partition is down and would + * keep sending IPIs and AMOs to it until the heartbeat times out. + */ + (void) xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo-> + variable), xp_nofault_PIOR_target)); + + local_irq_restore(irq_flags); +} + +static inline u64 +xpc_partition_engaged(u64 partid_mask) +{ + AMO_t *amo = xpc_vars->amos_page + XPC_ENGAGED_PARTITIONS_AMO; + + + /* return our partition's AMO variable ANDed with partid_mask */ + return (FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_LOAD) & + partid_mask); +} + +static inline u64 +xpc_partition_disengage_requested(u64 partid_mask) +{ + AMO_t *amo = xpc_vars->amos_page + XPC_DISENGAGE_REQUEST_AMO; + + + /* return our partition's AMO variable ANDed with partid_mask */ + return (FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_LOAD) & + partid_mask); +} + +static inline void +xpc_clear_partition_engaged(u64 partid_mask) +{ + AMO_t *amo = xpc_vars->amos_page + XPC_ENGAGED_PARTITIONS_AMO; + + + /* clear bit(s) based on partid_mask in our partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, + ~partid_mask); +} + +static inline void +xpc_clear_partition_disengage_request(u64 partid_mask) +{ + AMO_t *amo = xpc_vars->amos_page + XPC_DISENGAGE_REQUEST_AMO; + + + /* clear bit(s) based on partid_mask in our partition's AMO */ + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_AND, + ~partid_mask); +} + + + +/* + * The following set of macros and inlines are used for the sending and + * receiving of IPIs (also known as IRQs). There are two flavors of IPIs, + * one that is associated with partition activity (SGI_XPC_ACTIVATE) and + * the other that is associated with channel activity (SGI_XPC_NOTIFY). + */ + +static inline u64 +xpc_IPI_receive(AMO_t *amo) +{ + return FETCHOP_LOAD_OP(TO_AMO((u64) &amo->variable), FETCHOP_CLEAR); +} + + +static inline enum xpc_retval +xpc_IPI_send(AMO_t *amo, u64 flag, int nasid, int phys_cpuid, int vector) +{ + int ret = 0; + unsigned long irq_flags; + + + local_irq_save(irq_flags); + + FETCHOP_STORE_OP(TO_AMO((u64) &amo->variable), FETCHOP_OR, flag); + sn_send_IPI_phys(nasid, phys_cpuid, vector, 0); + + /* + * We must always use the nofault function regardless of whether we + * are on a Shub 1.1 system or a Shub 1.2 slice 0xc processor. If we + * didn't, we'd never know that the other partition is down and would + * keep sending IPIs and AMOs to it until the heartbeat times out. + */ + ret = xp_nofault_PIOR((u64 *) GLOBAL_MMR_ADDR(NASID_GET(&amo->variable), + xp_nofault_PIOR_target)); + + local_irq_restore(irq_flags); + + return ((ret == 0) ? xpcSuccess : xpcPioReadError); +} + + +/* + * IPIs associated with SGI_XPC_ACTIVATE IRQ. + */ + +/* + * Flag the appropriate AMO variable and send an IPI to the specified node. + */ +static inline void +xpc_activate_IRQ_send(u64 amos_page_pa, int from_nasid, int to_nasid, + int to_phys_cpuid) +{ + int w_index = XPC_NASID_W_INDEX(from_nasid); + int b_index = XPC_NASID_B_INDEX(from_nasid); + AMO_t *amos = (AMO_t *) __va(amos_page_pa + + (XPC_ACTIVATE_IRQ_AMOS * sizeof(AMO_t))); + + + (void) xpc_IPI_send(&amos[w_index], (1UL << b_index), to_nasid, + to_phys_cpuid, SGI_XPC_ACTIVATE); +} + +static inline void +xpc_IPI_send_activate(struct xpc_vars *vars) +{ + xpc_activate_IRQ_send(vars->amos_page_pa, cnodeid_to_nasid(0), + vars->act_nasid, vars->act_phys_cpuid); +} + +static inline void +xpc_IPI_send_activated(struct xpc_partition *part) +{ + xpc_activate_IRQ_send(part->remote_amos_page_pa, cnodeid_to_nasid(0), + part->remote_act_nasid, part->remote_act_phys_cpuid); +} + +static inline void +xpc_IPI_send_reactivate(struct xpc_partition *part) +{ + xpc_activate_IRQ_send(xpc_vars->amos_page_pa, part->reactivate_nasid, + xpc_vars->act_nasid, xpc_vars->act_phys_cpuid); +} + +static inline void +xpc_IPI_send_disengage(struct xpc_partition *part) +{ + xpc_activate_IRQ_send(part->remote_amos_page_pa, cnodeid_to_nasid(0), + part->remote_act_nasid, part->remote_act_phys_cpuid); +} + + +/* + * IPIs associated with SGI_XPC_NOTIFY IRQ. + */ + +/* + * Send an IPI to the remote partition that is associated with the + * specified channel. + */ +#define XPC_NOTIFY_IRQ_SEND(_ch, _ipi_f, _irq_f) \ + xpc_notify_IRQ_send(_ch, _ipi_f, #_ipi_f, _irq_f) + +static inline void +xpc_notify_IRQ_send(struct xpc_channel *ch, u8 ipi_flag, char *ipi_flag_string, + unsigned long *irq_flags) +{ + struct xpc_partition *part = &xpc_partitions[ch->partid]; + enum xpc_retval ret; + + + if (likely(part->act_state != XPC_P_DEACTIVATING)) { + ret = xpc_IPI_send(part->remote_IPI_amo_va, + (u64) ipi_flag << (ch->number * 8), + part->remote_IPI_nasid, + part->remote_IPI_phys_cpuid, + SGI_XPC_NOTIFY); + dev_dbg(xpc_chan, "%s sent to partid=%d, channel=%d, ret=%d\n", + ipi_flag_string, ch->partid, ch->number, ret); + if (unlikely(ret != xpcSuccess)) { + if (irq_flags != NULL) { + spin_unlock_irqrestore(&ch->lock, *irq_flags); + } + XPC_DEACTIVATE_PARTITION(part, ret); + if (irq_flags != NULL) { + spin_lock_irqsave(&ch->lock, *irq_flags); + } + } + } +} + + +/* + * Make it look like the remote partition, which is associated with the + * specified channel, sent us an IPI. This faked IPI will be handled + * by xpc_dropped_IPI_check(). + */ +#define XPC_NOTIFY_IRQ_SEND_LOCAL(_ch, _ipi_f) \ + xpc_notify_IRQ_send_local(_ch, _ipi_f, #_ipi_f) + +static inline void +xpc_notify_IRQ_send_local(struct xpc_channel *ch, u8 ipi_flag, + char *ipi_flag_string) +{ + struct xpc_partition *part = &xpc_partitions[ch->partid]; + + + FETCHOP_STORE_OP(TO_AMO((u64) &part->local_IPI_amo_va->variable), + FETCHOP_OR, ((u64) ipi_flag << (ch->number * 8))); + dev_dbg(xpc_chan, "%s sent local from partid=%d, channel=%d\n", + ipi_flag_string, ch->partid, ch->number); +} + + +/* + * The sending and receiving of IPIs includes the setting of an AMO variable + * to indicate the reason the IPI was sent. The 64-bit variable is divided + * up into eight bytes, ordered from right to left. Byte zero pertains to + * channel 0, byte one to channel 1, and so on. Each byte is described by + * the following IPI flags. + */ + +#define XPC_IPI_CLOSEREQUEST 0x01 +#define XPC_IPI_CLOSEREPLY 0x02 +#define XPC_IPI_OPENREQUEST 0x04 +#define XPC_IPI_OPENREPLY 0x08 +#define XPC_IPI_MSGREQUEST 0x10 + + +/* given an AMO variable and a channel#, get its associated IPI flags */ +#define XPC_GET_IPI_FLAGS(_amo, _c) ((u8) (((_amo) >> ((_c) * 8)) & 0xff)) +#define XPC_SET_IPI_FLAGS(_amo, _c, _f) (_amo) |= ((u64) (_f) << ((_c) * 8)) + +#define XPC_ANY_OPENCLOSE_IPI_FLAGS_SET(_amo) ((_amo) & 0x0f0f0f0f0f0f0f0f) +#define XPC_ANY_MSG_IPI_FLAGS_SET(_amo) ((_amo) & 0x1010101010101010) + + +static inline void +xpc_IPI_send_closerequest(struct xpc_channel *ch, unsigned long *irq_flags) +{ + struct xpc_openclose_args *args = ch->local_openclose_args; + + + args->reason = ch->reason; + + XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_CLOSEREQUEST, irq_flags); +} + +static inline void +xpc_IPI_send_closereply(struct xpc_channel *ch, unsigned long *irq_flags) +{ + XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_CLOSEREPLY, irq_flags); +} + +static inline void +xpc_IPI_send_openrequest(struct xpc_channel *ch, unsigned long *irq_flags) +{ + struct xpc_openclose_args *args = ch->local_openclose_args; + + + args->msg_size = ch->msg_size; + args->local_nentries = ch->local_nentries; + + XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_OPENREQUEST, irq_flags); +} + +static inline void +xpc_IPI_send_openreply(struct xpc_channel *ch, unsigned long *irq_flags) +{ + struct xpc_openclose_args *args = ch->local_openclose_args; + + + args->remote_nentries = ch->remote_nentries; + args->local_nentries = ch->local_nentries; + args->local_msgqueue_pa = __pa(ch->local_msgqueue); + + XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_OPENREPLY, irq_flags); +} + +static inline void +xpc_IPI_send_msgrequest(struct xpc_channel *ch) +{ + XPC_NOTIFY_IRQ_SEND(ch, XPC_IPI_MSGREQUEST, NULL); +} + +static inline void +xpc_IPI_send_local_msgrequest(struct xpc_channel *ch) +{ + XPC_NOTIFY_IRQ_SEND_LOCAL(ch, XPC_IPI_MSGREQUEST); +} + + +/* + * Memory for XPC's AMO variables is allocated by the MSPEC driver. These + * pages are located in the lowest granule. The lowest granule uses 4k pages + * for cached references and an alternate TLB handler to never provide a + * cacheable mapping for the entire region. This will prevent speculative + * reading of cached copies of our lines from being issued which will cause + * a PI FSB Protocol error to be generated by the SHUB. For XPC, we need 64 + * AMO variables (based on XP_MAX_PARTITIONS) for message notification and an + * additional 128 AMO variables (based on XP_NASID_MASK_WORDS) for partition + * activation and 2 AMO variables for partition deactivation. + */ +static inline AMO_t * +xpc_IPI_init(int index) +{ + AMO_t *amo = xpc_vars->amos_page + index; + + + (void) xpc_IPI_receive(amo); /* clear AMO variable */ + return amo; +} + + + +static inline enum xpc_retval +xpc_map_bte_errors(bte_result_t error) +{ + switch (error) { + case BTE_SUCCESS: return xpcSuccess; + case BTEFAIL_DIR: return xpcBteDirectoryError; + case BTEFAIL_POISON: return xpcBtePoisonError; + case BTEFAIL_WERR: return xpcBteWriteError; + case BTEFAIL_ACCESS: return xpcBteAccessError; + case BTEFAIL_PWERR: return xpcBtePWriteError; + case BTEFAIL_PRERR: return xpcBtePReadError; + case BTEFAIL_TOUT: return xpcBteTimeOutError; + case BTEFAIL_XTERR: return xpcBteXtalkError; + case BTEFAIL_NOTAVAIL: return xpcBteNotAvailable; + default: return xpcBteUnmappedError; + } +} + + + +static inline void * +xpc_kmalloc_cacheline_aligned(size_t size, gfp_t flags, void **base) +{ + /* see if kmalloc will give us cachline aligned memory by default */ + *base = kmalloc(size, flags); + if (*base == NULL) { + return NULL; + } + if ((u64) *base == L1_CACHE_ALIGN((u64) *base)) { + return *base; + } + kfree(*base); + + /* nope, we'll have to do it ourselves */ + *base = kmalloc(size + L1_CACHE_BYTES, flags); + if (*base == NULL) { + return NULL; + } + return (void *) L1_CACHE_ALIGN((u64) *base); +} + + +/* + * Check to see if there is any channel activity to/from the specified + * partition. + */ +static inline void +xpc_check_for_channel_activity(struct xpc_partition *part) +{ + u64 IPI_amo; + unsigned long irq_flags; + + + IPI_amo = xpc_IPI_receive(part->local_IPI_amo_va); + if (IPI_amo == 0) { + return; + } + + spin_lock_irqsave(&part->IPI_lock, irq_flags); + part->local_IPI_amo |= IPI_amo; + spin_unlock_irqrestore(&part->IPI_lock, irq_flags); + + dev_dbg(xpc_chan, "received IPI from partid=%d, IPI_amo=0x%lx\n", + XPC_PARTID(part), IPI_amo); + + xpc_wakeup_channel_mgr(part); +} + + +#endif /* _ASM_IA64_SN_XPC_H */ + diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h index 653bb7f..1d6518f 100644 --- a/include/asm-ia64/thread_info.h +++ b/include/asm-ia64/thread_info.h @@ -93,6 +93,7 @@ struct thread_info { #define TIF_POLLING_NRFLAG 16 /* true if poll_idle() is polling TIF_NEED_RESCHED */ #define TIF_MEMDIE 17 #define TIF_MCA_INIT 18 /* this task is processing MCA or INIT */ +#define TIF_DB_DISABLED 19 /* debug trap disabled for fsyscall */ #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) @@ -100,9 +101,10 @@ struct thread_info { #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) -#define _TIF_SIGDELAYED (1 << TIF_SIGDELAYED) +#define _TIF_SIGDELAYED (1 << TIF_SIGDELAYED) #define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG) #define _TIF_MCA_INIT (1 << TIF_MCA_INIT) +#define _TIF_DB_DISABLED (1 << TIF_DB_DISABLED) /* "work to do on user-return" bits */ #define TIF_ALLWORK_MASK (_TIF_NOTIFY_RESUME|_TIF_SIGPENDING|_TIF_NEED_RESCHED|_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SIGDELAYED)