GIT 27a0b4e8e617190c35fa527b01efe94f38b24ae0 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6.17.git commit 27a0b4e8e617190c35fa527b01efe94f38b24ae0 Author: David S. Miller Date: Tue Feb 7 16:09:12 2006 -0800 [SPARC64]: Niagara copy/clear page. Happily we have no D-cache aliasing issues on these chips, so the implementation is very straightforward. Add a stub in bootup which will be where the patching calls will be made for niagara/sun4v/hypervisor. Signed-off-by: David S. Miller commit 9fc937704be7e0b1d28cdb142de21d4045d69598 Author: David S. Miller Date: Tue Feb 7 00:00:16 2006 -0800 [SPARC64]: Rename gl_{1,2}insn_patch --> sun4v_{1,2}insn_patch Signed-off-by: David S. Miller commit 1a34f3feb5c8da814064044d690affb6e713a29f Author: David S. Miller Date: Mon Feb 6 23:44:37 2006 -0800 [SPARC64]: Initial sun4v TLB miss handling infrastructure. Things are a little tricky because, unlike sun4u, we have to: 1) do a hypervisor trap to do the TLB load. 2) do the TSB lookup calculations by hand Signed-off-by: David S. Miller commit c7fc12136d4d779006379f7c44cc1ae7525aad76 Author: David S. Miller Date: Mon Feb 6 15:52:05 2006 -0800 [SPARC64]: Add missing memory barriers to instruction patching functions. V9 requires a write memory barrier before the instruction flush. Signed-off-by: David S. Miller commit 5286fb1916be91861c45927419fab5291e8bcc45 Author: David S. Miller Date: Sun Feb 5 22:27:28 2006 -0800 [SPARC64]: Sanitize %pstate writes for sun4v. If we're just switching between different alternate global sets, nop it out on sun4v. Also, get rid of all of the alternate global save/restore in the OBP CIF trampoline code. Signed-off-by: David S. Miller commit f25357061d830e2e722337c41970cdc72b35aad7 Author: David S. Miller Date: Sun Feb 5 21:59:03 2006 -0800 [SPARC64]: Kill all %pstate changes in context switch code. They are totally unnecessary because: 1) Interrupts are already disabled when switch_to() runs. 2) We don't use hard-coded alternate globals any longer. This found a case in rtrap, which still assumed alternate global %g6 was current_thread_info(), and that is fixed by this changeset as well. Signed-off-by: David S. Miller commit 554962377526844c7b6f43943cf3e83b130927cf Author: David S. Miller Date: Sun Feb 5 21:29:28 2006 -0800 [SPARC64]: Add initial code to twiddle %gl on trap entry/exit. Instead of setting/clearing PSTATE_AG we have to change the %gl register value on sun4v. Signed-off-by: David S. Miller commit b5e917e730d3f754f58573a1763bfcdaf7d759a5 Author: David S. Miller Date: Sun Feb 5 20:47:26 2006 -0800 [SPARC64]: Fill dead cycles on trap entry with real work. As we save trap state onto the stack, the store buffer fills up mid-way through and we stall for several cycles as the store buffer trickles out to the L2 cache. Meanwhile we can do some privileged register reads and other calculations, essentially for free. Signed-off-by: David S. Miller commit f4965cd4b60ac65f257d23d8a04d0679b4761d3c Author: David S. Miller Date: Sat Feb 4 23:59:38 2006 -0800 [SPARC64]: Add define for "GL" field of sun4v %tstate register. Signed-off-by: David S. Miller commit 937fc54baa26e41a28a301b1e7486bef81b6739f Author: David S. Miller Date: Sat Feb 4 15:40:53 2006 -0800 [SPARC64]: Add sun4v case to __GET_CPUID() patch tables. Signed-off-by: David S. Miller commit 20c3bb78663dcce0d83848873df99c9337d2417d Author: David S. Miller Date: Sat Feb 4 03:12:14 2006 -0800 [SPARC64]: Sun4v interrupt queue register definitions. Signed-off-by: David S. Miller commit 3526291751bb029805abf2137ed127cf38190269 Author: David S. Miller Date: Sat Feb 4 03:12:02 2006 -0800 [SPARC64]: Sun4v scratchpad register layout. Signed-off-by: David S. Miller commit c33e9e1023229ffe164020720849fb1a55f187da Author: David S. Miller Date: Sat Feb 4 03:11:50 2006 -0800 [SPARC64]: Sun4v specific ASI defines. Signed-off-by: David S. Miller commit 0510a9884b1da7b8132c4f9d101382e6e3a79adc Author: David S. Miller Date: Sat Feb 4 03:11:34 2006 -0800 [SPARC64]: Niagara optimized memcpy() and copy_{to,from}_user(). Signed-off-by: David S. Miller commit 8dbb1e0be13363079f98264a0a5a0cdb8bd81e7a Author: David S. Miller Date: Sat Feb 4 03:11:17 2006 -0800 [SPARC64]: Add Niagara init-store twin-load ASI defines. Signed-off-by: David S. Miller commit a7cda0fcb567f5c622fdb2bfcffa584e9c02449d Author: David S. Miller Date: Sat Feb 4 03:10:53 2006 -0800 [SPARC64]: Add some hypervisor tlb_type checks. And more consistently check cheetah{,_plus} instead of assuming anything not spitfire is cheetah{,_plus}. Signed-off-by: David S. Miller commit 7ecfbdafb40b31d44115320393f6f3a158aed181 Author: David S. Miller Date: Sat Feb 4 03:09:03 2006 -0800 [SPARC64]: Add 'hypervisor' to ultra_tlb_type enumeration. Signed-off-by: David S. Miller commit a9c4977579476e062ad7c3e58c1bf6b4cb2c5e52 Author: David S. Miller Date: Sat Feb 4 03:08:37 2006 -0800 [SPARC64]: SUN4V hypervisor TLB flush support code. Signed-off-by: David S. Miller commit ed982a193fa5d96a60ae5e405395f4a53338ef1c Author: David S. Miller Date: Sat Feb 4 03:01:45 2006 -0800 [SPARC64]: SUN4V hypervisor interface defines. Signed-off-by: David S. Miller commit 9a8e6191499a090d38df9154484908023ee9fd24 Author: David S. Miller Date: Sat Feb 4 00:10:01 2006 -0800 [SPARC64]: Refine register window trap handling. When saving and restoing trap state, do the window spill/fill handling inline so that we never trap deeper than 2 trap levels. This is important for chips like Niagara. The window fixup code is massively simplified, and many more improvements are now possible. Signed-off-by: David S. Miller commit d314ff8ec8a8f96b922d078e2c357e1d5bd05ac5 Author: David S. Miller Date: Thu Feb 2 21:55:10 2006 -0800 [SPARC64]: Add explicit register args to trap state loading macros. This, as well as making the code cleaner, allows a simplification in the TSB miss handling path. Signed-off-by: David S. Miller commit 5f94bf44581552cc48cf28d689342c74de798cf8 Author: David S. Miller Date: Thu Feb 2 21:04:25 2006 -0800 [SPARC64]: Refine code sequences to get the cpu id. On uniprocessor, it's always zero for optimize that. On SMP, the jmpl to the stub kills the return address stack in the cpu branch prediction logic, so expand the code sequence inline and use a code patching section to fix things up. This also always better and explicit register selection, which will be taken advantage of in a future changeset. The hard_smp_processor_id() function is big, so do not inline it. Fix up tests for Jalapeno to also test for Serrano chips too. These tests want "jbus Ultra-IIIi" cases to match, so that is what we should test for. Signed-off-by: David S. Miller commit c63c765c83678f743458993b0e9685c84f316cfe Author: David S. Miller Date: Thu Feb 2 16:16:24 2006 -0800 [SPARC64]: Turn off TSB growing for now. There are several tricky races involved with growing the TSB. So just use base-size TSBs for user contexts and we can revisit enabling this later. One part of the SMP problems is that tsb_context_switch() can see partially updated TSB configuration state if tsb_grow() is running in parallel. That's easily solved with a seqlock taken as a writer by tsb_grow() and taken as a reader to capture all the TSB config state in tsb_context_switch(). Then there is flush_tsb_user() running in parallel with a tsb_grow(). In theory we could take the seqlock as a reader there too, and just resample the TSB pointer and reflush but that looks really ugly. Lastly, I believe there is a case with threads that results in a TSB entry lock bit being set spuriously which will cause the next access to that TSB entry to wedge the cpu (since the TSB entry lock bit will never clear). It's either copy_tsb() or some bug elsewhere in the TSB assembly. Signed-off-by: David S. Miller commit 8dbe3c49a33cd2c6a8c98005a2c247b737f48d26 Author: David S. Miller Date: Thu Feb 2 01:20:18 2006 -0800 [SPARC64]: Correctable ECC errors cannot occur at trap level > 0. The are distrupting, which by the sparc v9 definition means they can only occur when interrupts are enabled in the %pstate register. This never occurs in any of the trap handling code running at trap levels > 0. So just mark it as an unexpected trap. This allows us to kill off the cee_stuff member of struct thread_info. Signed-off-by: David S. Miller commit 674e15d734d441ab0711ee84473c86c217d5aea2 Author: David S. Miller Date: Wed Feb 1 15:55:21 2006 -0800 [SPARC64]: Access TSB with physical addresses when possible. This way we don't need to lock the TSB into the TLB. The trick is that every TSB load/store is registered into a special instruction patch section. The default uses virtual addresses, and the patch instructions use physical address load/stores. We can't do this on all chips because only cheetah+ and later have the physical variant of the atomic quad load. Signed-off-by: David S. Miller commit 6f55fb1c283b09d730a8c3ae4d4c8f48ad431e3b Author: David S. Miller Date: Tue Jan 31 23:13:29 2006 -0800 [SPARC64]: Kill out-of-date commentary in asm-sparc64/tsb.h Signed-off-by: David S. Miller commit 60414e6d7a684fb75b0dabbba7d85dfa973bf0f9 Author: David S. Miller Date: Tue Jan 31 18:35:05 2006 -0800 [SPARC64]: Don't clobber alt-global %g4 on window fixups. If we are returning back to kernel mode, %g4 could be live (for example, in the case where we window spill in the etrap code). So do not change it's value if going back to kernel. Signed-off-by: David S. Miller commit fb1b57d9ea6076c5c29b0260cdf1f4fed6297e82 Author: David S. Miller Date: Tue Jan 31 18:34:51 2006 -0800 [SPARC64]: Fix race in LOAD_PER_CPU_BASE() Since we use %g5 itself as a temporary, it can get clobbered if we take an interrupt mid-stream and thus cause end up with the final %g5 value too early as a result of rtrap processing. Set %g5 at the very end, atomically, to avoid this problem. Signed-off-by: David S. Miller commit 5b3e8656ed1bb3076e16a8f307a62be07371db57 Author: David S. Miller Date: Tue Jan 31 18:34:34 2006 -0800 [SPARC64]: Kill swapper_pgd_zero, totally unused. Signed-off-by: David S. Miller commit bd9e10287f506a17b6378b652b67d68f41495785 Author: David S. Miller Date: Tue Jan 31 18:34:21 2006 -0800 [SPARC64]: Fix too early reference to %g6 %g6 is not necessarily set to current_thread_info() at sparc64_realfault_common. So store the fault code and address after we invoke etrap and %g6 is properly set up. Signed-off-by: David S. Miller commit e4aa81144706c9e299e7481d819f845353f675f0 Author: David S. Miller Date: Tue Jan 31 18:34:06 2006 -0800 [SPARC64]: Kill hard-coded %pstate setting in sparc_exit. Just flip the bit off of whatever it's currently set to. PSTATE_IE is guarenteed to be enabled when we get here. Signed-off-by: David S. Miller commit 4e439fbbea83a3fe3b85e33a7d9d9ec26918f410 Author: David S. Miller Date: Tue Jan 31 18:33:49 2006 -0800 [SPARC64]: Increase swapper_tsb size to 32K. Signed-off-by: David S. Miller commit 955eb6e832c242f7a1f4b8d2d6fd42d3abac0be1 Author: David S. Miller Date: Tue Jan 31 18:33:37 2006 -0800 [SPARC64]: Kill sole argument passed to setup_tba(). No longer used, and move extern declaration to a header file. Signed-off-by: David S. Miller commit 6b2c059fdca96c2eb3164d4c9639fe44b1508078 Author: David S. Miller Date: Tue Jan 31 18:33:25 2006 -0800 [SPARC64]: Kill PROM locked TLB entry preservation code. It is totally unnecessary complexity. After we take over the trap table, we handle all PROM tlb misses fully. Signed-off-by: David S. Miller commit a6dc2308316a8df673f747590ffe250f014752a2 Author: David S. Miller Date: Tue Jan 31 18:33:12 2006 -0800 [SPARC64]: Use sparc64_highest_unlocked_tlb_ent in __tsb_context_switch() Instead of ugly hard-coded value. Signed-off-by: David S. Miller commit b1d4ee6f2b75ab50bf64af9ccd24c3ac59c7aec2 Author: David S. Miller Date: Tue Jan 31 18:33:00 2006 -0800 [SPARC64]: Fix bogus flush instruction usage. Some of the trap code was still assuming that alternate global %g6 was hard coded with current_thread_info(). Let's just consistently flush at KERNBASE when we need a pipeline synchronization. That's locked into the TLB and will always work. Signed-off-by: David S. Miller commit 9209cf1129d67e903bf54279ee737132721a37cf Author: David S. Miller Date: Tue Jan 31 18:32:44 2006 -0800 [SPARC64]: Fix incorrect TSB lock bit handling. The TSB_LOCK_BIT define is actually a special value shifted down by 32-bits for the assembler code macros. In C code, this isn't what we want. Signed-off-by: David S. Miller commit 2485ac61a508bd2cc41a2b3bba1ab337e2b12a3b Author: David S. Miller Date: Tue Jan 31 18:32:29 2006 -0800 [SPARC64]: Kill {save,restore}_alternate_globals() No longer needed now that we no longer have hard-coded alternate global register usage. Signed-off-by: David S. Miller commit 35470536463c279a00bf15f3e5223d716582ab66 Author: David S. Miller Date: Tue Jan 31 18:32:04 2006 -0800 [SPARC64]: Preload TSB entries from update_mmu_cache(). Signed-off-by: David S. Miller commit bb595ebaad990e9f88046b7b6966e573843be6dd Author: David S. Miller Date: Tue Jan 31 18:31:38 2006 -0800 [SPARC64]: Dynamically grow TSB in response to RSS growth. As the RSS grows, grow the TSB in order to reduce the likelyhood of hash collisions and thus poor hit rates in the TSB. This definitely needs some serious tuning. Signed-off-by: David S. Miller commit 7ba262d810e72124b1c9a70d0dc860771c91e52c Author: David S. Miller Date: Tue Jan 31 18:31:20 2006 -0800 [SPARC64]: Add infrastructure for dynamic TSB sizing. This also cleans up tsb_context_switch(). The assembler routine is now __tsb_context_switch() and the former is an inline function that picks out the bits from the mm_struct and passes it into the assembler code as arguments. setup_tsb_parms() computes the locked TLB entry to map the TSB. Later when we support using the physical address quad load instructions of Cheetah+ and later, we'll simply use the physical address for the TSB register value and set the map virtual and PTE both to zero. Signed-off-by: David S. Miller commit c4971552a2efe0e6e5519c8960c90821a31fa2b8 Author: David S. Miller Date: Tue Jan 31 18:31:06 2006 -0800 [SPARC64]: TSB refinements. Move {init_new,destroy}_context() out of line. Do not put huge pages into the TSB, only base page size translations. There are some clever things we could do here, but for now let's be correct instead of fancy. Signed-off-by: David S. Miller commit 481f422d4277cd77a376e396337194d1f3cc4b09 Author: David S. Miller Date: Tue Jan 31 18:30:51 2006 -0800 [SPARC64]: Elminate all usage of hard-coded trap globals. UltraSPARC has special sets of global registers which are switched to for certain trap types. There is one set for MMU related traps, one set of Interrupt Vector processing, and another set (called the Alternate globals) for all other trap types. For what seems like forever we've hard coded the values in some of these trap registers. Some examples include: 1) Interrupt Vector global %g6 holds current processors interrupt work struct where received interrupts are managed for IRQ handler dispatch. 2) MMU global %g7 holds the base of the page tables of the currently active address space. 3) Alternate global %g6 held the current_thread_info() value. Such hardcoding has resulted in some serious issues in many areas. There are some code sequences where having another register available would help clean up the implementation. Taking traps such as cross-calls from the OBP firmware requires some trick code sequences wherein we have to save away and restore all of the special sets of global registers when we enter/exit OBP. We were also using the IMMU TSB register on SMP to hold the per-cpu area base address, which doesn't work any longer now that we actually use the TSB facility of the cpu. The implementation is pretty straight forward. One tricky bit is getting the current processor ID as that is different on different cpu variants. We use a stub with a fancy calling convention which we patch at boot time. The calling convention is that the stub is branched to and the (PC - 4) to return to is in register %g1. The cpu number is left in %g6. This stub can be invoked by using the __GET_CPUID macro. We use an array of per-cpu trap state to store the current thread and physical address of the current address space's page tables. The TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this table, it uses __GET_CPUID and also clobbers %g1. TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load the current processor's IRQ software state into %g6. It also uses __GET_CPUID and clobbers %g1. Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the current address space's page tables into %g7, it clobbers %g1 and uses __GET_CPUID. Many refinements are possible, as well as some tuning, with this stuff in place. Signed-off-by: David S. Miller commit 591eabe3efdac4e5a4eb2ed2cd82897d77bb46c3 Author: David S. Miller Date: Tue Jan 31 18:30:27 2006 -0800 [SPARC64]: Kill pgtable quicklists and use SLAB. Taking a nod from the powerpc port. With the per-cpu caching of both the page allocator and SLAB, the pgtable quicklist scheme becomes relatively silly and primitive. Signed-off-by: David S. Miller commit 087524227c692459c3ef0e9bea87c6fd790a8548 Author: David S. Miller Date: Tue Jan 31 18:30:13 2006 -0800 [SPARC64]: No need to D-cache color page tables any longer. Unlike the virtual page tables, the new TSB scheme does not require this ugly hack. Signed-off-by: David S. Miller commit 77005095bbd6750ddec63eff42120a8c35ed3ca3 Author: David S. Miller Date: Tue Jan 31 18:29:18 2006 -0800 [SPARC64]: Move away from virtual page tables, part 1. We now use the TSB hardware assist features of the UltraSPARC MMUs. SMP is currently knowingly broken, we need to find another place to store the per-cpu base pointers. We hid them away in the TSB base register, and that obviously will not work any more :-) Another known broken case is non-8KB base page size. Also noticed that flush_tlb_all() is not referenced anywhere, only the internal __flush_tlb_all() (local cpu only) is used by the sparc64 port, so we can get rid of flush_tlb_all(). The kernel gets it's own 8KB TSB (swapper_tsb) and each address space gets it's own private 8K TSB. Later we can add code to dynamically increase the size of per-process TSB as the RSS grows. An 8KB TSB is good enough for up to about a 4MB RSS, after which the TSB starts to incur many capacity and conflict misses. We even accumulate OBP translations into the kernel TSB. Another area for refinement is large page size support. We could use a secondary address space TSB to handle those. Signed-off-by: David S. Miller --- diff --git a/arch/sparc64/kernel/Makefile b/arch/sparc64/kernel/Makefile index 83d67eb..a482a9f 100644 --- a/arch/sparc64/kernel/Makefile +++ b/arch/sparc64/kernel/Makefile @@ -38,5 +38,5 @@ else CMODEL_CFLAG := -m64 -mcmodel=medlow endif -head.o: head.S ttable.S itlb_base.S dtlb_base.S dtlb_backend.S dtlb_prot.S \ +head.o: head.S ttable.S itlb_miss.S dtlb_miss.S ktlb.S tsb.S \ etrap.S rtrap.S winfixup.S entry.S diff --git a/arch/sparc64/kernel/binfmt_aout32.c b/arch/sparc64/kernel/binfmt_aout32.c index 202a80c..181c8cd 100644 --- a/arch/sparc64/kernel/binfmt_aout32.c +++ b/arch/sparc64/kernel/binfmt_aout32.c @@ -31,6 +31,7 @@ #include #include #include +#include static int load_aout32_binary(struct linux_binprm *, struct pt_regs * regs); static int load_aout32_library(struct file*); @@ -329,15 +330,8 @@ beyond_if: current->mm->start_stack = (unsigned long) create_aout32_tables((char __user *)bprm->p, bprm); - if (!(orig_thr_flags & _TIF_32BIT)) { - unsigned long pgd_cache = get_pgd_cache(current->mm->pgd); + tsb_context_switch(mm); - __asm__ __volatile__("stxa\t%0, [%1] %2\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (pgd_cache), - "r" (TSB_REG), "i" (ASI_DMMU)); - } start_thread32(regs, ex.a_entry, current->mm->start_stack); if (current->ptrace & PT_PTRACED) send_sig(SIGTRAP, current, 0); diff --git a/arch/sparc64/kernel/dtlb_backend.S b/arch/sparc64/kernel/dtlb_backend.S deleted file mode 100644 index acc889a..0000000 --- a/arch/sparc64/kernel/dtlb_backend.S +++ /dev/null @@ -1,170 +0,0 @@ -/* $Id: dtlb_backend.S,v 1.16 2001/10/09 04:02:11 davem Exp $ - * dtlb_backend.S: Back end to DTLB miss replacement strategy. - * This is included directly into the trap table. - * - * Copyright (C) 1996,1998 David S. Miller (davem@redhat.com) - * Copyright (C) 1997,1998 Jakub Jelinek (jj@ultra.linux.cz) - */ - -#include -#include - -#define VALID_SZ_BITS (_PAGE_VALID | _PAGE_SZBITS) - -#define VPTE_BITS (_PAGE_CP | _PAGE_CV | _PAGE_P ) -#define VPTE_SHIFT (PAGE_SHIFT - 3) - -/* Ways we can get here: - * - * 1) Nucleus loads and stores to/from PA-->VA direct mappings at tl>1. - * 2) Nucleus loads and stores to/from user/kernel window save areas. - * 3) VPTE misses from dtlb_base and itlb_base. - * - * We need to extract out the PMD and PGDIR indexes from the - * linear virtual page table access address. The PTE index - * is at the bottom, but we are not concerned with it. Bits - * 0 to 2 are clear since each PTE is 8 bytes in size. Each - * PMD and PGDIR entry are 4 bytes in size. Thus, this - * address looks something like: - * - * |---------------------------------------------------------------| - * | ... | PGDIR index | PMD index | PTE index | | - * |---------------------------------------------------------------| - * 63 F E D C B A 3 2 0 <- bit nr - * - * The variable bits above are defined as: - * A --> 3 + (PAGE_SHIFT - log2(8)) - * --> 3 + (PAGE_SHIFT - 3) - 1 - * (ie. this is "bit 3" + PAGE_SIZE - size of PTE entry in bits - 1) - * B --> A + 1 - * C --> B + (PAGE_SHIFT - log2(4)) - * --> B + (PAGE_SHIFT - 2) - 1 - * (ie. this is "bit B" + PAGE_SIZE - size of PMD entry in bits - 1) - * D --> C + 1 - * E --> D + (PAGE_SHIFT - log2(4)) - * --> D + (PAGE_SHIFT - 2) - 1 - * (ie. this is "bit D" + PAGE_SIZE - size of PGDIR entry in bits - 1) - * F --> E + 1 - * - * (Note how "B" always evalutes to PAGE_SHIFT, all the other constants - * cancel out.) - * - * For 8K PAGE_SIZE (thus, PAGE_SHIFT of 13) the bit numbers are: - * A --> 12 - * B --> 13 - * C --> 23 - * D --> 24 - * E --> 34 - * F --> 35 - * - * For 64K PAGE_SIZE (thus, PAGE_SHIFT of 16) the bit numbers are: - * A --> 15 - * B --> 16 - * C --> 29 - * D --> 30 - * E --> 43 - * F --> 44 - * - * Because bits both above and below each PGDIR and PMD index need to - * be masked out, and the index can be as long as 14 bits (when using a - * 64K PAGE_SIZE, and thus a PAGE_SHIFT of 16), we need 3 instructions - * to extract each index out. - * - * Shifts do not pair very well on UltraSPARC-I, II, IIi, and IIe, so - * we try to avoid using them for the entire operation. We could setup - * a mask anywhere from bit 31 down to bit 10 using the sethi instruction. - * - * We need a mask covering bits B --> C and one covering D --> E. - * For 8K PAGE_SIZE these masks are 0x00ffe000 and 0x7ff000000. - * For 64K PAGE_SIZE these masks are 0x3fff0000 and 0xfffc0000000. - * The second in each set cannot be loaded with a single sethi - * instruction, because the upper bits are past bit 32. We would - * need to use a sethi + a shift. - * - * For the time being, we use 2 shifts and a simple "and" mask. - * We shift left to clear the bits above the index, we shift down - * to clear the bits below the index (sans the log2(4 or 8) bits) - * and a mask to clear the log2(4 or 8) bits. We need therefore - * define 4 shift counts, all of which are relative to PAGE_SHIFT. - * - * Although unsupportable for other reasons, this does mean that - * 512K and 4MB page sizes would be generaally supported by the - * kernel. (ELF binaries would break with > 64K PAGE_SIZE since - * the sections are only aligned that strongly). - * - * The operations performed for extraction are thus: - * - * ((X << FOO_SHIFT_LEFT) >> FOO_SHIFT_RIGHT) & ~0x3 - * - */ - -#define A (3 + (PAGE_SHIFT - 3) - 1) -#define B (A + 1) -#define C (B + (PAGE_SHIFT - 2) - 1) -#define D (C + 1) -#define E (D + (PAGE_SHIFT - 2) - 1) -#define F (E + 1) - -#define PMD_SHIFT_LEFT (64 - D) -#define PMD_SHIFT_RIGHT (64 - (D - B) - 2) -#define PGDIR_SHIFT_LEFT (64 - F) -#define PGDIR_SHIFT_RIGHT (64 - (F - D) - 2) -#define LOW_MASK_BITS 0x3 - -/* TLB1 ** ICACHE line 1: tl1 DTLB and quick VPTE miss */ - ldxa [%g1 + %g1] ASI_DMMU, %g4 ! Get TAG_ACCESS - add %g3, %g3, %g5 ! Compute VPTE base - cmp %g4, %g5 ! VPTE miss? - bgeu,pt %xcc, 1f ! Continue here - andcc %g4, TAG_CONTEXT_BITS, %g5 ! tl0 miss Nucleus test - ba,a,pt %xcc, from_tl1_trap ! Fall to tl0 miss -1: sllx %g6, VPTE_SHIFT, %g4 ! Position TAG_ACCESS - or %g4, %g5, %g4 ! Prepare TAG_ACCESS - -/* TLB1 ** ICACHE line 2: Quick VPTE miss */ - mov TSB_REG, %g1 ! Grab TSB reg - ldxa [%g1] ASI_DMMU, %g5 ! Doing PGD caching? - sllx %g6, PMD_SHIFT_LEFT, %g1 ! Position PMD offset - be,pn %xcc, sparc64_vpte_nucleus ! Is it from Nucleus? - srlx %g1, PMD_SHIFT_RIGHT, %g1 ! Mask PMD offset bits - brnz,pt %g5, sparc64_vpte_continue ! Yep, go like smoke - andn %g1, LOW_MASK_BITS, %g1 ! Final PMD mask - sllx %g6, PGDIR_SHIFT_LEFT, %g5 ! Position PGD offset - -/* TLB1 ** ICACHE line 3: Quick VPTE miss */ - srlx %g5, PGDIR_SHIFT_RIGHT, %g5 ! Mask PGD offset bits - andn %g5, LOW_MASK_BITS, %g5 ! Final PGD mask - lduwa [%g7 + %g5] ASI_PHYS_USE_EC, %g5! Load PGD - brz,pn %g5, vpte_noent ! Valid? -sparc64_kpte_continue: - sllx %g5, 11, %g5 ! Shift into place -sparc64_vpte_continue: - lduwa [%g5 + %g1] ASI_PHYS_USE_EC, %g5! Load PMD - sllx %g5, 11, %g5 ! Shift into place - brz,pn %g5, vpte_noent ! Valid? - -/* TLB1 ** ICACHE line 4: Quick VPTE miss */ - mov (VALID_SZ_BITS >> 61), %g1 ! upper vpte into %g1 - sllx %g1, 61, %g1 ! finish calc - or %g5, VPTE_BITS, %g5 ! Prepare VPTE data - or %g5, %g1, %g5 ! ... - mov TLB_SFSR, %g1 ! Restore %g1 value - stxa %g5, [%g0] ASI_DTLB_DATA_IN ! Load VPTE into TLB - stxa %g4, [%g1 + %g1] ASI_DMMU ! Restore previous TAG_ACCESS - retry ! Load PTE once again - -#undef VALID_SZ_BITS -#undef VPTE_SHIFT -#undef VPTE_BITS -#undef A -#undef B -#undef C -#undef D -#undef E -#undef F -#undef PMD_SHIFT_LEFT -#undef PMD_SHIFT_RIGHT -#undef PGDIR_SHIFT_LEFT -#undef PGDIR_SHIFT_RIGHT -#undef LOW_MASK_BITS - diff --git a/arch/sparc64/kernel/dtlb_base.S b/arch/sparc64/kernel/dtlb_base.S deleted file mode 100644 index 6528786..0000000 --- a/arch/sparc64/kernel/dtlb_base.S +++ /dev/null @@ -1,109 +0,0 @@ -/* $Id: dtlb_base.S,v 1.17 2001/10/11 22:33:52 davem Exp $ - * dtlb_base.S: Front end to DTLB miss replacement strategy. - * This is included directly into the trap table. - * - * Copyright (C) 1996,1998 David S. Miller (davem@redhat.com) - * Copyright (C) 1997,1998 Jakub Jelinek (jj@ultra.linux.cz) - */ - -#include -#include - -/* %g1 TLB_SFSR (%g1 + %g1 == TLB_TAG_ACCESS) - * %g2 (KERN_HIGHBITS | KERN_LOWBITS) - * %g3 VPTE base (0xfffffffe00000000) Spitfire/Blackbird (44-bit VA space) - * (0xffe0000000000000) Cheetah (64-bit VA space) - * %g7 __pa(current->mm->pgd) - * - * The VPTE base value is completely magic, but note that - * few places in the kernel other than these TLB miss - * handlers know anything about the VPTE mechanism or - * how it works (see VPTE_SIZE, TASK_SIZE and PTRS_PER_PGD). - * Consider the 44-bit VADDR Ultra-I/II case as an example: - * - * VA[0 : (1<<43)] produce VPTE index [%g3 : 0] - * VA[0 : -(1<<43)] produce VPTE index [%g3-(1<<(43-PAGE_SHIFT+3)) : %g3] - * - * For Cheetah's 64-bit VADDR space this is: - * - * VA[0 : (1<<63)] produce VPTE index [%g3 : 0] - * VA[0 : -(1<<63)] produce VPTE index [%g3-(1<<(63-PAGE_SHIFT+3)) : %g3] - * - * If you're paying attention you'll notice that this means half of - * the VPTE table is above %g3 and half is below, low VA addresses - * map progressively upwards from %g3, and high VA addresses map - * progressively upwards towards %g3. This trick was needed to make - * the same 8 instruction handler work both for Spitfire/Blackbird's - * peculiar VA space hole configuration and the full 64-bit VA space - * one of Cheetah at the same time. - */ - -/* Ways we can get here: - * - * 1) Nucleus loads and stores to/from PA-->VA direct mappings. - * 2) Nucleus loads and stores to/from vmalloc() areas. - * 3) User loads and stores. - * 4) User space accesses by nucleus at tl0 - */ - -#if PAGE_SHIFT == 13 -/* - * To compute vpte offset, we need to do ((addr >> 13) << 3), - * which can be optimized to (addr >> 10) if bits 10/11/12 can - * be guaranteed to be 0 ... mmu_context.h does guarantee this - * by only using 10 bits in the hwcontext value. - */ -#define CREATE_VPTE_OFFSET1(r1, r2) nop -#define CREATE_VPTE_OFFSET2(r1, r2) \ - srax r1, 10, r2 -#else -#define CREATE_VPTE_OFFSET1(r1, r2) \ - srax r1, PAGE_SHIFT, r2 -#define CREATE_VPTE_OFFSET2(r1, r2) \ - sllx r2, 3, r2 -#endif - -/* DTLB ** ICACHE line 1: Quick user TLB misses */ - mov TLB_SFSR, %g1 - ldxa [%g1 + %g1] ASI_DMMU, %g4 ! Get TAG_ACCESS - andcc %g4, TAG_CONTEXT_BITS, %g0 ! From Nucleus? -from_tl1_trap: - rdpr %tl, %g5 ! For TL==3 test - CREATE_VPTE_OFFSET1(%g4, %g6) ! Create VPTE offset - be,pn %xcc, kvmap ! Yep, special processing - CREATE_VPTE_OFFSET2(%g4, %g6) ! Create VPTE offset - cmp %g5, 4 ! Last trap level? - -/* DTLB ** ICACHE line 2: User finish + quick kernel TLB misses */ - be,pn %xcc, longpath ! Yep, cannot risk VPTE miss - nop ! delay slot - ldxa [%g3 + %g6] ASI_S, %g5 ! Load VPTE -1: brgez,pn %g5, longpath ! Invalid, branch out - nop ! Delay-slot -9: stxa %g5, [%g0] ASI_DTLB_DATA_IN ! Reload TLB - retry ! Trap return - nop - -/* DTLB ** ICACHE line 3: winfixups+real_faults */ -longpath: - rdpr %pstate, %g5 ! Move into alternate globals - wrpr %g5, PSTATE_AG|PSTATE_MG, %pstate - rdpr %tl, %g4 ! See where we came from. - cmp %g4, 1 ! Is etrap/rtrap window fault? - mov TLB_TAG_ACCESS, %g4 ! Prepare for fault processing - ldxa [%g4] ASI_DMMU, %g5 ! Load faulting VA page - be,pt %xcc, sparc64_realfault_common ! Jump to normal fault handling - mov FAULT_CODE_DTLB, %g4 ! It was read from DTLB - -/* DTLB ** ICACHE line 4: Unused... */ - ba,a,pt %xcc, winfix_trampoline ! Call window fixup code - nop - nop - nop - nop - nop - nop - nop - -#undef CREATE_VPTE_OFFSET1 -#undef CREATE_VPTE_OFFSET2 diff --git a/arch/sparc64/kernel/dtlb_miss.S b/arch/sparc64/kernel/dtlb_miss.S new file mode 100644 index 0000000..2ef6f6e --- /dev/null +++ b/arch/sparc64/kernel/dtlb_miss.S @@ -0,0 +1,39 @@ +/* DTLB ** ICACHE line 1: Context 0 check and TSB load */ + ldxa [%g0] ASI_DMMU_TSB_8KB_PTR, %g1 ! Get TSB 8K pointer + ldxa [%g0] ASI_DMMU, %g6 ! Get TAG TARGET + srlx %g6, 48, %g5 ! Get context + brz,pn %g5, kvmap_dtlb ! Context 0 processing + nop ! Delay slot (fill me) + TSB_LOAD_QUAD(%g1, %g4) ! Load TSB entry + nop ! Push branch to next I$ line + cmp %g4, %g6 ! Compare TAG + +/* DTLB ** ICACHE line 2: TSB compare and TLB load */ + bne,pn %xcc, tsb_miss_dtlb ! Miss + mov FAULT_CODE_DTLB, %g3 + stxa %g5, [%g0] ASI_DTLB_DATA_IN ! Load TLB + retry ! Trap done + nop + nop + nop + nop + +/* DTLB ** ICACHE line 3: */ + nop + nop + nop + nop + nop + nop + nop + nop + +/* DTLB ** ICACHE line 4: */ + nop + nop + nop + nop + nop + nop + nop + nop diff --git a/arch/sparc64/kernel/entry.S b/arch/sparc64/kernel/entry.S index 12911e7..93042df 100644 --- a/arch/sparc64/kernel/entry.S +++ b/arch/sparc64/kernel/entry.S @@ -50,7 +50,8 @@ do_fpdis: add %g0, %g0, %g0 ba,a,pt %xcc, rtrap_clr_l6 -1: ldub [%g6 + TI_FPSAVED], %g5 +1: TRAP_LOAD_THREAD_REG(%g6, %g1) + ldub [%g6 + TI_FPSAVED], %g5 wr %g0, FPRS_FEF, %fprs andcc %g5, FPRS_FEF, %g0 be,a,pt %icc, 1f @@ -189,6 +190,7 @@ fp_other_bounce: .globl do_fpother_check_fitos .align 32 do_fpother_check_fitos: + TRAP_LOAD_THREAD_REG(%g6, %g1) sethi %hi(fp_other_bounce - 4), %g7 or %g7, %lo(fp_other_bounce - 4), %g7 @@ -353,8 +355,6 @@ do_fptrap_after_fsr: * * With this method we can do most of the cross-call tlb/cache * flushing very quickly. - * - * Current CPU's IRQ worklist table is locked into %g6, don't touch. */ .text .align 32 @@ -378,6 +378,8 @@ do_ivec: sllx %g2, %g4, %g2 sllx %g4, 2, %g4 + TRAP_LOAD_IRQ_WORK(%g6, %g1) + lduw [%g6 + %g4], %g5 /* g5 = irq_work(cpu, pil) */ stw %g5, [%g3 + 0x00] /* bucket->irq_chain = g5 */ stw %g3, [%g6 + %g4] /* irq_work(cpu, pil) = bucket */ @@ -399,76 +401,6 @@ do_ivec_xcall: 1: jmpl %g3, %g0 nop - .globl save_alternate_globals -save_alternate_globals: /* %o0 = save_area */ - rdpr %pstate, %o5 - andn %o5, PSTATE_IE, %o1 - wrpr %o1, PSTATE_AG, %pstate - stx %g0, [%o0 + 0x00] - stx %g1, [%o0 + 0x08] - stx %g2, [%o0 + 0x10] - stx %g3, [%o0 + 0x18] - stx %g4, [%o0 + 0x20] - stx %g5, [%o0 + 0x28] - stx %g6, [%o0 + 0x30] - stx %g7, [%o0 + 0x38] - wrpr %o1, PSTATE_IG, %pstate - stx %g0, [%o0 + 0x40] - stx %g1, [%o0 + 0x48] - stx %g2, [%o0 + 0x50] - stx %g3, [%o0 + 0x58] - stx %g4, [%o0 + 0x60] - stx %g5, [%o0 + 0x68] - stx %g6, [%o0 + 0x70] - stx %g7, [%o0 + 0x78] - wrpr %o1, PSTATE_MG, %pstate - stx %g0, [%o0 + 0x80] - stx %g1, [%o0 + 0x88] - stx %g2, [%o0 + 0x90] - stx %g3, [%o0 + 0x98] - stx %g4, [%o0 + 0xa0] - stx %g5, [%o0 + 0xa8] - stx %g6, [%o0 + 0xb0] - stx %g7, [%o0 + 0xb8] - wrpr %o5, 0x0, %pstate - retl - nop - - .globl restore_alternate_globals -restore_alternate_globals: /* %o0 = save_area */ - rdpr %pstate, %o5 - andn %o5, PSTATE_IE, %o1 - wrpr %o1, PSTATE_AG, %pstate - ldx [%o0 + 0x00], %g0 - ldx [%o0 + 0x08], %g1 - ldx [%o0 + 0x10], %g2 - ldx [%o0 + 0x18], %g3 - ldx [%o0 + 0x20], %g4 - ldx [%o0 + 0x28], %g5 - ldx [%o0 + 0x30], %g6 - ldx [%o0 + 0x38], %g7 - wrpr %o1, PSTATE_IG, %pstate - ldx [%o0 + 0x40], %g0 - ldx [%o0 + 0x48], %g1 - ldx [%o0 + 0x50], %g2 - ldx [%o0 + 0x58], %g3 - ldx [%o0 + 0x60], %g4 - ldx [%o0 + 0x68], %g5 - ldx [%o0 + 0x70], %g6 - ldx [%o0 + 0x78], %g7 - wrpr %o1, PSTATE_MG, %pstate - ldx [%o0 + 0x80], %g0 - ldx [%o0 + 0x88], %g1 - ldx [%o0 + 0x90], %g2 - ldx [%o0 + 0x98], %g3 - ldx [%o0 + 0xa0], %g4 - ldx [%o0 + 0xa8], %g5 - ldx [%o0 + 0xb0], %g6 - ldx [%o0 + 0xb8], %g7 - wrpr %o5, 0x0, %pstate - retl - nop - .globl getcc, setcc getcc: ldx [%o0 + PT_V9_TSTATE], %o1 @@ -488,9 +420,24 @@ setcc: retl stx %o1, [%o0 + PT_V9_TSTATE] - .globl utrap, utrap_ill -utrap: brz,pn %g1, etrap + .globl utrap_trap +utrap_trap: /* %g3=handler,%g4=level */ + TRAP_LOAD_THREAD_REG(%g6, %g1) + ldx [%g6 + TI_UTRAPS], %g1 + brnz,pt %g1, invoke_utrap nop + + ba,pt %xcc, etrap + rd %pc, %g7 + mov %l4, %o1 + call bad_trap + add %sp, PTREGS_OFF, %o0 + ba,pt %xcc, rtrap + clr %l6 + +invoke_utrap: + sllx %g3, 3, %g3 + ldx [%g1 + %g3], %g1 save %sp, -128, %sp rdpr %tstate, %l6 rdpr %cwp, %l7 @@ -500,17 +447,6 @@ utrap: brz,pn %g1, etrap rdpr %tnpc, %l7 wrpr %g1, 0, %tnpc done -utrap_ill: - call bad_trap - add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - clr %l6 - - /* XXX Here is stuff we still need to write... -DaveM XXX */ - .globl netbsd_syscall -netbsd_syscall: - retl - nop /* We need to carefully read the error status, ACK * the errors, prevent recursive traps, and pass the @@ -1001,7 +937,7 @@ dcpe_icpe_tl1_common: * %g3: scratch * %g4: AFSR * %g5: AFAR - * %g6: current thread ptr + * %g6: unused, will have current thread ptr after etrap * %g7: scratch */ __cheetah_log_error: @@ -1539,13 +1475,14 @@ ret_from_syscall: 1: b,pt %xcc, ret_sys_call ldx [%sp + PTREGS_OFF + PT_V9_I0], %o0 -sparc_exit: wrpr %g0, (PSTATE_RMO | PSTATE_PEF | PSTATE_PRIV), %pstate +sparc_exit: rdpr %pstate, %g2 + wrpr %g2, PSTATE_IE, %pstate rdpr %otherwin, %g1 rdpr %cansave, %g3 add %g3, %g1, %g3 wrpr %g3, 0x0, %cansave wrpr %g0, 0x0, %otherwin - wrpr %g0, (PSTATE_RMO | PSTATE_PEF | PSTATE_PRIV | PSTATE_IE), %pstate + wrpr %g2, 0x0, %pstate ba,pt %xcc, sys_exit stb %g0, [%g6 + TI_WSAVED] @@ -1690,3 +1627,11 @@ __flushw_user: restore %g0, %g0, %g0 2: retl nop + +#ifdef CONFIG_SMP + .globl hard_smp_processor_id +hard_smp_processor_id: + __GET_CPUID(%o0) + retl + nop +#endif diff --git a/arch/sparc64/kernel/etrap.S b/arch/sparc64/kernel/etrap.S index 0d8eba2..d8c062a 100644 --- a/arch/sparc64/kernel/etrap.S +++ b/arch/sparc64/kernel/etrap.S @@ -31,6 +31,7 @@ .globl etrap, etrap_irq, etraptl1 etrap: rdpr %pil, %g2 etrap_irq: + TRAP_LOAD_THREAD_REG(%g6, %g1) rdpr %tstate, %g1 sllx %g2, 20, %g3 andcc %g1, TSTATE_PRIV, %g0 @@ -54,7 +55,31 @@ etrap_irq: rd %y, %g3 stx %g1, [%g2 + STACKFRAME_SZ + PT_V9_TNPC] st %g3, [%g2 + STACKFRAME_SZ + PT_V9_Y] - save %g2, -STACK_BIAS, %sp ! Ordering here is critical + + rdpr %cansave, %g1 + brnz,pt %g1, etrap_save + nop + + rdpr %cwp, %g1 + add %g1, 2, %g1 + wrpr %g1, %cwp + be,pt %xcc, etrap_user_spill + mov ASI_AIUP, %g3 + + rdpr %otherwin, %g3 + brz %g3, etrap_kernel_spill + mov ASI_AIUS, %g3 + +etrap_user_spill: + + wr %g3, 0x0, %asi + ldx [%g6 + TI_FLAGS], %g3 + and %g3, _TIF_32BIT, %g3 + brnz,pt %g3, etrap_user_spill_32bit + nop + ba,a,pt %xcc, etrap_user_spill_64bit + +etrap_save: save %g2, -STACK_BIAS, %sp mov %g6, %l6 bne,pn %xcc, 3f @@ -71,41 +96,49 @@ etrap_irq: sethi %hi(sparc64_kern_pri_context), %g2 ldx [%g2 + %lo(sparc64_kern_pri_context)], %g3 stxa %g3, [%l4] ASI_DMMU - flush %l6 - wr %g0, ASI_AIUS, %asi -2: wrpr %g0, 0x0, %tl - mov %g4, %l4 + sethi %hi(KERNBASE), %l4 + flush %l4 + mov ASI_AIUS, %l7 +2: mov %g4, %l4 mov %g5, %l5 + add %g7, 4, %l2 + + /* Go to trap time globals so we can save them. */ +661: wrpr %g0, ETRAP_PSTATE1, %pstate + .section .sun4v_1insn_patch, "ax" + .word 661b + SET_GL(0) + .previous - mov %g7, %l2 - wrpr %g0, ETRAP_PSTATE1, %pstate stx %g1, [%sp + PTREGS_OFF + PT_V9_G1] stx %g2, [%sp + PTREGS_OFF + PT_V9_G2] + sllx %l7, 24, %l7 stx %g3, [%sp + PTREGS_OFF + PT_V9_G3] + rdpr %cwp, %l0 stx %g4, [%sp + PTREGS_OFF + PT_V9_G4] stx %g5, [%sp + PTREGS_OFF + PT_V9_G5] stx %g6, [%sp + PTREGS_OFF + PT_V9_G6] - stx %g7, [%sp + PTREGS_OFF + PT_V9_G7] + or %l7, %l0, %l7 + sethi %hi(TSTATE_RMO | TSTATE_PEF), %l0 + or %l7, %l0, %l7 + wrpr %l2, %tnpc + wrpr %l7, (TSTATE_PRIV | TSTATE_IE), %tstate stx %i0, [%sp + PTREGS_OFF + PT_V9_I0] stx %i1, [%sp + PTREGS_OFF + PT_V9_I1] stx %i2, [%sp + PTREGS_OFF + PT_V9_I2] stx %i3, [%sp + PTREGS_OFF + PT_V9_I3] stx %i4, [%sp + PTREGS_OFF + PT_V9_I4] stx %i5, [%sp + PTREGS_OFF + PT_V9_I5] - stx %i6, [%sp + PTREGS_OFF + PT_V9_I6] - stx %i7, [%sp + PTREGS_OFF + PT_V9_I7] - wrpr %g0, ETRAP_PSTATE2, %pstate mov %l6, %g6 -#ifdef CONFIG_SMP - mov TSB_REG, %g3 - ldxa [%g3] ASI_IMMU, %g5 -#endif - jmpl %l2 + 0x4, %g0 - ldx [%g6 + TI_TASK], %g4 + stx %i7, [%sp + PTREGS_OFF + PT_V9_I7] + LOAD_PER_CPU_BASE(%g5, %g6, %g4, %g3, %l1) + ldx [%g6 + TI_TASK], %g4 + done -3: ldub [%l6 + TI_FPDEPTH], %l5 +3: mov ASI_P, %l7 + ldub [%l6 + TI_FPDEPTH], %l5 add %l6, TI_FPSAVED + 1, %l4 srl %l5, 1, %l3 add %l5, 2, %l5 @@ -125,6 +158,7 @@ etraptl1: /* Save tstate/tpc/tnpc of TL * 0x58 TL4's TT * 0x60 TL */ + TRAP_LOAD_THREAD_REG(%g6, %g1) sub %sp, ((4 * 8) * 4) + 8, %g2 rdpr %tl, %g1 @@ -168,91 +202,19 @@ etraptl1: /* Save tstate/tpc/tnpc of TL rdpr %tt, %g3 stx %g3, [%g2 + STACK_BIAS + 0x78] - wrpr %g1, %tl stx %g1, [%g2 + STACK_BIAS + 0x80] + wrpr %g0, 1, %tl +661: nop + .section .sun4v_1insn_patch, "ax" + .word 661b + SET_GL(1) + .previous + rdpr %tstate, %g1 sub %g2, STACKFRAME_SZ + TRACEREG_SZ - STACK_BIAS, %g2 ba,pt %xcc, 1b andcc %g1, TSTATE_PRIV, %g0 - .align 64 - .globl scetrap -scetrap: rdpr %pil, %g2 - rdpr %tstate, %g1 - sllx %g2, 20, %g3 - andcc %g1, TSTATE_PRIV, %g0 - or %g1, %g3, %g1 - bne,pn %xcc, 1f - sub %sp, (STACKFRAME_SZ+TRACEREG_SZ-STACK_BIAS), %g2 - wrpr %g0, 7, %cleanwin - - sllx %g1, 51, %g3 - sethi %hi(TASK_REGOFF), %g2 - or %g2, %lo(TASK_REGOFF), %g2 - brlz,pn %g3, 1f - add %g6, %g2, %g2 - wr %g0, 0, %fprs -1: rdpr %tpc, %g3 - stx %g1, [%g2 + STACKFRAME_SZ + PT_V9_TSTATE] - - rdpr %tnpc, %g1 - stx %g3, [%g2 + STACKFRAME_SZ + PT_V9_TPC] - stx %g1, [%g2 + STACKFRAME_SZ + PT_V9_TNPC] - save %g2, -STACK_BIAS, %sp ! Ordering here is critical - mov %g6, %l6 - bne,pn %xcc, 2f - mov ASI_P, %l7 - rdpr %canrestore, %g3 - - rdpr %wstate, %g2 - wrpr %g0, 0, %canrestore - sll %g2, 3, %g2 - mov PRIMARY_CONTEXT, %l4 - wrpr %g3, 0, %otherwin - wrpr %g2, 0, %wstate - sethi %hi(sparc64_kern_pri_context), %g2 - ldx [%g2 + %lo(sparc64_kern_pri_context)], %g3 - stxa %g3, [%l4] ASI_DMMU - flush %l6 - - mov ASI_AIUS, %l7 -2: mov %g4, %l4 - mov %g5, %l5 - add %g7, 0x4, %l2 - wrpr %g0, ETRAP_PSTATE1, %pstate - stx %g1, [%sp + PTREGS_OFF + PT_V9_G1] - stx %g2, [%sp + PTREGS_OFF + PT_V9_G2] - sllx %l7, 24, %l7 - - stx %g3, [%sp + PTREGS_OFF + PT_V9_G3] - rdpr %cwp, %l0 - stx %g4, [%sp + PTREGS_OFF + PT_V9_G4] - stx %g5, [%sp + PTREGS_OFF + PT_V9_G5] - stx %g6, [%sp + PTREGS_OFF + PT_V9_G6] - stx %g7, [%sp + PTREGS_OFF + PT_V9_G7] - or %l7, %l0, %l7 - sethi %hi(TSTATE_RMO | TSTATE_PEF), %l0 - - or %l7, %l0, %l7 - wrpr %l2, %tnpc - wrpr %l7, (TSTATE_PRIV | TSTATE_IE), %tstate - stx %i0, [%sp + PTREGS_OFF + PT_V9_I0] - stx %i1, [%sp + PTREGS_OFF + PT_V9_I1] - stx %i2, [%sp + PTREGS_OFF + PT_V9_I2] - stx %i3, [%sp + PTREGS_OFF + PT_V9_I3] - stx %i4, [%sp + PTREGS_OFF + PT_V9_I4] - - stx %i5, [%sp + PTREGS_OFF + PT_V9_I5] - stx %i6, [%sp + PTREGS_OFF + PT_V9_I6] - mov %l6, %g6 - stx %i7, [%sp + PTREGS_OFF + PT_V9_I7] -#ifdef CONFIG_SMP - mov TSB_REG, %g3 - ldxa [%g3] ASI_IMMU, %g5 -#endif - ldx [%g6 + TI_TASK], %g4 - done - #undef TASK_REGOFF #undef ETRAP_PSTATE1 diff --git a/arch/sparc64/kernel/head.S b/arch/sparc64/kernel/head.S index b49dcd4..f04f739 100644 --- a/arch/sparc64/kernel/head.S +++ b/arch/sparc64/kernel/head.S @@ -26,6 +26,7 @@ #include #include #include +#include /* This section from from _start to sparc64_boot_end should fit into * 0x0000000000404000 to 0x0000000000408000. @@ -315,6 +316,24 @@ sun4u_init: ba,pt %xcc, spitfire_tlb_fixup nop + /* XXX Nothing branches to here yet, when %ver register indicates + * XXX Niagara we should do this. + */ +niagara_tlb_fixup: + mov 3, %g2 /* Set TLB type to hypervisor. */ + sethi %hi(tlb_type), %g1 + stw %g2, [%g1 + %lo(tlb_type)] + + /* Patch copy/clear ops. */ + call niagara_patch_copyops + nop + call niagara_patch_pageops + nop + + /* Patch TLB/cache ops. */ + call hypervisor_patch_cachetlbops + nop + cheetah_tlb_fixup: mov 2, %g2 /* Set TLB type to cheetah+. */ BRANCH_IF_CHEETAH_PLUS_OR_FOLLOWON(g1,g7,1f) @@ -421,69 +440,6 @@ setup_trap_table: stxa %g2, [%g1] ASI_DMMU membar #Sync - /* The Linux trap handlers expect various trap global registers - * to be setup with some fixed values. So here we set these - * up very carefully. These globals are: - * - * Alternate Globals (PSTATE_AG): - * - * %g6 --> current_thread_info() - * - * MMU Globals (PSTATE_MG): - * - * %g1 --> TLB_SFSR - * %g2 --> ((_PAGE_VALID | _PAGE_SZ4MB | - * _PAGE_CP | _PAGE_CV | _PAGE_P | _PAGE_W) - * ^ 0xfffff80000000000) - * (this %g2 value is used for computing the PAGE_OFFSET kernel - * TLB entries quickly, the virtual address of the fault XOR'd - * with this %g2 value is the PTE to load into the TLB) - * %g3 --> VPTE_BASE_CHEETAH or VPTE_BASE_SPITFIRE - * - * Interrupt Globals (PSTATE_IG, setup by init_irqwork_curcpu()): - * - * %g6 --> __irq_work[smp_processor_id()] - */ - - rdpr %pstate, %o1 - mov %g6, %o2 - wrpr %o1, PSTATE_AG, %pstate - mov %o2, %g6 - -#define KERN_HIGHBITS ((_PAGE_VALID|_PAGE_SZ4MB)^0xfffff80000000000) -#define KERN_LOWBITS (_PAGE_CP | _PAGE_CV | _PAGE_P | _PAGE_W) - wrpr %o1, PSTATE_MG, %pstate - mov TSB_REG, %g1 - stxa %g0, [%g1] ASI_DMMU - membar #Sync - stxa %g0, [%g1] ASI_IMMU - membar #Sync - mov TLB_SFSR, %g1 - sethi %uhi(KERN_HIGHBITS), %g2 - or %g2, %ulo(KERN_HIGHBITS), %g2 - sllx %g2, 32, %g2 - or %g2, KERN_LOWBITS, %g2 - - BRANCH_IF_ANY_CHEETAH(g3,g7,8f) - ba,pt %xcc, 9f - nop - -8: - sethi %uhi(VPTE_BASE_CHEETAH), %g3 - or %g3, %ulo(VPTE_BASE_CHEETAH), %g3 - ba,pt %xcc, 2f - sllx %g3, 32, %g3 - -9: - sethi %uhi(VPTE_BASE_SPITFIRE), %g3 - or %g3, %ulo(VPTE_BASE_SPITFIRE), %g3 - sllx %g3, 32, %g3 - -2: - clr %g7 -#undef KERN_HIGHBITS -#undef KERN_LOWBITS - /* Kill PROM timer */ sethi %hi(0x80000000), %o2 sllx %o2, 32, %o2 @@ -502,7 +458,6 @@ setup_trap_table: 2: wrpr %g0, %g0, %wstate - wrpr %o1, 0x0, %pstate call init_irqwork_curcpu nop @@ -517,7 +472,7 @@ setup_trap_table: restore .globl setup_tba -setup_tba: /* i0 = is_starfire */ +setup_tba: save %sp, -192, %sp /* The boot processor is the only cpu which invokes this @@ -537,7 +492,9 @@ setup_tba: /* i0 = is_starfire */ sparc64_boot_end: #include "systbls.S" +#include "sun4v_tlb_miss.S" #include "ktlb.S" +#include "tsb.S" #include "etrap.S" #include "rtrap.S" #include "winfixup.S" @@ -546,16 +503,16 @@ sparc64_boot_end: /* * The following skip makes sure the trap table in ttable.S is aligned * on a 32K boundary as required by the v9 specs for TBA register. + * + * We align to a 32K boundary, then we have the 32K kernel TSB, + * then the 32K aligned trap table. */ 1: .skip 0x4000 + _start - 1b -#ifdef CONFIG_SBUS -/* This is just a hack to fool make depend config.h discovering - strategy: As the .S files below need config.h, but - make depend does not find it for them, we include config.h - in head.S */ -#endif + .globl swapper_tsb +swapper_tsb: + .skip (32 * 1024) ! 0x0000000000408000 diff --git a/arch/sparc64/kernel/irq.c b/arch/sparc64/kernel/irq.c index 233526b..56ce33d 100644 --- a/arch/sparc64/kernel/irq.c +++ b/arch/sparc64/kernel/irq.c @@ -39,6 +39,7 @@ #include #include #include +#include #ifdef CONFIG_SMP static void distribute_irqs(void); @@ -153,7 +154,8 @@ void enable_irq(unsigned int irq) unsigned long ver; __asm__ ("rdpr %%ver, %0" : "=r" (ver)); - if ((ver >> 32) == 0x003e0016) { + if ((ver >> 32) == __JALAPENO_ID || + (ver >> 32) == __SERRANO_ID) { /* We set it to our JBUS ID. */ __asm__ __volatile__("ldxa [%%g0] %1, %0" : "=r" (tid) @@ -848,33 +850,9 @@ static void kill_prom_timer(void) void init_irqwork_curcpu(void) { - register struct irq_work_struct *workp asm("o2"); - register unsigned long tmp asm("o3"); int cpu = hard_smp_processor_id(); - memset(__irq_work + cpu, 0, sizeof(*workp)); - - /* Make sure we are called with PSTATE_IE disabled. */ - __asm__ __volatile__("rdpr %%pstate, %0\n\t" - : "=r" (tmp)); - if (tmp & PSTATE_IE) { - prom_printf("BUG: init_irqwork_curcpu() called with " - "PSTATE_IE enabled, bailing.\n"); - __asm__ __volatile__("mov %%i7, %0\n\t" - : "=r" (tmp)); - prom_printf("BUG: Called from %lx\n", tmp); - prom_halt(); - } - - /* Set interrupt globals. */ - workp = &__irq_work[cpu]; - __asm__ __volatile__( - "rdpr %%pstate, %0\n\t" - "wrpr %0, %1, %%pstate\n\t" - "mov %2, %%g6\n\t" - "wrpr %0, 0x0, %%pstate\n\t" - : "=&r" (tmp) - : "i" (PSTATE_IG), "r" (workp)); + memset(__irq_work + cpu, 0, sizeof(struct irq_work_struct)); } /* Only invoked on boot processor. */ diff --git a/arch/sparc64/kernel/itlb_base.S b/arch/sparc64/kernel/itlb_base.S deleted file mode 100644 index 4951ff8..0000000 --- a/arch/sparc64/kernel/itlb_base.S +++ /dev/null @@ -1,79 +0,0 @@ -/* $Id: itlb_base.S,v 1.12 2002/02/09 19:49:30 davem Exp $ - * itlb_base.S: Front end to ITLB miss replacement strategy. - * This is included directly into the trap table. - * - * Copyright (C) 1996,1998 David S. Miller (davem@redhat.com) - * Copyright (C) 1997,1998 Jakub Jelinek (jj@ultra.linux.cz) - */ - -#if PAGE_SHIFT == 13 -/* - * To compute vpte offset, we need to do ((addr >> 13) << 3), - * which can be optimized to (addr >> 10) if bits 10/11/12 can - * be guaranteed to be 0 ... mmu_context.h does guarantee this - * by only using 10 bits in the hwcontext value. - */ -#define CREATE_VPTE_OFFSET1(r1, r2) \ - srax r1, 10, r2 -#define CREATE_VPTE_OFFSET2(r1, r2) nop -#else /* PAGE_SHIFT */ -#define CREATE_VPTE_OFFSET1(r1, r2) \ - srax r1, PAGE_SHIFT, r2 -#define CREATE_VPTE_OFFSET2(r1, r2) \ - sllx r2, 3, r2 -#endif /* PAGE_SHIFT */ - - -/* Ways we can get here: - * - * 1) Nucleus instruction misses from module code. - * 2) All user instruction misses. - * - * All real page faults merge their code paths to the - * sparc64_realfault_common label below. - */ - -/* ITLB ** ICACHE line 1: Quick user TLB misses */ - mov TLB_SFSR, %g1 - ldxa [%g1 + %g1] ASI_IMMU, %g4 ! Get TAG_ACCESS - CREATE_VPTE_OFFSET1(%g4, %g6) ! Create VPTE offset - CREATE_VPTE_OFFSET2(%g4, %g6) ! Create VPTE offset - ldxa [%g3 + %g6] ASI_P, %g5 ! Load VPTE -1: brgez,pn %g5, 3f ! Not valid, branch out - sethi %hi(_PAGE_EXEC), %g4 ! Delay-slot - andcc %g5, %g4, %g0 ! Executable? - -/* ITLB ** ICACHE line 2: Real faults */ - be,pn %xcc, 3f ! Nope, branch. - nop ! Delay-slot -2: stxa %g5, [%g0] ASI_ITLB_DATA_IN ! Load PTE into TLB - retry ! Trap return -3: rdpr %pstate, %g4 ! Move into alt-globals - wrpr %g4, PSTATE_AG|PSTATE_MG, %pstate - rdpr %tpc, %g5 ! And load faulting VA - mov FAULT_CODE_ITLB, %g4 ! It was read from ITLB - -/* ITLB ** ICACHE line 3: Finish faults */ -sparc64_realfault_common: ! Called by dtlb_miss - stb %g4, [%g6 + TI_FAULT_CODE] - stx %g5, [%g6 + TI_FAULT_ADDR] - ba,pt %xcc, etrap ! Save state -1: rd %pc, %g7 ! ... - call do_sparc64_fault ! Call fault handler - add %sp, PTREGS_OFF, %o0! Compute pt_regs arg - ba,pt %xcc, rtrap_clr_l6 ! Restore cpu state - nop - -/* ITLB ** ICACHE line 4: Window fixups */ -winfix_trampoline: - rdpr %tpc, %g3 ! Prepare winfixup TNPC - or %g3, 0x7c, %g3 ! Compute branch offset - wrpr %g3, %tnpc ! Write it into TNPC - done ! Do it to it - nop - nop - nop - nop - -#undef CREATE_VPTE_OFFSET1 -#undef CREATE_VPTE_OFFSET2 diff --git a/arch/sparc64/kernel/itlb_miss.S b/arch/sparc64/kernel/itlb_miss.S new file mode 100644 index 0000000..97facce --- /dev/null +++ b/arch/sparc64/kernel/itlb_miss.S @@ -0,0 +1,39 @@ +/* ITLB ** ICACHE line 1: Context 0 check and TSB load */ + ldxa [%g0] ASI_IMMU_TSB_8KB_PTR, %g1 ! Get TSB 8K pointer + ldxa [%g0] ASI_IMMU, %g6 ! Get TAG TARGET + srlx %g6, 48, %g5 ! Get context + brz,pn %g5, kvmap_itlb ! Context 0 processing + nop ! Delay slot (fill me) + TSB_LOAD_QUAD(%g1, %g4) ! Load TSB entry + cmp %g4, %g6 ! Compare TAG + sethi %hi(_PAGE_EXEC), %g4 ! Setup exec check + +/* ITLB ** ICACHE line 2: TSB compare and TLB load */ + bne,pn %xcc, tsb_miss_itlb ! Miss + mov FAULT_CODE_ITLB, %g3 + andcc %g5, %g4, %g0 ! Executable? + be,pn %xcc, tsb_do_fault + nop ! Delay slot, fill me + stxa %g5, [%g0] ASI_ITLB_DATA_IN ! Load TLB + retry ! Trap done + nop + +/* ITLB ** ICACHE line 3: */ + nop + nop + nop + nop + nop + nop + nop + nop + +/* ITLB ** ICACHE line 4: */ + nop + nop + nop + nop + nop + nop + nop + nop diff --git a/arch/sparc64/kernel/ktlb.S b/arch/sparc64/kernel/ktlb.S index d9244d3..f6bb2e0 100644 --- a/arch/sparc64/kernel/ktlb.S +++ b/arch/sparc64/kernel/ktlb.S @@ -4,191 +4,192 @@ * Copyright (C) 1996 Eddie C. Dost (ecd@brainaid.de) * Copyright (C) 1996 Miguel de Icaza (miguel@nuclecu.unam.mx) * Copyright (C) 1996,98,99 Jakub Jelinek (jj@sunsite.mff.cuni.cz) -*/ + */ #include #include #include #include #include +#include .text .align 32 -/* - * On a second level vpte miss, check whether the original fault is to the OBP - * range (note that this is only possible for instruction miss, data misses to - * obp range do not use vpte). If so, go back directly to the faulting address. - * This is because we want to read the tpc, otherwise we have no way of knowing - * the 8k aligned faulting address if we are using >8k kernel pagesize. This - * also ensures no vpte range addresses are dropped into tlb while obp is - * executing (see inherit_locked_prom_mappings() rant). - */ -sparc64_vpte_nucleus: - /* Note that kvmap below has verified that the address is - * in the range MODULES_VADDR --> VMALLOC_END already. So - * here we need only check if it is an OBP address or not. +kvmap_itlb: + /* g6: TAG TARGET */ + mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_IMMU, %g4 + + /* sun4v_itlb_miss branches here with the missing virtual + * address already loaded into %g4 */ +kvmap_itlb_4v: + +kvmap_itlb_nonlinear: + /* Catch kernel NULL pointer calls. */ + sethi %hi(PAGE_SIZE), %g5 + cmp %g4, %g5 + bleu,pn %xcc, kvmap_dtlb_longpath + nop + + KERN_TSB_LOOKUP_TL1(%g4, %g6, %g5, %g1, %g2, %g3, kvmap_itlb_load) + +kvmap_itlb_tsb_miss: sethi %hi(LOW_OBP_ADDRESS), %g5 cmp %g4, %g5 - blu,pn %xcc, kern_vpte + blu,pn %xcc, kvmap_itlb_vmalloc_addr mov 0x1, %g5 sllx %g5, 32, %g5 cmp %g4, %g5 - blu,pn %xcc, vpte_insn_obp + blu,pn %xcc, kvmap_itlb_obp nop - /* These two instructions are patched by paginig_init(). */ -kern_vpte: - sethi %hi(swapper_pgd_zero), %g5 - lduw [%g5 + %lo(swapper_pgd_zero)], %g5 - - /* With kernel PGD in %g5, branch back into dtlb_backend. */ - ba,pt %xcc, sparc64_kpte_continue - andn %g1, 0x3, %g1 /* Finish PMD offset adjustment. */ - -vpte_noent: - /* Restore previous TAG_ACCESS, %g5 is zero, and we will - * skip over the trap instruction so that the top level - * TLB miss handler will thing this %g5 value is just an - * invalid PTE, thus branching to full fault processing. - */ - mov TLB_SFSR, %g1 - stxa %g4, [%g1 + %g1] ASI_DMMU - done - -vpte_insn_obp: - /* Behave as if we are at TL0. */ - wrpr %g0, 1, %tl - rdpr %tpc, %g4 /* Find original faulting iaddr */ - srlx %g4, 13, %g4 /* Throw out context bits */ - sllx %g4, 13, %g4 /* g4 has vpn + ctx0 now */ - - /* Restore previous TAG_ACCESS. */ - mov TLB_SFSR, %g1 - stxa %g4, [%g1 + %g1] ASI_IMMU - - sethi %hi(prom_trans), %g5 - or %g5, %lo(prom_trans), %g5 - -1: ldx [%g5 + 0x00], %g6 ! base - brz,a,pn %g6, longpath ! no more entries, fail - mov TLB_SFSR, %g1 ! and restore %g1 - ldx [%g5 + 0x08], %g1 ! len - add %g6, %g1, %g1 ! end - cmp %g6, %g4 - bgu,pt %xcc, 2f - cmp %g4, %g1 - bgeu,pt %xcc, 2f - ldx [%g5 + 0x10], %g1 ! PTE - - /* TLB load, restore %g1, and return from trap. */ - sub %g4, %g6, %g6 - add %g1, %g6, %g5 - mov TLB_SFSR, %g1 - stxa %g5, [%g0] ASI_ITLB_DATA_IN - retry +kvmap_itlb_vmalloc_addr: + KERN_PGTABLE_WALK(%g4, %g5, %g2, kvmap_itlb_longpath) + + KTSB_LOCK_TAG(%g1, %g2, %g4) + + /* Load and check PTE. */ + ldxa [%g5] ASI_PHYS_USE_EC, %g5 + brgez,a,pn %g5, kvmap_itlb_longpath + KTSB_STORE(%g1, %g0) -2: ba,pt %xcc, 1b - add %g5, (3 * 8), %g5 ! next entry + KTSB_WRITE(%g1, %g5, %g6) -kvmap_do_obp: - sethi %hi(prom_trans), %g5 - or %g5, %lo(prom_trans), %g5 - srlx %g4, 13, %g4 - sllx %g4, 13, %g4 - -1: ldx [%g5 + 0x00], %g6 ! base - brz,a,pn %g6, longpath ! no more entries, fail - mov TLB_SFSR, %g1 ! and restore %g1 - ldx [%g5 + 0x08], %g1 ! len - add %g6, %g1, %g1 ! end - cmp %g6, %g4 - bgu,pt %xcc, 2f - cmp %g4, %g1 - bgeu,pt %xcc, 2f - ldx [%g5 + 0x10], %g1 ! PTE - - /* TLB load, restore %g1, and return from trap. */ - sub %g4, %g6, %g6 - add %g1, %g6, %g5 - mov TLB_SFSR, %g1 - stxa %g5, [%g0] ASI_DTLB_DATA_IN + /* fallthrough to TLB load */ + +kvmap_itlb_load: + stxa %g5, [%g0] ASI_ITLB_DATA_IN ! Reload TLB retry -2: ba,pt %xcc, 1b - add %g5, (3 * 8), %g5 ! next entry +kvmap_itlb_longpath: + +661: rdpr %pstate, %g5 + wrpr %g5, PSTATE_AG | PSTATE_MG, %pstate + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + rdpr %tpc, %g5 + ba,pt %xcc, sparc64_realfault_common + mov FAULT_CODE_ITLB, %g4 + +kvmap_itlb_obp: + OBP_TRANS_LOOKUP(%g4, %g5, %g2, %g3, kvmap_itlb_longpath) + + KTSB_LOCK_TAG(%g1, %g2, %g4) + + KTSB_WRITE(%g1, %g5, %g6) + + ba,pt %xcc, kvmap_itlb_load + nop + +kvmap_dtlb_obp: + OBP_TRANS_LOOKUP(%g4, %g5, %g2, %g3, kvmap_dtlb_longpath) + + KTSB_LOCK_TAG(%g1, %g2, %g4) + + KTSB_WRITE(%g1, %g5, %g6) + + ba,pt %xcc, kvmap_dtlb_load + nop -/* - * On a first level data miss, check whether this is to the OBP range (note - * that such accesses can be made by prom, as well as by kernel using - * prom_getproperty on "address"), and if so, do not use vpte access ... - * rather, use information saved during inherit_prom_mappings() using 8k - * pagesize. - */ .align 32 -kvmap: - brgez,pn %g4, kvmap_nonlinear +kvmap_dtlb: + /* %g6: TAG TARGET */ + mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_DMMU, %g4 + + /* sun4v_dtlb_miss branches here with the missing virtual + * address already loaded into %g4 + */ +kvmap_dtlb_4v: + brgez,pn %g4, kvmap_dtlb_nonlinear nop -#ifdef CONFIG_DEBUG_PAGEALLOC +#define KERN_HIGHBITS ((_PAGE_VALID|_PAGE_SZ4MB)^0xfffff80000000000) +#define KERN_LOWBITS (_PAGE_CP | _PAGE_CV | _PAGE_P | _PAGE_W) + + sethi %uhi(KERN_HIGHBITS), %g2 + or %g2, %ulo(KERN_HIGHBITS), %g2 + sllx %g2, 32, %g2 + or %g2, KERN_LOWBITS, %g2 + +#undef KERN_HIGHBITS +#undef KERN_LOWBITS + .globl kvmap_linear_patch kvmap_linear_patch: -#endif - ba,pt %xcc, kvmap_load + ba,pt %xcc, kvmap_dtlb_load xor %g2, %g4, %g5 -#ifdef CONFIG_DEBUG_PAGEALLOC - sethi %hi(swapper_pg_dir), %g5 - or %g5, %lo(swapper_pg_dir), %g5 - sllx %g4, 64 - (PGDIR_SHIFT + PGDIR_BITS), %g6 - srlx %g6, 64 - PAGE_SHIFT, %g6 - andn %g6, 0x3, %g6 - lduw [%g5 + %g6], %g5 - brz,pn %g5, longpath - sllx %g4, 64 - (PMD_SHIFT + PMD_BITS), %g6 - srlx %g6, 64 - PAGE_SHIFT, %g6 - sllx %g5, 11, %g5 - andn %g6, 0x3, %g6 - lduwa [%g5 + %g6] ASI_PHYS_USE_EC, %g5 - brz,pn %g5, longpath - sllx %g4, 64 - PMD_SHIFT, %g6 - srlx %g6, 64 - PAGE_SHIFT, %g6 - sllx %g5, 11, %g5 - andn %g6, 0x7, %g6 - ldxa [%g5 + %g6] ASI_PHYS_USE_EC, %g5 - brz,pn %g5, longpath +kvmap_dtlb_vmalloc_addr: + KERN_PGTABLE_WALK(%g4, %g5, %g2, kvmap_dtlb_longpath) + + KTSB_LOCK_TAG(%g1, %g2, %g4) + + /* Load and check PTE. */ + ldxa [%g5] ASI_PHYS_USE_EC, %g5 + brgez,a,pn %g5, kvmap_dtlb_longpath + KTSB_STORE(%g1, %g0) + + KTSB_WRITE(%g1, %g5, %g6) + + /* fallthrough to TLB load */ + +kvmap_dtlb_load: + stxa %g5, [%g0] ASI_DTLB_DATA_IN ! Reload TLB + retry + +kvmap_dtlb_nonlinear: + /* Catch kernel NULL pointer derefs. */ + sethi %hi(PAGE_SIZE), %g5 + cmp %g4, %g5 + bleu,pn %xcc, kvmap_dtlb_longpath nop - ba,a,pt %xcc, kvmap_load -#endif -kvmap_nonlinear: + KERN_TSB_LOOKUP_TL1(%g4, %g6, %g5, %g1, %g2, %g3, kvmap_dtlb_load) + +kvmap_dtlb_tsbmiss: sethi %hi(MODULES_VADDR), %g5 cmp %g4, %g5 - blu,pn %xcc, longpath + blu,pn %xcc, kvmap_dtlb_longpath mov (VMALLOC_END >> 24), %g5 sllx %g5, 24, %g5 cmp %g4, %g5 - bgeu,pn %xcc, longpath + bgeu,pn %xcc, kvmap_dtlb_longpath nop kvmap_check_obp: sethi %hi(LOW_OBP_ADDRESS), %g5 cmp %g4, %g5 - blu,pn %xcc, kvmap_vmalloc_addr + blu,pn %xcc, kvmap_dtlb_vmalloc_addr mov 0x1, %g5 sllx %g5, 32, %g5 cmp %g4, %g5 - blu,pn %xcc, kvmap_do_obp + blu,pn %xcc, kvmap_dtlb_obp nop - -kvmap_vmalloc_addr: - /* If we get here, a vmalloc addr was accessed, load kernel VPTE. */ - ldxa [%g3 + %g6] ASI_N, %g5 - brgez,pn %g5, longpath + ba,pt %xcc, kvmap_dtlb_vmalloc_addr nop -kvmap_load: - /* PTE is valid, load into TLB and return from trap. */ - stxa %g5, [%g0] ASI_DTLB_DATA_IN ! Reload TLB - retry +kvmap_dtlb_longpath: + +661: rdpr %pstate, %g5 + wrpr %g5, PSTATE_AG | PSTATE_MG, %pstate + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + rdpr %tl, %g4 + cmp %g4, 1 + mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_DMMU, %g5 + be,pt %xcc, sparc64_realfault_common + mov FAULT_CODE_DTLB, %g4 + ba,pt %xcc, winfix_trampoline + nop diff --git a/arch/sparc64/kernel/process.c b/arch/sparc64/kernel/process.c index 059b0d0..803eea4 100644 --- a/arch/sparc64/kernel/process.c +++ b/arch/sparc64/kernel/process.c @@ -44,6 +44,7 @@ #include #include #include +#include #include /* #define VERBOSE_SHOWREGS */ @@ -433,30 +434,15 @@ void exit_thread(void) void flush_thread(void) { struct thread_info *t = current_thread_info(); + struct mm_struct *mm; if (t->flags & _TIF_ABI_PENDING) t->flags ^= (_TIF_ABI_PENDING | _TIF_32BIT); - if (t->task->mm) { - unsigned long pgd_cache = 0UL; - if (test_thread_flag(TIF_32BIT)) { - struct mm_struct *mm = t->task->mm; - pgd_t *pgd0 = &mm->pgd[0]; - pud_t *pud0 = pud_offset(pgd0, 0); - - if (pud_none(*pud0)) { - pmd_t *page = pmd_alloc_one(mm, 0); - pud_set(pud0, page); - } - pgd_cache = get_pgd_cache(pgd0); - } - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (pgd_cache), - "r" (TSB_REG), - "i" (ASI_DMMU)); - } + mm = t->task->mm; + if (mm) + tsb_context_switch(mm); + set_thread_wsaved(0); /* Turn off performance counters if on. */ @@ -555,6 +541,18 @@ void synchronize_user_stack(void) } } +static void stack_unaligned(unsigned long sp) +{ + siginfo_t info; + + info.si_signo = SIGBUS; + info.si_errno = 0; + info.si_code = BUS_ADRALN; + info.si_addr = (void __user *) sp; + info.si_trapno = 0; + force_sig_info(SIGBUS, &info, current); +} + void fault_in_user_windows(void) { struct thread_info *t = current_thread_info(); @@ -570,13 +568,17 @@ void fault_in_user_windows(void) flush_user_windows(); window = get_thread_wsaved(); - if (window != 0) { + if (likely(window != 0)) { window -= 1; do { unsigned long sp = (t->rwbuf_stkptrs[window] + bias); struct reg_window *rwin = &t->reg_window[window]; - if (copy_to_user((char __user *)sp, rwin, winsize)) + if (unlikely(sp & 0x7UL)) + stack_unaligned(sp); + + if (unlikely(copy_to_user((char __user *)sp, + rwin, winsize))) goto barf; } while (window--); } diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index b80eba0..a55d517 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -223,12 +223,24 @@ rt_continue: ldx [%sp + PTREGS_OFF + P ldx [%sp + PTREGS_OFF + PT_V9_G3], %g3 ldx [%sp + PTREGS_OFF + PT_V9_G4], %g4 ldx [%sp + PTREGS_OFF + PT_V9_G5], %g5 - mov TSB_REG, %g6 - brnz,a,pn %l3, 1f - ldxa [%g6] ASI_IMMU, %g5 -1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 + brz,pt %l3, 1f + mov %g6, %l2 + + /* Must do this before thread reg is clobbered below. */ + LOAD_PER_CPU_BASE(%g5, %g6, %i0, %i1, %i2) +1: + ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 - wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate + + /* Normal globals are restored, go to trap globals. */ +661: wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate + .section .sun4v_1insn_patch, "ax" + .word 661b + SET_GL(1) + .previous + + mov %l2, %g6 + ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 ldx [%sp + PTREGS_OFF + PT_V9_I1], %i1 @@ -257,22 +269,84 @@ rt_continue: ldx [%sp + PTREGS_OFF + P ldx [%l1 + %lo(sparc64_kern_pri_nuc_bits)], %l1 or %l0, %l1, %l0 stxa %l0, [%l7] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %l7 + flush %l7 rdpr %wstate, %l1 rdpr %otherwin, %l2 srl %l1, 3, %l1 wrpr %l2, %g0, %canrestore wrpr %l1, %g0, %wstate - wrpr %g0, %g0, %otherwin + brnz,pt %l2, user_rtt_restore + wrpr %g0, %g0, %otherwin + + ldx [%g6 + TI_FLAGS], %g3 + wr %g0, ASI_AIUP, %asi + rdpr %cwp, %g1 + andcc %g3, _TIF_32BIT, %g0 + sub %g1, 1, %g1 + bne,pt %xcc, user_rtt_fill_32bit + wrpr %g1, %cwp + ba,a,pt %xcc, user_rtt_fill_64bit + +user_rtt_fill_fixup: + rdpr %cwp, %g1 + add %g1, 1, %g1 + wrpr %g1, 0x0, %cwp + + rdpr %wstate, %g2 + sll %g2, 3, %g2 + wrpr %g2, 0x0, %wstate + + /* We know %canrestore and %otherwin are both zero. */ + + sethi %hi(sparc64_kern_pri_context), %g2 + ldx [%g2 + %lo(sparc64_kern_pri_context)], %g2 + mov PRIMARY_CONTEXT, %g1 + stxa %g2, [%g1] ASI_DMMU + sethi %hi(KERNBASE), %g1 + flush %g1 + + or %g4, FAULT_CODE_WINFIXUP, %g4 + stb %g4, [%g6 + TI_FAULT_CODE] + stx %g5, [%g6 + TI_FAULT_ADDR] + + mov %g6, %l1 + wrpr %g0, 0x0, %tl + wrpr %g0, RTRAP_PSTATE, %pstate + +661: nop + .section .sun4v_1insn_patch, "ax" + .word 661b + SET_GL(0) + .previous + + mov %l1, %g6 + ldx [%g6 + TI_TASK], %g4 + LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) + call do_sparc64_fault + add %sp, PTREGS_OFF, %o0 + ba,pt %xcc, rtrap + nop + +user_rtt_pre_restore: + add %g1, 1, %g1 + wrpr %g1, 0x0, %cwp + +user_rtt_restore: restore rdpr %canrestore, %g1 wrpr %g1, 0x0, %cleanwin retry nop -kern_rtt: restore +kern_rtt: rdpr %canrestore, %g1 + brz,pn %g1, kern_rtt_fill + nop +kern_rtt_restore: + restore retry + to_kernel: #ifdef CONFIG_PREEMPT ldsw [%g6 + TI_PRE_COUNT], %l5 diff --git a/arch/sparc64/kernel/setup.c b/arch/sparc64/kernel/setup.c index 054461e..d91f1dc 100644 --- a/arch/sparc64/kernel/setup.c +++ b/arch/sparc64/kernel/setup.c @@ -490,6 +490,100 @@ void register_prom_callbacks(void) "' linux-.soft2 to .soft2"); } +static void __init per_cpu_patch(void) +{ +#ifdef CONFIG_SMP + struct cpuid_patch_entry *p; + unsigned long ver; + int is_jbus; + + if (tlb_type == spitfire && !this_is_starfire) + return; + + __asm__ ("rdpr %%ver, %0" : "=r" (ver)); + is_jbus = ((ver >> 32) == __JALAPENO_ID || + (ver >> 32) == __SERRANO_ID); + + p = &__cpuid_patch; + while (p < &__cpuid_patch_end) { + unsigned long addr = p->addr; + unsigned int *insns; + + switch (tlb_type) { + case spitfire: + insns = &p->starfire[0]; + break; + case cheetah: + case cheetah_plus: + if (is_jbus) + insns = &p->cheetah_jbus[0]; + else + insns = &p->cheetah_safari[0]; + break; + case hypervisor: + insns = &p->sun4v[0]; + break; + default: + prom_printf("Unknown cpu type, halting.\n"); + prom_halt(); + }; + + *(unsigned int *) (addr + 0) = insns[0]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 0)); + + *(unsigned int *) (addr + 4) = insns[1]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 4)); + + *(unsigned int *) (addr + 8) = insns[2]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 8)); + + *(unsigned int *) (addr + 12) = insns[3]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 12)); + + p++; + } +#endif +} + +static void __init sun4v_patch(void) +{ + struct sun4v_1insn_patch_entry *p1; + struct sun4v_2insn_patch_entry *p2; + + if (tlb_type != hypervisor) + return; + + p1 = &__sun4v_1insn_patch; + while (p1 < &__sun4v_1insn_patch_end) { + unsigned long addr = p1->addr; + + *(unsigned int *) (addr + 0) = p1->insn; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 0)); + + p1++; + } + + p2 = &__sun4v_2insn_patch; + while (p2 < &__sun4v_2insn_patch_end) { + unsigned long addr = p2->addr; + + *(unsigned int *) (addr + 0) = p2->insns[0]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 0)); + + *(unsigned int *) (addr + 3) = p2->insns[1]; + wmb(); + __asm__ __volatile__("flush %0" : : "r" (addr + 4)); + + p2++; + } +} + void __init setup_arch(char **cmdline_p) { /* Initialize PROM console and command line. */ @@ -507,6 +601,13 @@ void __init setup_arch(char **cmdline_p) /* Work out if we are starfire early on */ check_if_starfire(); + /* Now we know enough to patch the get_cpuid sequences + * used by trap code. + */ + per_cpu_patch(); + + sun4v_patch(); + boot_flags_init(*cmdline_p); idprom_init(); @@ -543,6 +644,9 @@ void __init setup_arch(char **cmdline_p) #endif paging_init(); + + /* Get boot processor trap_block[] setup. */ + init_cur_cpu_trap(); } static int __init set_preferred_console(void) diff --git a/arch/sparc64/kernel/smp.c b/arch/sparc64/kernel/smp.c index 1fb6323..24d775c 100644 --- a/arch/sparc64/kernel/smp.c +++ b/arch/sparc64/kernel/smp.c @@ -38,6 +38,7 @@ #include #include #include +#include extern void calibrate_delay(void); @@ -87,10 +88,6 @@ void __init smp_store_cpu_info(int id) cpu_data(id).clock_tick = prom_getintdefault(cpu_node, "clock-frequency", 0); - cpu_data(id).pgcache_size = 0; - cpu_data(id).pte_cache[0] = NULL; - cpu_data(id).pte_cache[1] = NULL; - cpu_data(id).pgd_cache = NULL; cpu_data(id).idle_volume = 1; cpu_data(id).dcache_size = prom_getintdefault(cpu_node, "dcache-size", @@ -119,28 +116,14 @@ static void smp_setup_percpu_timer(void) static volatile unsigned long callin_flag = 0; -extern void inherit_locked_prom_mappings(int save_p); - -static inline void cpu_setup_percpu_base(unsigned long cpu_id) -{ - __asm__ __volatile__("mov %0, %%g5\n\t" - "stxa %0, [%1] %2\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (__per_cpu_offset(cpu_id)), - "r" (TSB_REG), "i" (ASI_IMMU)); -} - void __init smp_callin(void) { int cpuid = hard_smp_processor_id(); - inherit_locked_prom_mappings(0); + __local_per_cpu_offset = __per_cpu_offset(cpuid); __flush_tlb_all(); - cpu_setup_percpu_base(cpuid); - smp_setup_percpu_timer(); if (cheetah_pcache_forced_on) @@ -441,7 +424,7 @@ static __inline__ void spitfire_xcall_de static void cheetah_xcall_deliver(u64 data0, u64 data1, u64 data2, cpumask_t mask) { u64 pstate, ver; - int nack_busy_id, is_jalapeno; + int nack_busy_id, is_jbus; if (cpus_empty(mask)) return; @@ -451,7 +434,8 @@ static void cheetah_xcall_deliver(u64 da * derivative processor. */ __asm__ ("rdpr %%ver, %0" : "=r" (ver)); - is_jalapeno = ((ver >> 32) == 0x003e0016); + is_jbus = ((ver >> 32) == __JALAPENO_ID || + (ver >> 32) == __SERRANO_ID); __asm__ __volatile__("rdpr %%pstate, %0" : "=r" (pstate)); @@ -476,7 +460,7 @@ retry: for_each_cpu_mask(i, mask) { u64 target = (i << 14) | 0x70; - if (!is_jalapeno) + if (!is_jbus) target |= (nack_busy_id << 24); __asm__ __volatile__( "stxa %%g0, [%0] %1\n\t" @@ -529,7 +513,7 @@ retry: for_each_cpu_mask(i, mask) { u64 check_mask; - if (is_jalapeno) + if (is_jbus) check_mask = (0x2UL << (2*i)); else check_mask = (0x2UL << @@ -544,6 +528,11 @@ retry: } } +static void hypervisor_xcall_deliver(u64 data0, u64 data1, u64 data2, cpumask_t mask) +{ + /* XXX implement me */ +} + /* Send cross call to all processors mentioned in MASK * except self. */ @@ -557,8 +546,10 @@ static void smp_cross_call_masked(unsign if (tlb_type == spitfire) spitfire_xcall_deliver(data0, data1, data2, mask); - else + else if (tlb_type == cheetah || tlb_type == cheetah_plus) cheetah_xcall_deliver(data0, data1, data2, mask); + else + hypervisor_xcall_deliver(data0, data1, data2, mask); /* NOTE: Caller runs local copy on master. */ put_cpu(); @@ -594,11 +585,11 @@ extern unsigned long xcall_call_function * You must not call this function with disabled interrupts or from a * hardware interrupt handler or from a bottom half handler. */ -int smp_call_function(void (*func)(void *info), void *info, - int nonatomic, int wait) +static int smp_call_function_mask(void (*func)(void *info), void *info, + int nonatomic, int wait, cpumask_t mask) { struct call_data_struct data; - int cpus = num_online_cpus() - 1; + int cpus = cpus_weight(mask) - 1; long timeout; if (!cpus) @@ -616,7 +607,7 @@ int smp_call_function(void (*func)(void call_data = &data; - smp_cross_call(&xcall_call_function, 0, 0, 0); + smp_cross_call_masked(&xcall_call_function, 0, 0, 0, mask); /* * Wait for other cpus to complete function or at @@ -642,6 +633,13 @@ out_timeout: return 0; } +int smp_call_function(void (*func)(void *info), void *info, + int nonatomic, int wait) +{ + return smp_call_function_mask(func, info, nonatomic, wait, + cpu_online_map); +} + void smp_call_function_client(int irq, struct pt_regs *regs) { void (*func) (void *info) = call_data->func; @@ -659,11 +657,22 @@ void smp_call_function_client(int irq, s } } +static void tsb_sync(void *info) +{ + struct mm_struct *mm = info; + + if (current->active_mm == mm) + tsb_context_switch(mm); +} + +void smp_tsb_sync(struct mm_struct *mm) +{ + smp_call_function_mask(tsb_sync, mm, 0, 1, mm->cpu_vm_mask); +} + extern unsigned long xcall_flush_tlb_mm; extern unsigned long xcall_flush_tlb_pending; extern unsigned long xcall_flush_tlb_kernel_range; -extern unsigned long xcall_flush_tlb_all_spitfire; -extern unsigned long xcall_flush_tlb_all_cheetah; extern unsigned long xcall_report_regs; extern unsigned long xcall_receive_signal; @@ -693,11 +702,17 @@ static __inline__ void __local_flush_dca void smp_flush_dcache_page_impl(struct page *page, int cpu) { cpumask_t mask = cpumask_of_cpu(cpu); - int this_cpu = get_cpu(); + int this_cpu; + + if (tlb_type == hypervisor) + return; #ifdef CONFIG_DEBUG_DCFLUSH atomic_inc(&dcpage_flushes); #endif + + this_cpu = get_cpu(); + if (cpu == this_cpu) { __local_flush_dcache_page(page); } else if (cpu_online(cpu)) { @@ -713,7 +728,7 @@ void smp_flush_dcache_page_impl(struct p __pa(pg_addr), (u64) pg_addr, mask); - } else { + } else if (tlb_type == cheetah || tlb_type == cheetah_plus) { #ifdef DCACHE_ALIASING_POSSIBLE data0 = ((u64)&xcall_flush_dcache_page_cheetah); @@ -735,7 +750,12 @@ void flush_dcache_page_all(struct mm_str void *pg_addr = page_address(page); cpumask_t mask = cpu_online_map; u64 data0; - int this_cpu = get_cpu(); + int this_cpu; + + if (tlb_type == hypervisor) + return; + + this_cpu = get_cpu(); cpu_clear(this_cpu, mask); @@ -752,7 +772,7 @@ void flush_dcache_page_all(struct mm_str __pa(pg_addr), (u64) pg_addr, mask); - } else { + } else if (tlb_type == cheetah || tlb_type == cheetah_plus) { #ifdef DCACHE_ALIASING_POSSIBLE data0 = ((u64)&xcall_flush_dcache_page_cheetah); cheetah_xcall_deliver(data0, @@ -778,8 +798,10 @@ void smp_receive_signal(int cpu) if (tlb_type == spitfire) spitfire_xcall_deliver(data0, 0, 0, mask); - else + else if (tlb_type == cheetah || tlb_type == cheetah_plus) cheetah_xcall_deliver(data0, 0, 0, mask); + else if (tlb_type == hypervisor) + hypervisor_xcall_deliver(data0, 0, 0, mask); } } @@ -794,15 +816,6 @@ void smp_report_regs(void) smp_cross_call(&xcall_report_regs, 0, 0, 0); } -void smp_flush_tlb_all(void) -{ - if (tlb_type == spitfire) - smp_cross_call(&xcall_flush_tlb_all_spitfire, 0, 0, 0); - else - smp_cross_call(&xcall_flush_tlb_all_cheetah, 0, 0, 0); - __flush_tlb_all(); -} - /* We know that the window frames of the user have been flushed * to the stack before we get here because all callers of us * are flush_tlb_*() routines, and these run after flush_cache_*() @@ -944,24 +957,19 @@ void smp_release(void) * can service tlb flush xcalls... */ extern void prom_world(int); -extern void save_alternate_globals(unsigned long *); -extern void restore_alternate_globals(unsigned long *); + void smp_penguin_jailcell(int irq, struct pt_regs *regs) { - unsigned long global_save[24]; - clear_softint(1 << irq); preempt_disable(); __asm__ __volatile__("flushw"); - save_alternate_globals(global_save); prom_world(1); atomic_inc(&smp_capture_registry); membar_storeload_storestore(); while (penguins_are_doing_time) rmb(); - restore_alternate_globals(global_save); atomic_dec(&smp_capture_registry); prom_world(0); @@ -1107,12 +1115,15 @@ void __init smp_prepare_cpus(unsigned in void __devinit smp_prepare_boot_cpu(void) { - if (hard_smp_processor_id() >= NR_CPUS) { + int cpu = hard_smp_processor_id(); + + if (cpu >= NR_CPUS) { prom_printf("Serious problem, boot cpu id >= NR_CPUS\n"); prom_halt(); } - current_thread_info()->cpu = hard_smp_processor_id(); + current_thread_info()->cpu = cpu; + __local_per_cpu_offset = __per_cpu_offset(cpu); cpu_set(smp_processor_id(), cpu_online_map); cpu_set(smp_processor_id(), phys_cpu_present_map); @@ -1173,12 +1184,9 @@ void __init setup_per_cpu_areas(void) { unsigned long goal, size, i; char *ptr; - /* Created by linker magic */ - extern char __per_cpu_start[], __per_cpu_end[]; /* Copy section for each CPU (we discard the original) */ - goal = ALIGN(__per_cpu_end - __per_cpu_start, PAGE_SIZE); - + goal = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES); #ifdef CONFIG_MODULES if (goal < PERCPU_ENOUGH_ROOM) goal = PERCPU_ENOUGH_ROOM; @@ -1187,31 +1195,10 @@ void __init setup_per_cpu_areas(void) for (size = 1UL; size < goal; size <<= 1UL) __per_cpu_shift++; - /* Make sure the resulting __per_cpu_base value - * will fit in the 43-bit sign extended IMMU - * TSB register. - */ - ptr = __alloc_bootmem(size * NR_CPUS, PAGE_SIZE, - (unsigned long) __per_cpu_start); + ptr = alloc_bootmem(size * NR_CPUS); __per_cpu_base = ptr - __per_cpu_start; - if ((__per_cpu_shift < PAGE_SHIFT) || - (__per_cpu_base & ~PAGE_MASK) || - (__per_cpu_base != (((long) __per_cpu_base << 20) >> 20))) { - prom_printf("PER_CPU: Invalid layout, " - "ptr[%p] shift[%lx] base[%lx]\n", - ptr, __per_cpu_shift, __per_cpu_base); - prom_halt(); - } - for (i = 0; i < NR_CPUS; i++, ptr += size) memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start); - - /* Finally, load in the boot cpu's base value. - * We abuse the IMMU TSB register for trap handler - * entry and exit loading of %g5. That is why it - * has to be page aligned. - */ - cpu_setup_percpu_base(hard_smp_processor_id()); } diff --git a/arch/sparc64/kernel/sparc64_ksyms.c b/arch/sparc64/kernel/sparc64_ksyms.c index 3c06bfb..f1f0137 100644 --- a/arch/sparc64/kernel/sparc64_ksyms.c +++ b/arch/sparc64/kernel/sparc64_ksyms.c @@ -241,10 +241,6 @@ EXPORT_SYMBOL(verify_compat_iovec); #endif EXPORT_SYMBOL(dump_fpu); -EXPORT_SYMBOL(pte_alloc_one_kernel); -#ifndef CONFIG_SMP -EXPORT_SYMBOL(pgt_quicklists); -#endif EXPORT_SYMBOL(put_fs_struct); /* math-emu wants this */ diff --git a/arch/sparc64/kernel/sun4v_tlb_miss.S b/arch/sparc64/kernel/sun4v_tlb_miss.S new file mode 100644 index 0000000..58ea5dd --- /dev/null +++ b/arch/sparc64/kernel/sun4v_tlb_miss.S @@ -0,0 +1,219 @@ +/* sun4v_tlb_miss.S: Sun4v TLB miss handlers. + * + * Copyright (C) 2006 + */ + + .text + .align 32 + +sun4v_itlb_miss: + /* Load CPU ID into %g3. */ + mov SCRATCHPAD_CPUID, %g1 + ldxa [%g1] ASI_SCRATCHPAD, %g3 + + /* Load UTSB reg into %g1. */ + ldxa [%g1 + %g1] ASI_SCRATCHPAD, %g1 + + /* Load &trap_block[smp_processor_id()] into %g2. */ + sethi %hi(trap_block), %g2 + or %g2, %lo(trap_block), %g2 + sllx %g3, TRAP_BLOCK_SZ_SHIFT, %g3 + add %g2, %g3, %g2 + + /* Create a TAG TARGET, "(vaddr>>22) | (ctx << 48)", in %g6. + * Branch if kernel TLB miss. The kernel TSB and user TSB miss + * code wants the missing virtual address in %g4, so that value + * cannot be modified through the entirety of this handler. + */ + ldx [%g2 + TRAP_PER_CPU_FAULT_INFO + HV_FAULT_I_ADDR_OFFSET], %g4 + ldx [%g2 + TRAP_PER_CPU_FAULT_INFO + HV_FAULT_I_CTX_OFFSET], %g5 + srlx %g4, 22, %g3 + sllx %g5, 48, %g6 + or %g6, %g3, %g6 + brz,pn %g5, kvmap_itlb_4v + nop + + /* Create TSB pointer. This is something like: + * + * index_mask = (512 << (tsb_reg & 0x7UL)) - 1UL; + * tsb_base = tsb_reg & ~0x7UL; + */ + and %g1, 0x7, %g3 + andn %g1, 0x7, %g1 + mov 512, %g7 + sllx %g7, %g3, %g7 + sub %g7, 1, %g7 + + /* TSB index mask is in %g7, tsb base is in %g1. Compute + * the TSB entry pointer into %g1: + * + * tsb_index = ((vaddr >> PAGE_SHIFT) & tsb_mask); + * tsb_ptr = tsb_base + (tsb_index * 16); + */ + srlx %g4, PAGE_SHIFT, %g3 + and %g3, %g7, %g3 + sllx %g3, 4, %g3 + add %g1, %g3, %g1 + + /* Load TSB tag/pte into %g2/%g3 and compare the tag. */ + ldda [%g1] ASI_QUAD_LDD_PHYS, %g2 + cmp %g2, %g6 + sethi %hi(_PAGE_EXEC), %g7 + bne,a,pn %xcc, tsb_miss_page_table_walk + mov FAULT_CODE_ITLB, %g3 + andcc %g3, %g7, %g0 + be,a,pn %xcc, tsb_do_fault + mov FAULT_CODE_ITLB, %g3 + + /* We have a valid entry, make hypervisor call to load + * I-TLB and return from trap. + * + * %g3: PTE + * %g4: vaddr + * %g6: TAG TARGET (only "CTX << 48" part matters) + */ +sun4v_itlb_load: + mov %o0, %g1 ! save %o0 + mov %o1, %g2 ! save %o1 + mov %o2, %g5 ! save %o2 + mov %o3, %g7 ! save %o3 + mov %g4, %o0 ! vaddr + srlx %g6, 48, %o1 ! ctx + mov %g3, %o2 ! PTE + mov HV_MMU_IMMU, %o3 ! flags + ta HV_MMU_MAP_ADDR_TRAP + mov %g1, %o0 ! restore %o0 + mov %g2, %o1 ! restore %o1 + mov %g5, %o2 ! restore %o2 + mov %g7, %o3 ! restore %o3 + + retry + +sun4v_dtlb_miss: + /* Load CPU ID into %g3. */ + mov SCRATCHPAD_CPUID, %g1 + ldxa [%g1] ASI_SCRATCHPAD, %g3 + + /* Load UTSB reg into %g1. */ + ldxa [%g1 + %g1] ASI_SCRATCHPAD, %g1 + + /* Load &trap_block[smp_processor_id()] into %g2. */ + sethi %hi(trap_block), %g2 + or %g2, %lo(trap_block), %g2 + sllx %g3, TRAP_BLOCK_SZ_SHIFT, %g3 + add %g2, %g3, %g2 + + /* Create a TAG TARGET, "(vaddr>>22) | (ctx << 48)", in %g6. + * Branch if kernel TLB miss. The kernel TSB and user TSB miss + * code wants the missing virtual address in %g4, so that value + * cannot be modified through the entirety of this handler. + */ + ldx [%g2 + TRAP_PER_CPU_FAULT_INFO + HV_FAULT_D_ADDR_OFFSET], %g4 + ldx [%g2 + TRAP_PER_CPU_FAULT_INFO + HV_FAULT_D_CTX_OFFSET], %g5 + srlx %g4, 22, %g3 + sllx %g5, 48, %g6 + or %g6, %g3, %g6 + brz,pn %g5, kvmap_dtlb_4v + nop + + /* Create TSB pointer. This is something like: + * + * index_mask = (512 << (tsb_reg & 0x7UL)) - 1UL; + * tsb_base = tsb_reg & ~0x7UL; + */ + and %g1, 0x7, %g3 + andn %g1, 0x7, %g1 + mov 512, %g7 + sllx %g7, %g3, %g7 + sub %g7, 1, %g7 + + /* TSB index mask is in %g7, tsb base is in %g1. Compute + * the TSB entry pointer into %g1: + * + * tsb_index = ((vaddr >> PAGE_SHIFT) & tsb_mask); + * tsb_ptr = tsb_base + (tsb_index * 16); + */ + srlx %g4, PAGE_SHIFT, %g3 + and %g3, %g7, %g3 + sllx %g3, 4, %g3 + add %g1, %g3, %g1 + + /* Load TSB tag/pte into %g2/%g3 and compare the tag. */ + ldda [%g1] ASI_QUAD_LDD_PHYS, %g2 + cmp %g2, %g6 + bne,a,pn %xcc, tsb_miss_page_table_walk + mov FAULT_CODE_ITLB, %g3 + + /* We have a valid entry, make hypervisor call to load + * D-TLB and return from trap. + * + * %g3: PTE + * %g4: vaddr + * %g6: TAG TARGET (only "CTX << 48" part matters) + */ +sun4v_dtlb_load: + mov %o0, %g1 ! save %o0 + mov %o1, %g2 ! save %o1 + mov %o2, %g5 ! save %o2 + mov %o3, %g7 ! save %o3 + mov %g4, %o0 ! vaddr + srlx %g6, 48, %o1 ! ctx + mov %g3, %o2 ! PTE + mov HV_MMU_DMMU, %o3 ! flags + ta HV_MMU_MAP_ADDR_TRAP + mov %g1, %o0 ! restore %o0 + mov %g2, %o1 ! restore %o1 + mov %g5, %o2 ! restore %o2 + mov %g7, %o3 ! restore %o3 + + retry + +sun4v_dtlb_prot: + /* Load CPU ID into %g3. */ + mov SCRATCHPAD_CPUID, %g1 + ldxa [%g1] ASI_SCRATCHPAD, %g3 + + /* Load &trap_block[smp_processor_id()] into %g2. */ + sethi %hi(trap_block), %g2 + or %g2, %lo(trap_block), %g2 + sllx %g3, TRAP_BLOCK_SZ_SHIFT, %g3 + add %g2, %g3, %g2 + + ldx [%g2 + TRAP_PER_CPU_FAULT_INFO + HV_FAULT_D_ADDR_OFFSET], %g5 + rdpr %tl, %g1 + cmp %g1, 1 + bgu,pn %xcc, winfix_trampoline + nop + ba,pt %xcc, sparc64_realfault_common + mov FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4 + +#define BRANCH_ALWAYS 0x10680000 +#define NOP 0x01000000 +#define SUN4V_DO_PATCH(OLD, NEW) \ + sethi %hi(NEW), %g1; \ + or %g1, %lo(NEW), %g1; \ + sethi %hi(OLD), %g2; \ + or %g2, %lo(OLD), %g2; \ + sub %g1, %g2, %g1; \ + sethi %hi(BRANCH_ALWAYS), %g3; \ + srl %g1, 2, %g1; \ + or %g3, %lo(BRANCH_ALWAYS), %g3; \ + or %g3, %g1, %g3; \ + stw %g3, [%g2]; \ + sethi %hi(NOP), %g3; \ + or %g3, %lo(NOP), %g3; \ + stw %g3, [%g2 + 0x4]; \ + flush %g2; + + .globl sun4v_patch_tlb_handlers + .type sun4v_patch_tlb_handlers,#function +sun4v_patch_tlb_handlers: + SUN4V_DO_PATCH(tl0_iamiss, sun4v_itlb_miss) + SUN4V_DO_PATCH(tl1_iamiss, sun4v_itlb_miss) + SUN4V_DO_PATCH(tl0_damiss, sun4v_dtlb_miss) + SUN4V_DO_PATCH(tl1_damiss, sun4v_dtlb_miss) + SUN4V_DO_PATCH(tl0_daprot, sun4v_dtlb_prot) + SUN4V_DO_PATCH(tl1_daprot, sun4v_dtlb_prot) + retl + nop + .size sun4v_patch_tlb_handlers,.-sun4v_patch_tlb_handlers diff --git a/arch/sparc64/kernel/trampoline.S b/arch/sparc64/kernel/trampoline.S index 9478551..18c333f 100644 --- a/arch/sparc64/kernel/trampoline.S +++ b/arch/sparc64/kernel/trampoline.S @@ -287,54 +287,18 @@ do_unlock: wrpr %g0, 0, %wstate wrpr %g0, 0, %tl - /* Setup the trap globals, then we can resurface. */ - rdpr %pstate, %o1 - mov %g6, %o2 - wrpr %o1, PSTATE_AG, %pstate + /* Load TBA, then we can resurface. */ sethi %hi(sparc64_ttable_tl0), %g5 wrpr %g5, %tba - mov %o2, %g6 - - wrpr %o1, PSTATE_MG, %pstate -#define KERN_HIGHBITS ((_PAGE_VALID|_PAGE_SZ4MB)^0xfffff80000000000) -#define KERN_LOWBITS (_PAGE_CP | _PAGE_CV | _PAGE_P | _PAGE_W) - - mov TSB_REG, %g1 - stxa %g0, [%g1] ASI_DMMU - membar #Sync - mov TLB_SFSR, %g1 - sethi %uhi(KERN_HIGHBITS), %g2 - or %g2, %ulo(KERN_HIGHBITS), %g2 - sllx %g2, 32, %g2 - or %g2, KERN_LOWBITS, %g2 - BRANCH_IF_ANY_CHEETAH(g3,g7,9f) - - ba,pt %xcc, 1f - nop - -9: - sethi %uhi(VPTE_BASE_CHEETAH), %g3 - or %g3, %ulo(VPTE_BASE_CHEETAH), %g3 - ba,pt %xcc, 2f - sllx %g3, 32, %g3 -1: - sethi %uhi(VPTE_BASE_SPITFIRE), %g3 - or %g3, %ulo(VPTE_BASE_SPITFIRE), %g3 - sllx %g3, 32, %g3 - -2: - clr %g7 -#undef KERN_HIGHBITS -#undef KERN_LOWBITS - - wrpr %o1, 0x0, %pstate ldx [%g6 + TI_TASK], %g4 wrpr %g0, 0, %wstate call init_irqwork_curcpu nop + call init_cur_cpu_trap + nop /* Start using proper page size encodings in ctx register. */ sethi %hi(sparc64_kern_pri_context), %g3 diff --git a/arch/sparc64/kernel/traps.c b/arch/sparc64/kernel/traps.c index 8d44ae5..1c4744c 100644 --- a/arch/sparc64/kernel/traps.c +++ b/arch/sparc64/kernel/traps.c @@ -38,6 +38,7 @@ #include #include #include +#include #ifdef CONFIG_KMOD #include #endif @@ -788,7 +789,8 @@ void __init cheetah_ecache_flush_init(vo cheetah_error_log[i].afsr = CHAFSR_INVALID; __asm__ ("rdpr %%ver, %0" : "=r" (ver)); - if ((ver >> 32) == 0x003e0016) { + if ((ver >> 32) == __JALAPENO_ID || + (ver >> 32) == __SERRANO_ID) { cheetah_error_table = &__jalapeno_error_table[0]; cheetah_afsr_errors = JPAFSR_ERRORS; } else if ((ver >> 32) == 0x003e0015) { @@ -2130,7 +2132,22 @@ void do_getpsr(struct pt_regs *regs) } } +struct trap_per_cpu trap_block[NR_CPUS]; + +/* This can get invoked before sched_init() so play it super safe + * and use hard_smp_processor_id(). + */ +void init_cur_cpu_trap(void) +{ + int cpu = hard_smp_processor_id(); + struct trap_per_cpu *p = &trap_block[cpu]; + + p->thread = current_thread_info(); + p->pgd_paddr = 0; +} + extern void thread_info_offsets_are_bolixed_dave(void); +extern void trap_per_cpu_offsets_are_bolixed_dave(void); /* Only invoked on boot processor. */ void __init trap_init(void) @@ -2154,7 +2171,6 @@ void __init trap_init(void) TI_KERN_CNTD0 != offsetof(struct thread_info, kernel_cntd0) || TI_KERN_CNTD1 != offsetof(struct thread_info, kernel_cntd1) || TI_PCR != offsetof(struct thread_info, pcr_reg) || - TI_CEE_STUFF != offsetof(struct thread_info, cee_stuff) || TI_PRE_COUNT != offsetof(struct thread_info, preempt_count) || TI_NEW_CHILD != offsetof(struct thread_info, new_child) || TI_SYS_NOERROR != offsetof(struct thread_info, syscall_noerror) || @@ -2165,6 +2181,10 @@ void __init trap_init(void) (TI_FPREGS & (64 - 1))) thread_info_offsets_are_bolixed_dave(); + if (TRAP_PER_CPU_THREAD != offsetof(struct trap_per_cpu, thread) || + TRAP_PER_CPU_PGD_PADDR != offsetof(struct trap_per_cpu, pgd_paddr)) + trap_per_cpu_offsets_are_bolixed_dave(); + /* Attach to the address space of init_task. On SMP we * do this in smp.c:smp_callin for other cpus. */ diff --git a/arch/sparc64/kernel/tsb.S b/arch/sparc64/kernel/tsb.S new file mode 100644 index 0000000..819a6ef --- /dev/null +++ b/arch/sparc64/kernel/tsb.S @@ -0,0 +1,284 @@ +/* tsb.S: Sparc64 TSB table handling. + * + * Copyright (C) 2006 David S. Miller + */ + +#include + + .text + .align 32 + + /* Invoked from TLB miss handler, we are in the + * MMU global registers and they are setup like + * this: + * + * %g1: TSB entry pointer + * %g2: available temporary + * %g3: FAULT_CODE_{D,I}TLB + * %g4: available temporary + * %g5: available temporary + * %g6: TAG TARGET + * %g7: available temporary, will be loaded by us with + * the physical address base of the linux page + * tables for the current address space + */ +tsb_miss_dtlb: + mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_DMMU, %g4 + ba,pt %xcc, tsb_miss_page_table_walk + nop + +tsb_miss_itlb: + mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_IMMU, %g4 + ba,pt %xcc, tsb_miss_page_table_walk + nop + + /* The sun4v TLB miss handlers jump directly here instead + * of tsb_miss_{d,i}tlb with the missing virtual address + * already loaded into %g4. + */ +tsb_miss_page_table_walk: + TRAP_LOAD_PGD_PHYS(%g7, %g5) + + USER_PGTABLE_WALK_TL1(%g4, %g7, %g5, %g2, tsb_do_fault) + +tsb_reload: + TSB_LOCK_TAG(%g1, %g2, %g7) + + /* Load and check PTE. */ + ldxa [%g5] ASI_PHYS_USE_EC, %g5 + brgez,a,pn %g5, tsb_do_fault + TSB_STORE(%g1, %g0) + + /* If it is larger than the base page size, don't + * bother putting it into the TSB. + */ + srlx %g5, 32, %g2 + sethi %hi(_PAGE_ALL_SZ_BITS >> 32), %g7 + and %g2, %g7, %g2 + sethi %hi(_PAGE_SZBITS >> 32), %g7 + cmp %g2, %g7 + bne,a,pn %xcc, tsb_tlb_reload + TSB_STORE(%g1, %g0) + + TSB_WRITE(%g1, %g5, %g6) + + /* Finally, load TLB and return from trap. */ +tsb_tlb_reload: + cmp %g3, FAULT_CODE_DTLB + bne,pn %xcc, tsb_itlb_load + nop + +tsb_dtlb_load: + +661: stxa %g5, [%g0] ASI_DTLB_DATA_IN + retry + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + /* For sun4v the ASI_DTLB_DATA_IN store and the retry + * instruction get nop'd out and we get here to branch + * to the sun4v tlb load code. The registers are setup + * as follows: + * + * %g4: vaddr + * %g5: PTE + * %g6: TAG + * + * The sun4v TLB load wants the PTE in %g3 so we fix that + * up here. + */ + ba,pt %xcc, sun4v_dtlb_load + mov %g5, %g3 + +tsb_itlb_load: + +661: stxa %g5, [%g0] ASI_ITLB_DATA_IN + retry + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + /* For sun4v the ASI_ITLB_DATA_IN store and the retry + * instruction get nop'd out and we get here to branch + * to the sun4v tlb load code. The registers are setup + * as follows: + * + * %g4: vaddr + * %g5: PTE + * %g6: TAG + * + * The sun4v TLB load wants the PTE in %g3 so we fix that + * up here. + */ + ba,pt %xcc, sun4v_itlb_load + mov %g5, %g3 + + /* No valid entry in the page tables, do full fault + * processing. + */ + + .globl tsb_do_fault +tsb_do_fault: + cmp %g3, FAULT_CODE_DTLB + +661: rdpr %pstate, %g5 + wrpr %g5, PSTATE_AG | PSTATE_MG, %pstate + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + bne,pn %xcc, tsb_do_itlb_fault + nop + +tsb_do_dtlb_fault: + rdpr %tl, %g3 + cmp %g3, 1 + +661: mov TLB_TAG_ACCESS, %g4 + ldxa [%g4] ASI_DMMU, %g5 + .section .sun4v_2insn_patch, "ax" + .word 661b + mov %g4, %g5 + nop + .previous + + be,pt %xcc, sparc64_realfault_common + mov FAULT_CODE_DTLB, %g4 + ba,pt %xcc, winfix_trampoline + nop + +tsb_do_itlb_fault: + rdpr %tpc, %g5 + ba,pt %xcc, sparc64_realfault_common + mov FAULT_CODE_ITLB, %g4 + + .globl sparc64_realfault_common +sparc64_realfault_common: + /* fault code in %g4, fault address in %g5, etrap will + * preserve these two values in %l4 and %l5 respectively + */ + ba,pt %xcc, etrap ! Save trap state +1: rd %pc, %g7 ! ... + stb %l4, [%g6 + TI_FAULT_CODE] ! Save fault code + stx %l5, [%g6 + TI_FAULT_ADDR] ! Save fault address + call do_sparc64_fault ! Call fault handler + add %sp, PTREGS_OFF, %o0 ! Compute pt_regs arg + ba,pt %xcc, rtrap_clr_l6 ! Restore cpu state + nop ! Delay slot (fill me) + +winfix_trampoline: + rdpr %tpc, %g3 ! Prepare winfixup TNPC + or %g3, 0x7c, %g3 ! Compute branch offset + wrpr %g3, %tnpc ! Write it into TNPC + done ! Trap return + + /* Insert an entry into the TSB. + * + * %o0: TSB entry pointer (virt or phys address) + * %o1: tag + * %o2: pte + */ + .align 32 + .globl __tsb_insert +__tsb_insert: + rdpr %pstate, %o5 + wrpr %o5, PSTATE_IE, %pstate + TSB_LOCK_TAG(%o0, %g2, %g3) + TSB_WRITE(%o0, %o2, %o1) + wrpr %o5, %pstate + retl + nop + + /* Flush the given TSB entry if it has the matching + * tag. + * + * %o0: TSB entry pointer (virt or phys address) + * %o1: tag + */ + .align 32 + .globl tsb_flush +tsb_flush: + sethi %hi(TSB_TAG_LOCK_HIGH), %g2 +1: TSB_LOAD_TAG(%o0, %g1) + srlx %g1, 32, %o3 + andcc %o3, %g2, %g0 + bne,pn %icc, 1b + membar #LoadLoad + cmp %g1, %o1 + bne,pt %xcc, 2f + clr %o3 + TSB_CAS_TAG(%o0, %g1, %o3) + cmp %g1, %o3 + bne,pn %xcc, 1b + nop +2: retl + TSB_MEMBAR + + /* Reload MMU related context switch state at + * schedule() time. + * + * %o0: page table physical address + * %o1: TSB register value + * %o2: TSB virtual address + * %o3: TSB mapping locked PTE + * + * We have to run this whole thing with interrupts + * disabled so that the current cpu doesn't change + * due to preemption. + */ + .align 32 + .globl __tsb_context_switch +__tsb_context_switch: + rdpr %pstate, %o5 + wrpr %o5, PSTATE_IE, %pstate + + ldub [%g6 + TI_CPU], %g1 + sethi %hi(trap_block), %g2 + sllx %g1, TRAP_BLOCK_SZ_SHIFT, %g1 + or %g2, %lo(trap_block), %g2 + add %g2, %g1, %g2 + stx %o0, [%g2 + TRAP_PER_CPU_PGD_PADDR] + +661: mov TSB_REG, %g1 + stxa %o1, [%g1] ASI_DMMU + .section .sun4v_2insn_patch, "ax" + .word 661b + mov SCRATCHPAD_UTSBREG1, %g1 + stxa %o1, [%g1] ASI_SCRATCHPAD + .previous + + membar #Sync + +661: stxa %o1, [%g1] ASI_IMMU + membar #Sync + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + + brz %o2, 9f + nop + + sethi %hi(sparc64_highest_unlocked_tlb_ent), %o4 + mov TLB_TAG_ACCESS, %g1 + lduw [%o4 + %lo(sparc64_highest_unlocked_tlb_ent)], %g2 + stxa %o2, [%g1] ASI_DMMU + membar #Sync + sllx %g2, 3, %g2 + stxa %o3, [%g2] ASI_DTLB_DATA_ACCESS + membar #Sync +9: + wrpr %o5, %pstate + + retl + nop diff --git a/arch/sparc64/kernel/ttable.S b/arch/sparc64/kernel/ttable.S index 8365bc1..2679b6e 100644 --- a/arch/sparc64/kernel/ttable.S +++ b/arch/sparc64/kernel/ttable.S @@ -78,9 +78,9 @@ tl0_vaw: TRAP(do_vaw) tl0_cee: membar #Sync TRAP_NOSAVE_7INSNS(__spitfire_cee_trap) tl0_iamiss: -#include "itlb_base.S" +#include "itlb_miss.S" tl0_damiss: -#include "dtlb_base.S" +#include "dtlb_miss.S" tl0_daprot: #include "dtlb_prot.S" tl0_fecc: BTRAP(0x70) /* Fast-ECC on Cheetah */ @@ -92,11 +92,11 @@ tl0_resv07c: BTRAP(0x7c) BTRAP(0x7d) BTR tl0_s0n: SPILL_0_NORMAL tl0_s1n: SPILL_1_NORMAL tl0_s2n: SPILL_2_NORMAL -tl0_s3n: SPILL_3_NORMAL -tl0_s4n: SPILL_4_NORMAL -tl0_s5n: SPILL_5_NORMAL -tl0_s6n: SPILL_6_NORMAL -tl0_s7n: SPILL_7_NORMAL +tl0_s3n: SPILL_0_NORMAL_ETRAP +tl0_s4n: SPILL_1_GENERIC_ETRAP +tl0_s5n: SPILL_1_GENERIC_ETRAP_FIXUP +tl0_s6n: SPILL_2_GENERIC_ETRAP +tl0_s7n: SPILL_2_GENERIC_ETRAP_FIXUP tl0_s0o: SPILL_0_OTHER tl0_s1o: SPILL_1_OTHER tl0_s2o: SPILL_2_OTHER @@ -110,9 +110,9 @@ tl0_f1n: FILL_1_NORMAL tl0_f2n: FILL_2_NORMAL tl0_f3n: FILL_3_NORMAL tl0_f4n: FILL_4_NORMAL -tl0_f5n: FILL_5_NORMAL -tl0_f6n: FILL_6_NORMAL -tl0_f7n: FILL_7_NORMAL +tl0_f5n: FILL_0_NORMAL_RTRAP +tl0_f6n: FILL_1_GENERIC_RTRAP +tl0_f7n: FILL_2_GENERIC_RTRAP tl0_f0o: FILL_0_OTHER tl0_f1o: FILL_1_OTHER tl0_f2o: FILL_2_OTHER @@ -128,7 +128,7 @@ tl0_flushw: FLUSH_WINDOW_TRAP tl0_resv104: BTRAP(0x104) BTRAP(0x105) BTRAP(0x106) BTRAP(0x107) .globl tl0_solaris tl0_solaris: SOLARIS_SYSCALL_TRAP -tl0_netbsd: NETBSD_SYSCALL_TRAP +tl0_resv109: BTRAP(0x109) tl0_resv10a: BTRAP(0x10a) BTRAP(0x10b) BTRAP(0x10c) BTRAP(0x10d) BTRAP(0x10e) tl0_resv10f: BTRAP(0x10f) tl0_linux32: LINUX_32BIT_SYSCALL_TRAP @@ -222,26 +222,10 @@ tl1_resv05c: BTRAPTL1(0x5c) BTRAPTL1(0x5 tl1_ivec: TRAP_IVEC tl1_paw: TRAPTL1(do_paw_tl1) tl1_vaw: TRAPTL1(do_vaw_tl1) - - /* The grotty trick to save %g1 into current->thread.cee_stuff - * is because when we take this trap we could be interrupting - * trap code already using the trap alternate global registers. - * - * We cross our fingers and pray that this store/load does - * not cause yet another CEE trap. - */ -tl1_cee: membar #Sync - stx %g1, [%g6 + TI_CEE_STUFF] - ldxa [%g0] ASI_AFSR, %g1 - membar #Sync - stxa %g1, [%g0] ASI_AFSR - membar #Sync - ldx [%g6 + TI_CEE_STUFF], %g1 - retry - +tl1_cee: BTRAPTL1(0x63) tl1_iamiss: BTRAPTL1(0x64) BTRAPTL1(0x65) BTRAPTL1(0x66) BTRAPTL1(0x67) tl1_damiss: -#include "dtlb_backend.S" +#include "dtlb_miss.S" tl1_daprot: #include "dtlb_prot.S" tl1_fecc: BTRAPTL1(0x70) /* Fast-ECC on Cheetah */ diff --git a/arch/sparc64/kernel/vmlinux.lds.S b/arch/sparc64/kernel/vmlinux.lds.S index 467d13a..b097379 100644 --- a/arch/sparc64/kernel/vmlinux.lds.S +++ b/arch/sparc64/kernel/vmlinux.lds.S @@ -70,6 +70,22 @@ SECTIONS .con_initcall.init : { *(.con_initcall.init) } __con_initcall_end = .; SECURITY_INIT + . = ALIGN(4); + __tsb_ldquad_phys_patch = .; + .tsb_ldquad_phys_patch : { *(.tsb_ldquad_phys_patch) } + __tsb_ldquad_phys_patch_end = .; + __tsb_phys_patch = .; + .tsb_phys_patch : { *(.tsb_phys_patch) } + __tsb_phys_patch_end = .; + __cpuid_patch = .; + .cpuid_patch : { *(.cpuid_patch) } + __cpuid_patch_end = .; + __sun4v_1insn_patch = .; + .sun4v_1insn_patch : { *(.sun4v_1insn_patch) } + __sun4v_1insn_patch_end = .; + __sun4v_2insn_patch = .; + .sun4v_2insn_patch : { *(.sun4v_2insn_patch) } + __sun4v_2insn_patch_end = .; . = ALIGN(8192); __initramfs_start = .; .init.ramfs : { *(.init.ramfs) } diff --git a/arch/sparc64/kernel/winfixup.S b/arch/sparc64/kernel/winfixup.S index 3916092..efe2770 100644 --- a/arch/sparc64/kernel/winfixup.S +++ b/arch/sparc64/kernel/winfixup.S @@ -1,8 +1,6 @@ -/* $Id: winfixup.S,v 1.30 2002/02/09 19:49:30 davem Exp $ +/* winfixup.S: Handle cases where user stack pointer is found to be bogus. * - * winfixup.S: Handle cases where user stack pointer is found to be bogus. - * - * Copyright (C) 1997 David S. Miller (davem@caip.rutgers.edu) + * Copyright (C) 1997, 2006 David S. Miller (davem@davemloft.net) */ #include @@ -15,374 +13,129 @@ .text -set_pcontext: - sethi %hi(sparc64_kern_pri_context), %l1 - ldx [%l1 + %lo(sparc64_kern_pri_context)], %l1 - mov PRIMARY_CONTEXT, %g1 - stxa %l1, [%g1] ASI_DMMU - flush %g6 - retl - nop + /* It used to be the case that these register window fault + * handlers could run via the save and restore instructions + * done by the trap entry and exit code. They now do the + * window spill/fill by hand, so that case no longer can occur. + */ .align 32 - - /* Here are the rules, pay attention. - * - * The kernel is disallowed from touching user space while - * the trap level is greater than zero, except for from within - * the window spill/fill handlers. This must be followed - * so that we can easily detect the case where we tried to - * spill/fill with a bogus (or unmapped) user stack pointer. - * - * These are layed out in a special way for cache reasons, - * don't touch... - */ - .globl fill_fixup, spill_fixup fill_fixup: - rdpr %tstate, %g1 - andcc %g1, TSTATE_PRIV, %g0 - or %g4, FAULT_CODE_WINFIXUP, %g4 - be,pt %xcc, window_scheisse_from_user_common - and %g1, TSTATE_CWP, %g1 - - /* This is the extremely complex case, but it does happen from - * time to time if things are just right. Essentially the restore - * done in rtrap right before going back to user mode, with tl=1 - * and that levels trap stack registers all setup, took a fill trap, - * the user stack was not mapped in the tlb, and tlb miss occurred, - * the pte found was not valid, and a simple ref bit watch update - * could not satisfy the miss, so we got here. - * - * We must carefully unwind the state so we get back to tl=0, preserve - * all the register values we were going to give to the user. Luckily - * most things are where they need to be, we also have the address - * which triggered the fault handy as well. - * - * Also note that we must preserve %l5 and %l6. If the user was - * returning from a system call, we must make it look this way - * after we process the fill fault on the users stack. - * - * First, get into the window where the original restore was executed. - */ - - rdpr %wstate, %g2 ! Grab user mode wstate. - wrpr %g1, %cwp ! Get into the right window. - sll %g2, 3, %g2 ! NORMAL-->OTHER - - wrpr %g0, 0x0, %canrestore ! Standard etrap stuff. - wrpr %g2, 0x0, %wstate ! This must be consistent. - wrpr %g0, 0x0, %otherwin ! We know this. - call set_pcontext ! Change contexts... + TRAP_LOAD_THREAD_REG(%g6, %g1) + rdpr %tstate, %g1 + and %g1, TSTATE_CWP, %g1 + or %g4, FAULT_CODE_WINFIXUP, %g4 + stb %g4, [%g6 + TI_FAULT_CODE] + stx %g5, [%g6 + TI_FAULT_ADDR] + wrpr %g1, %cwp + ba,pt %xcc, etrap + rd %pc, %g7 + call do_sparc64_fault + add %sp, PTREGS_OFF, %o0 + ba,pt %xcc, rtrap_clr_l6 nop - rdpr %pstate, %l1 ! Prepare to change globals. - mov %g6, %o7 ! Get current. - - andn %l1, PSTATE_MM, %l1 ! We want to be in RMO - stb %g4, [%g6 + TI_FAULT_CODE] - stx %g5, [%g6 + TI_FAULT_ADDR] - wrpr %g0, 0x0, %tl ! Out of trap levels. - wrpr %l1, (PSTATE_IE | PSTATE_AG | PSTATE_RMO), %pstate - mov %o7, %g6 - ldx [%g6 + TI_TASK], %g4 -#ifdef CONFIG_SMP - mov TSB_REG, %g1 - ldxa [%g1] ASI_IMMU, %g5 -#endif - /* This is the same as below, except we handle this a bit special - * since we must preserve %l5 and %l6, see comment above. - */ - call do_sparc64_fault - add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop ! yes, nop is correct - - /* Be very careful about usage of the alternate globals here. - * You cannot touch %g4/%g5 as that has the fault information - * should this be from usermode. Also be careful for the case - * where we get here from the save instruction in etrap.S when - * coming from either user or kernel (does not matter which, it - * is the same problem in both cases). Essentially this means - * do not touch %g7 or %g2 so we handle the two cases fine. + /* Be very careful about usage of the trap globals here. + * You cannot touch %g5 as that has the fault information. */ spill_fixup: - ldx [%g6 + TI_FLAGS], %g1 - andcc %g1, _TIF_32BIT, %g0 - ldub [%g6 + TI_WSAVED], %g1 - - sll %g1, 3, %g3 - add %g6, %g3, %g3 - stx %sp, [%g3 + TI_RWIN_SPTRS] - sll %g1, 7, %g3 - bne,pt %xcc, 1f - add %g6, %g3, %g3 - stx %l0, [%g3 + TI_REG_WINDOW + 0x00] - stx %l1, [%g3 + TI_REG_WINDOW + 0x08] - - stx %l2, [%g3 + TI_REG_WINDOW + 0x10] - stx %l3, [%g3 + TI_REG_WINDOW + 0x18] - stx %l4, [%g3 + TI_REG_WINDOW + 0x20] - stx %l5, [%g3 + TI_REG_WINDOW + 0x28] - stx %l6, [%g3 + TI_REG_WINDOW + 0x30] - stx %l7, [%g3 + TI_REG_WINDOW + 0x38] - stx %i0, [%g3 + TI_REG_WINDOW + 0x40] - stx %i1, [%g3 + TI_REG_WINDOW + 0x48] - - stx %i2, [%g3 + TI_REG_WINDOW + 0x50] - stx %i3, [%g3 + TI_REG_WINDOW + 0x58] - stx %i4, [%g3 + TI_REG_WINDOW + 0x60] - stx %i5, [%g3 + TI_REG_WINDOW + 0x68] - stx %i6, [%g3 + TI_REG_WINDOW + 0x70] - b,pt %xcc, 2f - stx %i7, [%g3 + TI_REG_WINDOW + 0x78] -1: stw %l0, [%g3 + TI_REG_WINDOW + 0x00] - - stw %l1, [%g3 + TI_REG_WINDOW + 0x04] - stw %l2, [%g3 + TI_REG_WINDOW + 0x08] - stw %l3, [%g3 + TI_REG_WINDOW + 0x0c] - stw %l4, [%g3 + TI_REG_WINDOW + 0x10] - stw %l5, [%g3 + TI_REG_WINDOW + 0x14] - stw %l6, [%g3 + TI_REG_WINDOW + 0x18] - stw %l7, [%g3 + TI_REG_WINDOW + 0x1c] - stw %i0, [%g3 + TI_REG_WINDOW + 0x20] - - stw %i1, [%g3 + TI_REG_WINDOW + 0x24] - stw %i2, [%g3 + TI_REG_WINDOW + 0x28] - stw %i3, [%g3 + TI_REG_WINDOW + 0x2c] - stw %i4, [%g3 + TI_REG_WINDOW + 0x30] - stw %i5, [%g3 + TI_REG_WINDOW + 0x34] - stw %i6, [%g3 + TI_REG_WINDOW + 0x38] - stw %i7, [%g3 + TI_REG_WINDOW + 0x3c] -2: add %g1, 1, %g1 - - stb %g1, [%g6 + TI_WSAVED] - rdpr %tstate, %g1 - andcc %g1, TSTATE_PRIV, %g0 +spill_fixup_mna: +spill_fixup_dax: + TRAP_LOAD_THREAD_REG(%g6, %g1) + ldx [%g6 + TI_FLAGS], %g1 + andcc %g1, _TIF_32BIT, %g0 + ldub [%g6 + TI_WSAVED], %g1 + sll %g1, 3, %g3 + add %g6, %g3, %g3 + stx %sp, [%g3 + TI_RWIN_SPTRS] + sll %g1, 7, %g3 + bne,pt %xcc, 1f + add %g6, %g3, %g3 + stx %l0, [%g3 + TI_REG_WINDOW + 0x00] + stx %l1, [%g3 + TI_REG_WINDOW + 0x08] + stx %l2, [%g3 + TI_REG_WINDOW + 0x10] + stx %l3, [%g3 + TI_REG_WINDOW + 0x18] + stx %l4, [%g3 + TI_REG_WINDOW + 0x20] + stx %l5, [%g3 + TI_REG_WINDOW + 0x28] + stx %l6, [%g3 + TI_REG_WINDOW + 0x30] + stx %l7, [%g3 + TI_REG_WINDOW + 0x38] + stx %i0, [%g3 + TI_REG_WINDOW + 0x40] + stx %i1, [%g3 + TI_REG_WINDOW + 0x48] + stx %i2, [%g3 + TI_REG_WINDOW + 0x50] + stx %i3, [%g3 + TI_REG_WINDOW + 0x58] + stx %i4, [%g3 + TI_REG_WINDOW + 0x60] + stx %i5, [%g3 + TI_REG_WINDOW + 0x68] + stx %i6, [%g3 + TI_REG_WINDOW + 0x70] + ba,pt %xcc, 2f + stx %i7, [%g3 + TI_REG_WINDOW + 0x78] +1: stw %l0, [%g3 + TI_REG_WINDOW + 0x00] + stw %l1, [%g3 + TI_REG_WINDOW + 0x04] + stw %l2, [%g3 + TI_REG_WINDOW + 0x08] + stw %l3, [%g3 + TI_REG_WINDOW + 0x0c] + stw %l4, [%g3 + TI_REG_WINDOW + 0x10] + stw %l5, [%g3 + TI_REG_WINDOW + 0x14] + stw %l6, [%g3 + TI_REG_WINDOW + 0x18] + stw %l7, [%g3 + TI_REG_WINDOW + 0x1c] + stw %i0, [%g3 + TI_REG_WINDOW + 0x20] + stw %i1, [%g3 + TI_REG_WINDOW + 0x24] + stw %i2, [%g3 + TI_REG_WINDOW + 0x28] + stw %i3, [%g3 + TI_REG_WINDOW + 0x2c] + stw %i4, [%g3 + TI_REG_WINDOW + 0x30] + stw %i5, [%g3 + TI_REG_WINDOW + 0x34] + stw %i6, [%g3 + TI_REG_WINDOW + 0x38] + stw %i7, [%g3 + TI_REG_WINDOW + 0x3c] +2: add %g1, 1, %g1 + stb %g1, [%g6 + TI_WSAVED] + rdpr %tstate, %g1 + andcc %g1, TSTATE_PRIV, %g0 saved - and %g1, TSTATE_CWP, %g1 - be,pn %xcc, window_scheisse_from_user_common - mov FAULT_CODE_WRITE | FAULT_CODE_DTLB | FAULT_CODE_WINFIXUP, %g4 + be,pn %xcc, 1f + and %g1, TSTATE_CWP, %g1 retry +1: mov FAULT_CODE_WRITE | FAULT_CODE_DTLB | FAULT_CODE_WINFIXUP, %g4 + stb %g4, [%g6 + TI_FAULT_CODE] + stx %g5, [%g6 + TI_FAULT_ADDR] + wrpr %g1, %cwp + ba,pt %xcc, etrap + rd %pc, %g7 + call do_sparc64_fault + add %sp, PTREGS_OFF, %o0 + ba,a,pt %xcc, rtrap_clr_l6 -window_scheisse_from_user_common: - stb %g4, [%g6 + TI_FAULT_CODE] - stx %g5, [%g6 + TI_FAULT_ADDR] - wrpr %g1, %cwp - ba,pt %xcc, etrap - rd %pc, %g7 - call do_sparc64_fault - add %sp, PTREGS_OFF, %o0 - ba,a,pt %xcc, rtrap_clr_l6 - - .globl winfix_mna, fill_fixup_mna, spill_fixup_mna winfix_mna: - andn %g3, 0x7f, %g3 - add %g3, 0x78, %g3 - wrpr %g3, %tnpc + andn %g3, 0x7f, %g3 + add %g3, 0x78, %g3 + wrpr %g3, %tnpc done -fill_fixup_mna: - rdpr %tstate, %g1 - andcc %g1, TSTATE_PRIV, %g0 - be,pt %xcc, window_mna_from_user_common - and %g1, TSTATE_CWP, %g1 - - /* Please, see fill_fixup commentary about why we must preserve - * %l5 and %l6 to preserve absolute correct semantics. - */ - rdpr %wstate, %g2 ! Grab user mode wstate. - wrpr %g1, %cwp ! Get into the right window. - sll %g2, 3, %g2 ! NORMAL-->OTHER - wrpr %g0, 0x0, %canrestore ! Standard etrap stuff. - - wrpr %g2, 0x0, %wstate ! This must be consistent. - wrpr %g0, 0x0, %otherwin ! We know this. - call set_pcontext ! Change contexts... - nop - rdpr %pstate, %l1 ! Prepare to change globals. - mov %g4, %o2 ! Setup args for - mov %g5, %o1 ! final call to mem_address_unaligned. - andn %l1, PSTATE_MM, %l1 ! We want to be in RMO - mov %g6, %o7 ! Stash away current. - wrpr %g0, 0x0, %tl ! Out of trap levels. - wrpr %l1, (PSTATE_IE | PSTATE_AG | PSTATE_RMO), %pstate - mov %o7, %g6 ! Get current back. - ldx [%g6 + TI_TASK], %g4 ! Finish it. -#ifdef CONFIG_SMP - mov TSB_REG, %g1 - ldxa [%g1] ASI_IMMU, %g5 -#endif - call mem_address_unaligned - add %sp, PTREGS_OFF, %o0 - - b,pt %xcc, rtrap - nop ! yes, the nop is correct -spill_fixup_mna: - ldx [%g6 + TI_FLAGS], %g1 - andcc %g1, _TIF_32BIT, %g0 - ldub [%g6 + TI_WSAVED], %g1 - sll %g1, 3, %g3 - add %g6, %g3, %g3 - stx %sp, [%g3 + TI_RWIN_SPTRS] - - sll %g1, 7, %g3 - bne,pt %xcc, 1f - add %g6, %g3, %g3 - stx %l0, [%g3 + TI_REG_WINDOW + 0x00] - stx %l1, [%g3 + TI_REG_WINDOW + 0x08] - stx %l2, [%g3 + TI_REG_WINDOW + 0x10] - stx %l3, [%g3 + TI_REG_WINDOW + 0x18] - stx %l4, [%g3 + TI_REG_WINDOW + 0x20] - - stx %l5, [%g3 + TI_REG_WINDOW + 0x28] - stx %l6, [%g3 + TI_REG_WINDOW + 0x30] - stx %l7, [%g3 + TI_REG_WINDOW + 0x38] - stx %i0, [%g3 + TI_REG_WINDOW + 0x40] - stx %i1, [%g3 + TI_REG_WINDOW + 0x48] - stx %i2, [%g3 + TI_REG_WINDOW + 0x50] - stx %i3, [%g3 + TI_REG_WINDOW + 0x58] - stx %i4, [%g3 + TI_REG_WINDOW + 0x60] - - stx %i5, [%g3 + TI_REG_WINDOW + 0x68] - stx %i6, [%g3 + TI_REG_WINDOW + 0x70] - stx %i7, [%g3 + TI_REG_WINDOW + 0x78] - b,pt %xcc, 2f - add %g1, 1, %g1 -1: std %l0, [%g3 + TI_REG_WINDOW + 0x00] - std %l2, [%g3 + TI_REG_WINDOW + 0x08] - std %l4, [%g3 + TI_REG_WINDOW + 0x10] - - std %l6, [%g3 + TI_REG_WINDOW + 0x18] - std %i0, [%g3 + TI_REG_WINDOW + 0x20] - std %i2, [%g3 + TI_REG_WINDOW + 0x28] - std %i4, [%g3 + TI_REG_WINDOW + 0x30] - std %i6, [%g3 + TI_REG_WINDOW + 0x38] - add %g1, 1, %g1 -2: stb %g1, [%g6 + TI_WSAVED] - rdpr %tstate, %g1 +fill_fixup_mna: + TRAP_LOAD_THREAD_REG(%g6, %g1) + rdpr %tstate, %g1 + and %g1, TSTATE_CWP, %g1 + wrpr %g1, %cwp + ba,pt %xcc, etrap + rd %pc, %g7 + mov %l4, %o2 + mov %l5, %o1 + call mem_address_unaligned + add %sp, PTREGS_OFF, %o0 + ba,a,pt %xcc, rtrap_clr_l6 - andcc %g1, TSTATE_PRIV, %g0 - saved - be,pn %xcc, window_mna_from_user_common - and %g1, TSTATE_CWP, %g1 - retry -window_mna_from_user_common: - wrpr %g1, %cwp - sethi %hi(109f), %g7 - ba,pt %xcc, etrap -109: or %g7, %lo(109b), %g7 - mov %l4, %o2 - mov %l5, %o1 - call mem_address_unaligned - add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - clr %l6 - - /* These are only needed for 64-bit mode processes which - * put their stack pointer into the VPTE area and there - * happens to be a VPTE tlb entry mapped there during - * a spill/fill trap to that stack frame. - */ - .globl winfix_dax, fill_fixup_dax, spill_fixup_dax winfix_dax: - andn %g3, 0x7f, %g3 - add %g3, 0x74, %g3 - wrpr %g3, %tnpc + andn %g3, 0x7f, %g3 + add %g3, 0x74, %g3 + wrpr %g3, %tnpc done -fill_fixup_dax: - rdpr %tstate, %g1 - andcc %g1, TSTATE_PRIV, %g0 - be,pt %xcc, window_dax_from_user_common - and %g1, TSTATE_CWP, %g1 - - /* Please, see fill_fixup commentary about why we must preserve - * %l5 and %l6 to preserve absolute correct semantics. - */ - rdpr %wstate, %g2 ! Grab user mode wstate. - wrpr %g1, %cwp ! Get into the right window. - sll %g2, 3, %g2 ! NORMAL-->OTHER - wrpr %g0, 0x0, %canrestore ! Standard etrap stuff. - wrpr %g2, 0x0, %wstate ! This must be consistent. - wrpr %g0, 0x0, %otherwin ! We know this. - call set_pcontext ! Change contexts... - nop - rdpr %pstate, %l1 ! Prepare to change globals. - mov %g4, %o1 ! Setup args for - mov %g5, %o2 ! final call to spitfire_data_access_exception. - andn %l1, PSTATE_MM, %l1 ! We want to be in RMO - - mov %g6, %o7 ! Stash away current. - wrpr %g0, 0x0, %tl ! Out of trap levels. - wrpr %l1, (PSTATE_IE | PSTATE_AG | PSTATE_RMO), %pstate - mov %o7, %g6 ! Get current back. - ldx [%g6 + TI_TASK], %g4 ! Finish it. -#ifdef CONFIG_SMP - mov TSB_REG, %g1 - ldxa [%g1] ASI_IMMU, %g5 -#endif - call spitfire_data_access_exception - add %sp, PTREGS_OFF, %o0 - - b,pt %xcc, rtrap - nop ! yes, the nop is correct -spill_fixup_dax: - ldx [%g6 + TI_FLAGS], %g1 - andcc %g1, _TIF_32BIT, %g0 - ldub [%g6 + TI_WSAVED], %g1 - sll %g1, 3, %g3 - add %g6, %g3, %g3 - stx %sp, [%g3 + TI_RWIN_SPTRS] - - sll %g1, 7, %g3 - bne,pt %xcc, 1f - add %g6, %g3, %g3 - stx %l0, [%g3 + TI_REG_WINDOW + 0x00] - stx %l1, [%g3 + TI_REG_WINDOW + 0x08] - stx %l2, [%g3 + TI_REG_WINDOW + 0x10] - stx %l3, [%g3 + TI_REG_WINDOW + 0x18] - stx %l4, [%g3 + TI_REG_WINDOW + 0x20] - - stx %l5, [%g3 + TI_REG_WINDOW + 0x28] - stx %l6, [%g3 + TI_REG_WINDOW + 0x30] - stx %l7, [%g3 + TI_REG_WINDOW + 0x38] - stx %i0, [%g3 + TI_REG_WINDOW + 0x40] - stx %i1, [%g3 + TI_REG_WINDOW + 0x48] - stx %i2, [%g3 + TI_REG_WINDOW + 0x50] - stx %i3, [%g3 + TI_REG_WINDOW + 0x58] - stx %i4, [%g3 + TI_REG_WINDOW + 0x60] - - stx %i5, [%g3 + TI_REG_WINDOW + 0x68] - stx %i6, [%g3 + TI_REG_WINDOW + 0x70] - stx %i7, [%g3 + TI_REG_WINDOW + 0x78] - b,pt %xcc, 2f - add %g1, 1, %g1 -1: std %l0, [%g3 + TI_REG_WINDOW + 0x00] - std %l2, [%g3 + TI_REG_WINDOW + 0x08] - std %l4, [%g3 + TI_REG_WINDOW + 0x10] - - std %l6, [%g3 + TI_REG_WINDOW + 0x18] - std %i0, [%g3 + TI_REG_WINDOW + 0x20] - std %i2, [%g3 + TI_REG_WINDOW + 0x28] - std %i4, [%g3 + TI_REG_WINDOW + 0x30] - std %i6, [%g3 + TI_REG_WINDOW + 0x38] - add %g1, 1, %g1 -2: stb %g1, [%g6 + TI_WSAVED] - rdpr %tstate, %g1 - - andcc %g1, TSTATE_PRIV, %g0 - saved - be,pn %xcc, window_dax_from_user_common - and %g1, TSTATE_CWP, %g1 - retry -window_dax_from_user_common: - wrpr %g1, %cwp - sethi %hi(109f), %g7 - ba,pt %xcc, etrap -109: or %g7, %lo(109b), %g7 - mov %l4, %o1 - mov %l5, %o2 - call spitfire_data_access_exception - add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - clr %l6 +fill_fixup_dax: + TRAP_LOAD_THREAD_REG(%g6, %g1) + rdpr %tstate, %g1 + and %g1, TSTATE_CWP, %g1 + wrpr %g1, %cwp + ba,pt %xcc, etrap + rd %pc, %g7 + mov %l4, %o1 + mov %l5, %o2 + call spitfire_data_access_exception + add %sp, PTREGS_OFF, %o0 + ba,a,pt %xcc, rtrap_clr_l6 diff --git a/arch/sparc64/lib/Makefile b/arch/sparc64/lib/Makefile index c295806..3d0e9a2 100644 --- a/arch/sparc64/lib/Makefile +++ b/arch/sparc64/lib/Makefile @@ -11,6 +11,8 @@ lib-y := PeeCeeI.o copy_page.o clear_pag VISsave.o atomic.o bitops.o \ U1memcpy.o U1copy_from_user.o U1copy_to_user.o \ U3memcpy.o U3copy_from_user.o U3copy_to_user.o U3patch.o \ + NGmemcpy.o NGcopy_from_user.o NGcopy_to_user.o NGpatch.o \ + NGpage.o \ copy_in_user.o user_fixup.o memmove.o \ mcount.o ipcsum.o rwsem.o xor.o find_bit.o delay.o diff --git a/arch/sparc64/lib/NGcopy_from_user.S b/arch/sparc64/lib/NGcopy_from_user.S new file mode 100644 index 0000000..dd50f19 --- /dev/null +++ b/arch/sparc64/lib/NGcopy_from_user.S @@ -0,0 +1,36 @@ +/* NGcopy_from_user.S: Niagara optimized copy from userspace. + * + * Copyright (C) 2006 David S. Miller (davem@davemloft.net) + */ + +#define EX_LD(x) \ +98: x; \ + .section .fixup; \ + .align 4; \ +99: retl; \ + mov 1, %o0; \ + .section __ex_table; \ + .align 4; \ + .word 98b, 99b; \ + .text; \ + .align 4; + +#ifndef ASI_AIUS +#define ASI_AIUS 0x11 +#endif + +#define FUNC_NAME NGcopy_from_user +#define LOAD(type,addr,dest) type##a [addr] ASI_AIUS, dest +#define LOAD_TWIN(addr_reg,dest0,dest1) \ + ldda [addr_reg] ASI_BLK_INIT_QUAD_LDD_AIUS, dest0 +#define EX_RETVAL(x) 0 + +#ifdef __KERNEL__ +#define PREAMBLE \ + rd %asi, %g1; \ + cmp %g1, ASI_AIUS; \ + bne,pn %icc, memcpy_user_stub; \ + nop +#endif + +#include "NGmemcpy.S" diff --git a/arch/sparc64/lib/NGcopy_to_user.S b/arch/sparc64/lib/NGcopy_to_user.S new file mode 100644 index 0000000..4536f24 --- /dev/null +++ b/arch/sparc64/lib/NGcopy_to_user.S @@ -0,0 +1,39 @@ +/* NGcopy_to_user.S: Niagara optimized copy to userspace. + * + * Copyright (C) 2006 David S. Miller (davem@davemloft.net) + */ + +#define EX_ST(x) \ +98: x; \ + .section .fixup; \ + .align 4; \ +99: retl; \ + mov 1, %o0; \ + .section __ex_table; \ + .align 4; \ + .word 98b, 99b; \ + .text; \ + .align 4; + +#ifndef ASI_AIUS +#define ASI_AIUS 0x11 +#endif + +#define FUNC_NAME NGcopy_to_user +#define STORE(type,src,addr) type##a src, [addr] ASI_AIUS +#define STORE_ASI ASI_BLK_INIT_QUAD_LDD_AIUS +#define EX_RETVAL(x) 0 + +#ifdef __KERNEL__ + /* Writing to %asi is _expensive_ so we hardcode it. + * Reading %asi to check for KERNEL_DS is comparatively + * cheap. + */ +#define PREAMBLE \ + rd %asi, %g1; \ + cmp %g1, ASI_AIUS; \ + bne,pn %icc, memcpy_user_stub; \ + nop +#endif + +#include "U3memcpy.S" diff --git a/arch/sparc64/lib/NGmemcpy.S b/arch/sparc64/lib/NGmemcpy.S new file mode 100644 index 0000000..a39aa3b --- /dev/null +++ b/arch/sparc64/lib/NGmemcpy.S @@ -0,0 +1,364 @@ +/* NGmemcpy.S: Niagara optimized memcpy. + * + * Copyright (C) 2006 David S. Miller (davem@davemloft.net) + */ + +#ifdef __KERNEL__ +#include +#define GLOBAL_SPARE %g7 +#define RESTORE_ASI wr %g0, ASI_AIUS, %asi +#else +#define GLOBAL_SPARE %g5 +#define RESTORE_ASI +#endif + +#ifndef STORE_ASI +#define STORE_ASI ASI_BLK_INIT_QUAD_LDD_P +#endif + +#ifndef EX_LD +#define EX_LD(x) x +#endif + +#ifndef EX_ST +#define EX_ST(x) x +#endif + +#ifndef EX_RETVAL +#define EX_RETVAL(x) x +#endif + +#ifndef LOAD +#ifndef MEMCPY_DEBUG +#define LOAD(type,addr,dest) type [addr], dest +#else +#define LOAD(type,addr,dest) type##a [addr] 0x80, dest +#endif +#endif + +#ifndef LOAD_TWIN +#define LOAD_TWIN(addr_reg,dest0,dest1) \ + ldda [addr_reg] ASI_BLK_INIT_QUAD_LDD_P, dest0 +#endif + +#ifndef STORE +#define STORE(type,src,addr) type src, [addr] +#endif + +#ifndef STORE_INIT +#define STORE_INIT(src,addr) stxa src, [addr] %asi +#endif + +#ifndef FUNC_NAME +#define FUNC_NAME NGmemcpy +#endif + +#ifndef PREAMBLE +#define PREAMBLE +#endif + +#ifndef XCC +#define XCC xcc +#endif + + .register %g2,#scratch + .register %g3,#scratch + + .text + .align 64 + + .globl FUNC_NAME + .type FUNC_NAME,#function +FUNC_NAME: /* %o0=dst, %o1=src, %o2=len */ + srlx %o2, 31, %g2 + cmp %g2, 0 + tne %xcc, 5 + PREAMBLE + mov %o0, GLOBAL_SPARE + cmp %o2, 0 + be,pn %XCC, 85f + or %o0, %o1, %o3 + cmp %o2, 16 + blu,a,pn %XCC, 80f + or %o3, %o2, %o3 + + /* 2 blocks (128 bytes) is the minimum we can do the block + * copy with. We need to ensure that we'll iterate at least + * once in the block copy loop. At worst we'll need to align + * the destination to a 64-byte boundary which can chew up + * to (64 - 1) bytes from the length before we perform the + * block copy loop. + */ + cmp %o2, (2 * 64) + blu,pt %XCC, 70f + andcc %o3, 0x7, %g0 + + /* %o0: dst + * %o1: src + * %o2: len (known to be >= 128) + * + * The block copy loops will use %o4/%o5,%g2/%g3 as + * temporaries while copying the data. + */ + + LOAD(prefetch, %o1, #one_read) + wr %g0, STORE_ASI, %asi + + /* Align destination on 64-byte boundary. */ + andcc %o0, (64 - 1), %o4 + be,pt %XCC, 2f + sub %o4, 64, %o4 + sub %g0, %o4, %o4 ! bytes to align dst + sub %o2, %o4, %o2 +1: subcc %o4, 1, %o4 + EX_LD(LOAD(ldub, %o1, %g1)) + EX_ST(STORE(stb, %g1, %o0)) + add %o1, 1, %o1 + bne,pt %XCC, 1b + add %o0, 1, %o0 + + /* If the source is on a 16-byte boundary we can do + * the direct block copy loop. If it is 8-byte aligned + * we can do the 16-byte loads offset by -8 bytes and the + * init stores offset by one register. + * + * If the source is not even 8-byte aligned, we need to do + * shifting and masking (basically integer faligndata). + * + * The careful bit with init stores is that if we store + * to any part of the cache line we have to store the whole + * cacheline else we can end up with corrupt L2 cache line + * contents. Since the loop works on 64-bytes of 64-byte + * aligned store data at a time, this is easy to ensure. + */ +2: + andcc %o1, (16 - 1), %o4 + andn %o2, (64 - 1), %g1 ! block copy loop iterator + sub %o2, %g1, %o2 ! final sub-block copy bytes + be,pt %XCC, 50f + cmp %o4, 8 + be,a,pt %XCC, 10f + sub %o1, 0x8, %o1 + + /* Neither 8-byte nor 16-byte aligned, shift and mask. */ + mov %g1, %o4 + and %o1, 0x7, %g1 + sll %g1, 3, %g1 + mov 64, %o3 + andn %o1, 0x7, %o1 + EX_LD(LOAD(ldx, %o1, %g2)) + sub %o3, %g1, %o3 + sllx %g2, %g1, %g2 + +#define SWIVEL_ONE_DWORD(SRC, TMP1, TMP2, PRE_VAL, PRE_SHIFT, POST_SHIFT, DST)\ + EX_LD(LOAD(ldx, SRC, TMP1)); \ + srlx TMP1, PRE_SHIFT, TMP2; \ + or TMP2, PRE_VAL, TMP2; \ + EX_ST(STORE_INIT(TMP2, DST)); \ + sllx TMP1, POST_SHIFT, PRE_VAL; + +1: add %o1, 0x8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x00) + add %o1, 0x8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x08) + add %o1, 0x8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x10) + add %o1, 0x8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x18) + add %o1, 32, %o1 + LOAD(prefetch, %o1, #one_read) + sub %o1, 32 - 8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x20) + add %o1, 8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x28) + add %o1, 8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x30) + add %o1, 8, %o1 + SWIVEL_ONE_DWORD(%o1, %g3, %o5, %g2, %o3, %g1, %o0 + 0x38) + subcc %o4, 64, %o4 + bne,pt %XCC, 1b + add %o0, 64, %o0 + +#undef SWIVEL_ONE_DWORD + + srl %g1, 3, %g1 + ba,pt %XCC, 60f + add %o1, %g1, %o1 + +10: /* Destination is 64-byte aligned, source was only 8-byte + * aligned but it has been subtracted by 8 and we perform + * one twin load ahead, then add 8 back into source when + * we finish the loop. + */ + EX_LD(LOAD_TWIN(%o1, %o4, %o5)) +1: add %o1, 16, %o1 + EX_LD(LOAD_TWIN(%o1, %g2, %g3)) + add %o1, 16 + 32, %o1 + LOAD(prefetch, %o1, #one_read) + sub %o1, 32, %o1 + EX_ST(STORE_INIT(%o5, %o0 + 0x00)) ! initializes cache line + EX_ST(STORE_INIT(%g2, %o0 + 0x08)) + EX_LD(LOAD_TWIN(%o1, %o4, %o5)) + add %o1, 16, %o1 + EX_ST(STORE_INIT(%g3, %o0 + 0x10)) + EX_ST(STORE_INIT(%o4, %o0 + 0x18)) + EX_LD(LOAD_TWIN(%o1, %g2, %g3)) + add %o1, 16, %o1 + EX_ST(STORE_INIT(%o5, %o0 + 0x20)) + EX_ST(STORE_INIT(%g2, %o0 + 0x28)) + EX_LD(LOAD_TWIN(%o1, %o4, %o5)) + EX_ST(STORE_INIT(%g3, %o0 + 0x30)) + EX_ST(STORE_INIT(%o4, %o0 + 0x38)) + subcc %g1, 64, %g1 + bne,pt %XCC, 1b + add %o0, 64, %o0 + + ba,pt %XCC, 60f + add %o1, 0x8, %o1 + +50: /* Destination is 64-byte aligned, and source is 16-byte + * aligned. + */ +1: EX_LD(LOAD_TWIN(%o1, %o4, %o5)) + add %o1, 16, %o1 + EX_LD(LOAD_TWIN(%o1, %g2, %g3)) + add %o1, 16 + 32, %o1 + LOAD(prefetch, %o1, #one_read) + sub %o1, 32, %o1 + EX_ST(STORE_INIT(%o4, %o0 + 0x00)) ! initializes cache line + EX_ST(STORE_INIT(%o5, %o0 + 0x08)) + EX_LD(LOAD_TWIN(%o1, %o4, %o5)) + add %o1, 16, %o1 + EX_ST(STORE_INIT(%g2, %o0 + 0x10)) + EX_ST(STORE_INIT(%g3, %o0 + 0x18)) + EX_LD(LOAD_TWIN(%o1, %g2, %g3)) + add %o1, 16, %o1 + EX_ST(STORE_INIT(%o4, %o0 + 0x20)) + EX_ST(STORE_INIT(%o5, %o0 + 0x28)) + EX_ST(STORE_INIT(%g2, %o0 + 0x30)) + EX_ST(STORE_INIT(%g3, %o0 + 0x38)) + subcc %g1, 64, %g1 + bne,pt %XCC, 1b + add %o0, 64, %o0 + /* fall through */ + +60: + /* %o2 contains any final bytes still needed to be copied + * over. If anything is left, we copy it one byte at a time. + */ + RESTORE_ASI + brz,pt %o2, 85f + sub %o0, %o1, %o3 + ba,a,pt %XCC, 90f + + .align 64 +70: /* 16 < len <= 64 */ + bne,pn %XCC, 75f + sub %o0, %o1, %o3 + +72: + andn %o2, 0xf, %o4 + and %o2, 0xf, %o2 +1: subcc %o4, 0x10, %o4 + EX_LD(LOAD(ldx, %o1, %o5)) + add %o1, 0x08, %o1 + EX_LD(LOAD(ldx, %o1, %g1)) + sub %o1, 0x08, %o1 + EX_ST(STORE(stx, %o5, %o1 + %o3)) + add %o1, 0x8, %o1 + EX_ST(STORE(stx, %g1, %o1 + %o3)) + bgu,pt %XCC, 1b + add %o1, 0x8, %o1 +73: andcc %o2, 0x8, %g0 + be,pt %XCC, 1f + nop + sub %o2, 0x8, %o2 + EX_LD(LOAD(ldx, %o1, %o5)) + EX_ST(STORE(stx, %o5, %o1 + %o3)) + add %o1, 0x8, %o1 +1: andcc %o2, 0x4, %g0 + be,pt %XCC, 1f + nop + sub %o2, 0x4, %o2 + EX_LD(LOAD(lduw, %o1, %o5)) + EX_ST(STORE(stw, %o5, %o1 + %o3)) + add %o1, 0x4, %o1 +1: cmp %o2, 0 + be,pt %XCC, 85f + nop + ba,pt %xcc, 90f + nop + +75: + andcc %o0, 0x7, %g1 + sub %g1, 0x8, %g1 + be,pn %icc, 2f + sub %g0, %g1, %g1 + sub %o2, %g1, %o2 + +1: subcc %g1, 1, %g1 + EX_LD(LOAD(ldub, %o1, %o5)) + EX_ST(STORE(stb, %o5, %o1 + %o3)) + bgu,pt %icc, 1b + add %o1, 1, %o1 + +2: add %o1, %o3, %o0 + andcc %o1, 0x7, %g1 + bne,pt %icc, 8f + sll %g1, 3, %g1 + + cmp %o2, 16 + bgeu,pt %icc, 72b + nop + ba,a,pt %xcc, 73b + +8: mov 64, %o3 + andn %o1, 0x7, %o1 + EX_LD(LOAD(ldx, %o1, %g2)) + sub %o3, %g1, %o3 + andn %o2, 0x7, %o4 + sllx %g2, %g1, %g2 +1: add %o1, 0x8, %o1 + EX_LD(LOAD(ldx, %o1, %g3)) + subcc %o4, 0x8, %o4 + srlx %g3, %o3, %o5 + or %o5, %g2, %o5 + EX_ST(STORE(stx, %o5, %o0)) + add %o0, 0x8, %o0 + bgu,pt %icc, 1b + sllx %g3, %g1, %g2 + + srl %g1, 3, %g1 + andcc %o2, 0x7, %o2 + be,pn %icc, 85f + add %o1, %g1, %o1 + ba,pt %xcc, 90f + sub %o0, %o1, %o3 + + .align 64 +80: /* 0 < len <= 16 */ + andcc %o3, 0x3, %g0 + bne,pn %XCC, 90f + sub %o0, %o1, %o3 + +1: + subcc %o2, 4, %o2 + EX_LD(LOAD(lduw, %o1, %g1)) + EX_ST(STORE(stw, %g1, %o1 + %o3)) + bgu,pt %XCC, 1b + add %o1, 4, %o1 + +85: retl + mov EX_RETVAL(GLOBAL_SPARE), %o0 + + .align 32 +90: + subcc %o2, 1, %o2 + EX_LD(LOAD(ldub, %o1, %g1)) + EX_ST(STORE(stb, %g1, %o1 + %o3)) + bgu,pt %XCC, 90b + add %o1, 1, %o1 + retl + mov EX_RETVAL(GLOBAL_SPARE), %o0 + + .size FUNC_NAME, .-FUNC_NAME diff --git a/arch/sparc64/lib/NGpage.S b/arch/sparc64/lib/NGpage.S new file mode 100644 index 0000000..0e6152c --- /dev/null +++ b/arch/sparc64/lib/NGpage.S @@ -0,0 +1,95 @@ +/* NGpage.S: Niagara optimize clear and copy page. + * + * Copyright (C) 2006 (davem@davemloft.net) + */ + +#include +#include + + .text + .align 32 + + /* This is heavily simplified from the sun4u variants + * because Niagara does not have any D-cache aliasing issues + * and also we don't need to use the FPU in order to implement + * an optimal page copy/clear. + */ + +NGcopy_user_page: /* %o0=dest, %o1=src, %o2=vaddr */ + prefetch [%o1 + 0x00], #one_read + mov 8, %g1 + mov 16, %g2 + mov 24, %g3 + set PAGE_SIZE, %g7 + +1: ldda [%o1 + %g0] ASI_BLK_INIT_QUAD_LDD_P, %o2 + ldda [%o1 + %g2] ASI_BLK_INIT_QUAD_LDD_P, %o4 + prefetch [%o1 + 0x40], #one_read + add %o1, 32, %o1 + stxa %o2, [%o0 + %g0] ASI_BLK_INIT_QUAD_LDD_P + stxa %o3, [%o0 + %g1] ASI_BLK_INIT_QUAD_LDD_P + ldda [%o1 + %g0] ASI_BLK_INIT_QUAD_LDD_P, %o2 + stxa %o4, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P + stxa %o5, [%o0 + %g3] ASI_BLK_INIT_QUAD_LDD_P + ldda [%o1 + %g2] ASI_BLK_INIT_QUAD_LDD_P, %o4 + add %o1, 32, %o1 + add %o0, 32, %o0 + stxa %o2, [%o0 + %g0] ASI_BLK_INIT_QUAD_LDD_P + stxa %o3, [%o0 + %g1] ASI_BLK_INIT_QUAD_LDD_P + stxa %o4, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P + stxa %o5, [%o0 + %g3] ASI_BLK_INIT_QUAD_LDD_P + subcc %g7, 64, %g7 + bne,pt %xcc, 1b + add %o0, 32, %o0 + retl + nop + +NGclear_page: /* %o0=dest */ +NGclear_user_page: /* %o0=dest, %o1=vaddr */ + mov 8, %g1 + mov 16, %g2 + mov 24, %g3 + set PAGE_SIZE, %g7 + +1: stxa %g0, [%o0 + %g0] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g1] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g3] ASI_BLK_INIT_QUAD_LDD_P + add %o0, 32, %o0 + stxa %g0, [%o0 + %g0] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g1] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P + stxa %g0, [%o0 + %g3] ASI_BLK_INIT_QUAD_LDD_P + subcc %g7, 64, %g7 + bne,pt %xcc, 1b + add %o0, 32, %o0 + retl + nop + +#define BRANCH_ALWAYS 0x10680000 +#define NOP 0x01000000 +#define NG_DO_PATCH(OLD, NEW) \ + sethi %hi(NEW), %g1; \ + or %g1, %lo(NEW), %g1; \ + sethi %hi(OLD), %g2; \ + or %g2, %lo(OLD), %g2; \ + sub %g1, %g2, %g1; \ + sethi %hi(BRANCH_ALWAYS), %g3; \ + srl %g1, 2, %g1; \ + or %g3, %lo(BRANCH_ALWAYS), %g3; \ + or %g3, %g1, %g3; \ + stw %g3, [%g2]; \ + sethi %hi(NOP), %g3; \ + or %g3, %lo(NOP), %g3; \ + stw %g3, [%g2 + 0x4]; \ + flush %g2; + + .globl niagara_patch_pageops + .type niagara_patch_pageops,#function +niagara_patch_pageops: + NG_DO_PATCH(copy_user_page, NGcopy_user_page) + NG_DO_PATCH(_clear_page, NGclear_page) + NG_DO_PATCH(clear_user_page, NGclear_user_page) + retl + nop + .size niagara_patch_pageops,.-niagara_patch_pageops diff --git a/arch/sparc64/lib/NGpatch.S b/arch/sparc64/lib/NGpatch.S new file mode 100644 index 0000000..f13ec9e --- /dev/null +++ b/arch/sparc64/lib/NGpatch.S @@ -0,0 +1,32 @@ +/* NGpatch.S: Patch Ultra-I routines with Niagara variant. + * + * Copyright (C) 2006 David S. Miller + */ + +#define BRANCH_ALWAYS 0x10680000 +#define NOP 0x01000000 +#define NG_DO_PATCH(OLD, NEW) \ + sethi %hi(NEW), %g1; \ + or %g1, %lo(NEW), %g1; \ + sethi %hi(OLD), %g2; \ + or %g2, %lo(OLD), %g2; \ + sub %g1, %g2, %g1; \ + sethi %hi(BRANCH_ALWAYS), %g3; \ + srl %g1, 2, %g1; \ + or %g3, %lo(BRANCH_ALWAYS), %g3; \ + or %g3, %g1, %g3; \ + stw %g3, [%g2]; \ + sethi %hi(NOP), %g3; \ + or %g3, %lo(NOP), %g3; \ + stw %g3, [%g2 + 0x4]; \ + flush %g2; + + .globl niagara_patch_copyops + .type niagara_patch_copyops,#function +niagara_patch_copyops: + NG_DO_PATCH(memcpy, NGmemcpy) + NG_DO_PATCH(___copy_from_user, NGcopy_from_user) + NG_DO_PATCH(___copy_to_user, NGcopy_to_user) + retl + nop + .size niagara_patch_copyops,.-niagara_patch_copyops diff --git a/arch/sparc64/lib/clear_page.S b/arch/sparc64/lib/clear_page.S index b59884e..cdc634b 100644 --- a/arch/sparc64/lib/clear_page.S +++ b/arch/sparc64/lib/clear_page.S @@ -9,6 +9,7 @@ #include #include #include +#include /* What we used to do was lock a TLB entry into a specific * TLB slot, clear the page with interrupts disabled, then @@ -66,7 +67,8 @@ clear_user_page: /* %o0=dest, %o1=vaddr wrpr %o4, PSTATE_IE, %pstate stxa %o0, [%g3] ASI_DMMU stxa %g1, [%g0] ASI_DTLB_DATA_IN - flush %g6 + sethi %hi(KERNBASE), %g1 + flush %g1 wrpr %o4, 0x0, %pstate mov 1, %o4 diff --git a/arch/sparc64/mm/Makefile b/arch/sparc64/mm/Makefile index 9d0960e..e415bf9 100644 --- a/arch/sparc64/mm/Makefile +++ b/arch/sparc64/mm/Makefile @@ -5,6 +5,6 @@ EXTRA_AFLAGS := -ansi EXTRA_CFLAGS := -Werror -obj-y := ultra.o tlb.o fault.o init.o generic.o +obj-y := ultra.o tlb.o tsb.o fault.o init.o generic.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c index 1e44ee2..e9aac42 100644 --- a/arch/sparc64/mm/init.c +++ b/arch/sparc64/mm/init.c @@ -39,6 +39,7 @@ #include #include #include +#include extern void device_scan(void); @@ -141,24 +142,25 @@ unsigned long sparc64_kern_sec_context _ int bigkernel = 0; -/* XXX Tune this... */ -#define PGT_CACHE_LOW 25 -#define PGT_CACHE_HIGH 50 - -void check_pgt_cache(void) -{ - preempt_disable(); - if (pgtable_cache_size > PGT_CACHE_HIGH) { - do { - if (pgd_quicklist) - free_pgd_slow(get_pgd_fast()); - if (pte_quicklist[0]) - free_pte_slow(pte_alloc_one_fast(NULL, 0)); - if (pte_quicklist[1]) - free_pte_slow(pte_alloc_one_fast(NULL, 1 << (PAGE_SHIFT + 10))); - } while (pgtable_cache_size > PGT_CACHE_LOW); +kmem_cache_t *pgtable_cache __read_mostly; + +static void zero_ctor(void *addr, kmem_cache_t *cache, unsigned long flags) +{ + clear_page(addr); +} + +void pgtable_cache_init(void) +{ + pgtable_cache = kmem_cache_create("pgtable_cache", + PAGE_SIZE, PAGE_SIZE, + SLAB_HWCACHE_ALIGN | + SLAB_MUST_HWCACHE_ALIGN, + zero_ctor, + NULL); + if (!pgtable_cache) { + prom_printf("pgtable_cache_init(): Could not create!\n"); + prom_halt(); } - preempt_enable(); } #ifdef CONFIG_DEBUG_DCFLUSH @@ -243,8 +245,19 @@ static __inline__ void clear_dcache_dirt : "g1", "g7"); } +static inline void tsb_insert(struct tsb *ent, unsigned long tag, unsigned long pte) +{ + unsigned long tsb_addr = (unsigned long) ent; + + if (tlb_type == cheetah_plus) + tsb_addr = __pa(tsb_addr); + + __tsb_insert(tsb_addr, tag, pte); +} + void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t pte) { + struct mm_struct *mm; struct page *page; unsigned long pfn; unsigned long pg_flags; @@ -269,6 +282,17 @@ void update_mmu_cache(struct vm_area_str put_cpu(); } + + mm = vma->vm_mm; + if ((pte_val(pte) & _PAGE_ALL_SZ_BITS) == _PAGE_SZBITS) { + struct tsb *tsb; + unsigned long tag; + + tsb = &mm->context.tsb[(address >> PAGE_SHIFT) & + (mm->context.tsb_nentries - 1UL)]; + tag = (address >> 22UL) | CTX_HWBITS(mm->context) << 48UL; + tsb_insert(tsb, tag, pte_val(pte)); + } } void flush_dcache_page(struct page *page) @@ -311,7 +335,7 @@ out: void __kprobes flush_icache_range(unsigned long start, unsigned long end) { - /* Cheetah has coherent I-cache. */ + /* Cheetah and Hypervisor platform cpus have coherent I-cache. */ if (tlb_type == spitfire) { unsigned long kaddr; @@ -338,7 +362,6 @@ void show_mem(void) nr_swap_pages << (PAGE_SHIFT-10)); printk("%ld pages of RAM\n", num_physpages); printk("%d free pages\n", nr_free_pages()); - printk("%d pages in page table cache\n",pgtable_cache_size); } void mmu_info(struct seq_file *m) @@ -349,6 +372,8 @@ void mmu_info(struct seq_file *m) seq_printf(m, "MMU Type\t: Cheetah+\n"); else if (tlb_type == spitfire) seq_printf(m, "MMU Type\t: Spitfire\n"); + else if (tlb_type == hypervisor) + seq_printf(m, "MMU Type\t: Hypervisor (sun4v)\n"); else seq_printf(m, "MMU Type\t: ???\n"); @@ -371,7 +396,6 @@ struct linux_prom_translation { /* Exported for kernel TLB miss handling in ktlb.S */ struct linux_prom_translation prom_trans[512] __read_mostly; unsigned int prom_trans_ents __read_mostly; -unsigned int swapper_pgd_zero __read_mostly; extern unsigned long prom_boot_page; extern void prom_remap(unsigned long physpage, unsigned long virtpage, int mmu_ihandle); @@ -408,8 +432,7 @@ unsigned long prom_virt_to_phys(unsigned /* The obp translations are saved based on 8k pagesize, since obp can * use a mixture of pagesizes. Misses to the LOW_OBP_ADDRESS -> - * HI_OBP_ADDRESS range are handled in ktlb.S and do not use the vpte - * scheme (also, see rant in inherit_locked_prom_mappings()). + * HI_OBP_ADDRESS range are handled in ktlb.S. */ static inline int in_obp_range(unsigned long vaddr) { @@ -539,366 +562,12 @@ static void __init inherit_prom_mappings prom_printf("done.\n"); } -/* The OBP specifications for sun4u mark 0xfffffffc00000000 and - * upwards as reserved for use by the firmware (I wonder if this - * will be the same on Cheetah...). We use this virtual address - * range for the VPTE table mappings of the nucleus so we need - * to zap them when we enter the PROM. -DaveM - */ -static void __flush_nucleus_vptes(void) -{ - unsigned long prom_reserved_base = 0xfffffffc00000000UL; - int i; - - /* Only DTLB must be checked for VPTE entries. */ - if (tlb_type == spitfire) { - for (i = 0; i < 63; i++) { - unsigned long tag; - - /* Spitfire Errata #32 workaround */ - /* NOTE: Always runs on spitfire, so no cheetah+ - * page size encodings. - */ - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "flush %%g6" - : /* No outputs */ - : "r" (0), - "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); - - tag = spitfire_get_dtlb_tag(i); - if (((tag & ~(PAGE_MASK)) == 0) && - ((tag & (PAGE_MASK)) >= prom_reserved_base)) { - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - spitfire_put_dtlb_data(i, 0x0UL); - } - } - } else if (tlb_type == cheetah || tlb_type == cheetah_plus) { - for (i = 0; i < 512; i++) { - unsigned long tag = cheetah_get_dtlb_tag(i, 2); - - if ((tag & ~PAGE_MASK) == 0 && - (tag & PAGE_MASK) >= prom_reserved_base) { - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - cheetah_put_dtlb_data(i, 0x0UL, 2); - } - - if (tlb_type != cheetah_plus) - continue; - - tag = cheetah_get_dtlb_tag(i, 3); - - if ((tag & ~PAGE_MASK) == 0 && - (tag & PAGE_MASK) >= prom_reserved_base) { - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : /* no outputs */ - : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - cheetah_put_dtlb_data(i, 0x0UL, 3); - } - } - } else { - /* Implement me :-) */ - BUG(); - } -} - -static int prom_ditlb_set; -struct prom_tlb_entry { - int tlb_ent; - unsigned long tlb_tag; - unsigned long tlb_data; -}; -struct prom_tlb_entry prom_itlb[16], prom_dtlb[16]; - void prom_world(int enter) { - unsigned long pstate; - int i; - if (!enter) set_fs((mm_segment_t) { get_thread_current_ds() }); - if (!prom_ditlb_set) - return; - - /* Make sure the following runs atomically. */ - __asm__ __volatile__("flushw\n\t" - "rdpr %%pstate, %0\n\t" - "wrpr %0, %1, %%pstate" - : "=r" (pstate) - : "i" (PSTATE_IE)); - - if (enter) { - /* Kick out nucleus VPTEs. */ - __flush_nucleus_vptes(); - - /* Install PROM world. */ - for (i = 0; i < 16; i++) { - if (prom_dtlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "membar #Sync" - : : "r" (prom_dtlb[i].tlb_tag), "r" (TLB_TAG_ACCESS), - "i" (ASI_DMMU)); - if (tlb_type == spitfire) - spitfire_put_dtlb_data(prom_dtlb[i].tlb_ent, - prom_dtlb[i].tlb_data); - else if (tlb_type == cheetah || tlb_type == cheetah_plus) - cheetah_put_ldtlb_data(prom_dtlb[i].tlb_ent, - prom_dtlb[i].tlb_data); - } - if (prom_itlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "membar #Sync" - : : "r" (prom_itlb[i].tlb_tag), - "r" (TLB_TAG_ACCESS), - "i" (ASI_IMMU)); - if (tlb_type == spitfire) - spitfire_put_itlb_data(prom_itlb[i].tlb_ent, - prom_itlb[i].tlb_data); - else if (tlb_type == cheetah || tlb_type == cheetah_plus) - cheetah_put_litlb_data(prom_itlb[i].tlb_ent, - prom_itlb[i].tlb_data); - } - } - } else { - for (i = 0; i < 16; i++) { - if (prom_dtlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - if (tlb_type == spitfire) - spitfire_put_dtlb_data(prom_dtlb[i].tlb_ent, 0x0UL); - else - cheetah_put_ldtlb_data(prom_dtlb[i].tlb_ent, 0x0UL); - } - if (prom_itlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), - "i" (ASI_IMMU)); - if (tlb_type == spitfire) - spitfire_put_itlb_data(prom_itlb[i].tlb_ent, 0x0UL); - else - cheetah_put_litlb_data(prom_itlb[i].tlb_ent, 0x0UL); - } - } - } - __asm__ __volatile__("wrpr %0, 0, %%pstate" - : : "r" (pstate)); -} - -void inherit_locked_prom_mappings(int save_p) -{ - int i; - int dtlb_seen = 0; - int itlb_seen = 0; - - /* Fucking losing PROM has more mappings in the TLB, but - * it (conveniently) fails to mention any of these in the - * translations property. The only ones that matter are - * the locked PROM tlb entries, so we impose the following - * irrecovable rule on the PROM, it is allowed 8 locked - * entries in the ITLB and 8 in the DTLB. - * - * Supposedly the upper 16GB of the address space is - * reserved for OBP, BUT I WISH THIS WAS DOCUMENTED - * SOMEWHERE!!!!!!!!!!!!!!!!! Furthermore the entire interface - * used between the client program and the firmware on sun5 - * systems to coordinate mmu mappings is also COMPLETELY - * UNDOCUMENTED!!!!!! Thanks S(t)un! - */ - if (save_p) { - for (i = 0; i < 16; i++) { - prom_itlb[i].tlb_ent = -1; - prom_dtlb[i].tlb_ent = -1; - } - } - if (tlb_type == spitfire) { - int high = sparc64_highest_unlocked_tlb_ent; - for (i = 0; i <= high; i++) { - unsigned long data; - - /* Spitfire Errata #32 workaround */ - /* NOTE: Always runs on spitfire, so no cheetah+ - * page size encodings. - */ - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "flush %%g6" - : /* No outputs */ - : "r" (0), - "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); - - data = spitfire_get_dtlb_data(i); - if ((data & (_PAGE_L|_PAGE_VALID)) == (_PAGE_L|_PAGE_VALID)) { - unsigned long tag; - - /* Spitfire Errata #32 workaround */ - /* NOTE: Always runs on spitfire, so no - * cheetah+ page size encodings. - */ - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "flush %%g6" - : /* No outputs */ - : "r" (0), - "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); - - tag = spitfire_get_dtlb_tag(i); - if (save_p) { - prom_dtlb[dtlb_seen].tlb_ent = i; - prom_dtlb[dtlb_seen].tlb_tag = tag; - prom_dtlb[dtlb_seen].tlb_data = data; - } - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - spitfire_put_dtlb_data(i, 0x0UL); - - dtlb_seen++; - if (dtlb_seen > 15) - break; - } - } - - for (i = 0; i < high; i++) { - unsigned long data; - - /* Spitfire Errata #32 workaround */ - /* NOTE: Always runs on spitfire, so no - * cheetah+ page size encodings. - */ - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "flush %%g6" - : /* No outputs */ - : "r" (0), - "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); - - data = spitfire_get_itlb_data(i); - if ((data & (_PAGE_L|_PAGE_VALID)) == (_PAGE_L|_PAGE_VALID)) { - unsigned long tag; - - /* Spitfire Errata #32 workaround */ - /* NOTE: Always runs on spitfire, so no - * cheetah+ page size encodings. - */ - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "flush %%g6" - : /* No outputs */ - : "r" (0), - "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); - - tag = spitfire_get_itlb_tag(i); - if (save_p) { - prom_itlb[itlb_seen].tlb_ent = i; - prom_itlb[itlb_seen].tlb_tag = tag; - prom_itlb[itlb_seen].tlb_data = data; - } - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), "i" (ASI_IMMU)); - spitfire_put_itlb_data(i, 0x0UL); - - itlb_seen++; - if (itlb_seen > 15) - break; - } - } - } else if (tlb_type == cheetah || tlb_type == cheetah_plus) { - int high = sparc64_highest_unlocked_tlb_ent; - - for (i = 0; i <= high; i++) { - unsigned long data; - - data = cheetah_get_ldtlb_data(i); - if ((data & (_PAGE_L|_PAGE_VALID)) == (_PAGE_L|_PAGE_VALID)) { - unsigned long tag; - - tag = cheetah_get_ldtlb_tag(i); - if (save_p) { - prom_dtlb[dtlb_seen].tlb_ent = i; - prom_dtlb[dtlb_seen].tlb_tag = tag; - prom_dtlb[dtlb_seen].tlb_data = data; - } - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), "i" (ASI_DMMU)); - cheetah_put_ldtlb_data(i, 0x0UL); - - dtlb_seen++; - if (dtlb_seen > 15) - break; - } - } - - for (i = 0; i < high; i++) { - unsigned long data; - - data = cheetah_get_litlb_data(i); - if ((data & (_PAGE_L|_PAGE_VALID)) == (_PAGE_L|_PAGE_VALID)) { - unsigned long tag; - - tag = cheetah_get_litlb_tag(i); - if (save_p) { - prom_itlb[itlb_seen].tlb_ent = i; - prom_itlb[itlb_seen].tlb_tag = tag; - prom_itlb[itlb_seen].tlb_data = data; - } - __asm__ __volatile__("stxa %%g0, [%0] %1\n\t" - "membar #Sync" - : : "r" (TLB_TAG_ACCESS), "i" (ASI_IMMU)); - cheetah_put_litlb_data(i, 0x0UL); - - itlb_seen++; - if (itlb_seen > 15) - break; - } - } - } else { - /* Implement me :-) */ - BUG(); - } - if (save_p) - prom_ditlb_set = 1; -} - -/* Give PROM back his world, done during reboots... */ -void prom_reload_locked(void) -{ - int i; - - for (i = 0; i < 16; i++) { - if (prom_dtlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "membar #Sync" - : : "r" (prom_dtlb[i].tlb_tag), "r" (TLB_TAG_ACCESS), - "i" (ASI_DMMU)); - if (tlb_type == spitfire) - spitfire_put_dtlb_data(prom_dtlb[i].tlb_ent, - prom_dtlb[i].tlb_data); - else if (tlb_type == cheetah || tlb_type == cheetah_plus) - cheetah_put_ldtlb_data(prom_dtlb[i].tlb_ent, - prom_dtlb[i].tlb_data); - } - - if (prom_itlb[i].tlb_ent != -1) { - __asm__ __volatile__("stxa %0, [%1] %2\n\t" - "membar #Sync" - : : "r" (prom_itlb[i].tlb_tag), - "r" (TLB_TAG_ACCESS), - "i" (ASI_IMMU)); - if (tlb_type == spitfire) - spitfire_put_itlb_data(prom_itlb[i].tlb_ent, - prom_itlb[i].tlb_data); - else - cheetah_put_litlb_data(prom_itlb[i].tlb_ent, - prom_itlb[i].tlb_data); - } - } + __asm__ __volatile__("flushw"); } #ifdef DCACHE_ALIASING_POSSIBLE @@ -914,7 +583,7 @@ void __flush_dcache_range(unsigned long if (++n >= 512) break; } - } else { + } else if (tlb_type == cheetah || tlb_type == cheetah_plus) { start = __pa(start); end = __pa(end); for (va = start; va < end; va += 32) @@ -1035,78 +704,6 @@ out: spin_unlock(&ctx_alloc_lock); } -#ifndef CONFIG_SMP -struct pgtable_cache_struct pgt_quicklists; -#endif - -/* OK, we have to color these pages. The page tables are accessed - * by non-Dcache enabled mapping in the VPTE area by the dtlb_backend.S - * code, as well as by PAGE_OFFSET range direct-mapped addresses by - * other parts of the kernel. By coloring, we make sure that the tlbmiss - * fast handlers do not get data from old/garbage dcache lines that - * correspond to an old/stale virtual address (user/kernel) that - * previously mapped the pagetable page while accessing vpte range - * addresses. The idea is that if the vpte color and PAGE_OFFSET range - * color is the same, then when the kernel initializes the pagetable - * using the later address range, accesses with the first address - * range will see the newly initialized data rather than the garbage. - */ -#ifdef DCACHE_ALIASING_POSSIBLE -#define DC_ALIAS_SHIFT 1 -#else -#define DC_ALIAS_SHIFT 0 -#endif -pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address) -{ - struct page *page; - unsigned long color; - - { - pte_t *ptep = pte_alloc_one_fast(mm, address); - - if (ptep) - return ptep; - } - - color = VPTE_COLOR(address); - page = alloc_pages(GFP_KERNEL|__GFP_REPEAT, DC_ALIAS_SHIFT); - if (page) { - unsigned long *to_free; - unsigned long paddr; - pte_t *pte; - -#ifdef DCACHE_ALIASING_POSSIBLE - set_page_count(page, 1); - ClearPageCompound(page); - - set_page_count((page + 1), 1); - ClearPageCompound(page + 1); -#endif - paddr = (unsigned long) page_address(page); - memset((char *)paddr, 0, (PAGE_SIZE << DC_ALIAS_SHIFT)); - - if (!color) { - pte = (pte_t *) paddr; - to_free = (unsigned long *) (paddr + PAGE_SIZE); - } else { - pte = (pte_t *) (paddr + PAGE_SIZE); - to_free = (unsigned long *) paddr; - } - -#ifdef DCACHE_ALIASING_POSSIBLE - /* Now free the other one up, adjust cache size. */ - preempt_disable(); - *to_free = (unsigned long) pte_quicklist[color ^ 0x1]; - pte_quicklist[color ^ 0x1] = to_free; - pgtable_cache_size++; - preempt_enable(); -#endif - - return pte; - } - return NULL; -} - void sparc_ultra_dump_itlb(void) { int slot; @@ -1194,6 +791,15 @@ void sparc_ultra_dump_dtlb(void) } } +static inline void spitfire_errata32(void) +{ + __asm__ __volatile__("stxa %0, [%1] %2\n\t" + "flush %%g6" + : /* No outputs */ + : "r" (0), + "r" (PRIMARY_CONTEXT), "i" (ASI_DMMU)); +} + extern unsigned long cmdline_memory_size; unsigned long __init bootmem_init(unsigned long *pages_avail) @@ -1419,6 +1025,9 @@ void kernel_map_pages(struct page *page, kernel_map_range(phys_start, phys_end, (enable ? PAGE_KERNEL : __pgprot(0))); + flush_tsb_kernel_range(PAGE_OFFSET + phys_start, + PAGE_OFFSET + phys_end); + /* we should perform an IPI and flush all tlbs, * but that can deadlock->flush only current cpu. */ @@ -1439,9 +1048,45 @@ unsigned long __init find_ecache_flush_s return ~0UL; } +static void __init tsb_phys_patch(void) +{ + struct tsb_ldquad_phys_patch_entry *pquad; + struct tsb_phys_patch_entry *p; + + pquad = &__tsb_ldquad_phys_patch; + while (pquad < &__tsb_ldquad_phys_patch_end) { + unsigned long addr = pquad->addr; + + if (tlb_type == hypervisor) + *(unsigned int *) addr = pquad->sun4v_insn; + else + *(unsigned int *) addr = pquad->sun4u_insn; + wmb(); + __asm__ __volatile__("flush %0" + : /* no outputs */ + : "r" (addr)); + + pquad++; + } + + p = &__tsb_phys_patch; + while (p < &__tsb_phys_patch_end) { + unsigned long addr = p->addr; + + *(unsigned int *) addr = p->insn; + wmb(); + __asm__ __volatile__("flush %0" + : /* no outputs */ + : "r" (addr)); + + p++; + } +} + /* paging_init() sets up the page tables */ extern void cheetah_ecache_flush_init(void); +extern void sun4v_patch_tlb_handlers(void); static unsigned long last_valid_pfn; pgd_t swapper_pg_dir[2048]; @@ -1451,6 +1096,13 @@ void __init paging_init(void) unsigned long end_pfn, pages_avail, shift; unsigned long real_end, i; + if (tlb_type == cheetah_plus || + tlb_type == hypervisor) + tsb_phys_patch(); + + if (tlb_type == hypervisor) + sun4v_patch_tlb_handlers(); + /* Find available physical memory... */ read_obp_memory("available", &pavail[0], &pavail_ents); @@ -1486,21 +1138,10 @@ void __init paging_init(void) pud_set(pud_offset(&swapper_pg_dir[0], 0), swapper_low_pmd_dir + (shift / sizeof(pgd_t))); - swapper_pgd_zero = pgd_val(swapper_pg_dir[0]); - inherit_prom_mappings(); - /* Ok, we can use our TLB miss and window trap handlers safely. - * We need to do a quick peek here to see if we are on StarFire - * or not, so setup_tba can setup the IRQ globals correctly (it - * needs to get the hard smp processor id correctly). - */ - { - extern void setup_tba(int); - setup_tba(this_is_starfire); - } - - inherit_locked_prom_mappings(1); + /* Ok, we can use our TLB miss and window trap handlers safely. */ + setup_tba(); __flush_tlb_all(); diff --git a/arch/sparc64/mm/tlb.c b/arch/sparc64/mm/tlb.c index 8b104be..78357cc 100644 --- a/arch/sparc64/mm/tlb.c +++ b/arch/sparc64/mm/tlb.c @@ -25,6 +25,8 @@ void flush_tlb_pending(void) struct mmu_gather *mp = &__get_cpu_var(mmu_gathers); if (mp->tlb_nr) { + flush_tsb_user(mp); + if (CTX_VALID(mp->mm->context)) { #ifdef CONFIG_SMP smp_flush_tlb_pending(mp->mm, mp->tlb_nr, @@ -89,62 +91,3 @@ no_cache_flush: if (nr >= TLB_BATCH_NR) flush_tlb_pending(); } - -void flush_tlb_pgtables(struct mm_struct *mm, unsigned long start, unsigned long end) -{ - struct mmu_gather *mp = &__get_cpu_var(mmu_gathers); - unsigned long nr = mp->tlb_nr; - long s = start, e = end, vpte_base; - - if (mp->fullmm) - return; - - /* If start is greater than end, that is a real problem. */ - BUG_ON(start > end); - - /* However, straddling the VA space hole is quite normal. */ - s &= PMD_MASK; - e = (e + PMD_SIZE - 1) & PMD_MASK; - - vpte_base = (tlb_type == spitfire ? - VPTE_BASE_SPITFIRE : - VPTE_BASE_CHEETAH); - - if (unlikely(nr != 0 && mm != mp->mm)) { - flush_tlb_pending(); - nr = 0; - } - - if (nr == 0) - mp->mm = mm; - - start = vpte_base + (s >> (PAGE_SHIFT - 3)); - end = vpte_base + (e >> (PAGE_SHIFT - 3)); - - /* If the request straddles the VA space hole, we - * need to swap start and end. The reason this - * occurs is that "vpte_base" is the center of - * the linear page table mapping area. Thus, - * high addresses with the sign bit set map to - * addresses below vpte_base and non-sign bit - * addresses map to addresses above vpte_base. - */ - if (end < start) { - unsigned long tmp = start; - - start = end; - end = tmp; - } - - while (start < end) { - mp->vaddrs[nr] = start; - mp->tlb_nr = ++nr; - if (nr >= TLB_BATCH_NR) { - flush_tlb_pending(); - nr = 0; - } - start += PAGE_SIZE; - } - if (nr) - flush_tlb_pending(); -} diff --git a/arch/sparc64/mm/tsb.c b/arch/sparc64/mm/tsb.c new file mode 100644 index 0000000..2cc8e65 --- /dev/null +++ b/arch/sparc64/mm/tsb.c @@ -0,0 +1,350 @@ +/* arch/sparc64/mm/tsb.c + * + * Copyright (C) 2006 David S. Miller + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +extern struct tsb swapper_tsb[KERNEL_TSB_NENTRIES]; + +static inline unsigned long tsb_hash(unsigned long vaddr, unsigned long nentries) +{ + vaddr >>= PAGE_SHIFT; + return vaddr & (nentries - 1); +} + +static inline int tag_compare(unsigned long tag, unsigned long vaddr, unsigned long context) +{ + return (tag == ((vaddr >> 22) | (context << 48))); +} + +/* TSB flushes need only occur on the processor initiating the address + * space modification, not on each cpu the address space has run on. + * Only the TLB flush needs that treatment. + */ + +void flush_tsb_kernel_range(unsigned long start, unsigned long end) +{ + unsigned long v; + + for (v = start; v < end; v += PAGE_SIZE) { + unsigned long hash = tsb_hash(v, KERNEL_TSB_NENTRIES); + struct tsb *ent = &swapper_tsb[hash]; + + if (tag_compare(ent->tag, v, 0)) { + ent->tag = 0UL; + membar_storeload_storestore(); + } + } +} + +void flush_tsb_user(struct mmu_gather *mp) +{ + struct mm_struct *mm = mp->mm; + struct tsb *tsb = mm->context.tsb; + unsigned long nentries = mm->context.tsb_nentries; + unsigned long ctx, base; + int i; + + if (unlikely(!CTX_VALID(mm->context))) + return; + + ctx = CTX_HWBITS(mm->context); + + if (tlb_type == cheetah_plus) + base = __pa(tsb); + else + base = (unsigned long) tsb; + + for (i = 0; i < mp->tlb_nr; i++) { + unsigned long v = mp->vaddrs[i]; + unsigned long tag, ent, hash; + + v &= ~0x1UL; + + hash = tsb_hash(v, nentries); + ent = base + (hash * sizeof(struct tsb)); + tag = (v >> 22UL) | (ctx << 48UL); + + tsb_flush(ent, tag); + } +} + +static void setup_tsb_params(struct mm_struct *mm, unsigned long tsb_bytes) +{ + unsigned long tsb_reg, base, tsb_paddr; + unsigned long page_sz, tte; + + mm->context.tsb_nentries = tsb_bytes / sizeof(struct tsb); + + base = TSBMAP_BASE; + tte = (_PAGE_VALID | _PAGE_L | _PAGE_CP | + _PAGE_CV | _PAGE_P | _PAGE_W); + tsb_paddr = __pa(mm->context.tsb); + BUG_ON(tsb_paddr & (tsb_bytes - 1UL)); + + /* Use the smallest page size that can map the whole TSB + * in one TLB entry. + */ + switch (tsb_bytes) { + case 8192 << 0: + tsb_reg = 0x0UL; +#ifdef DCACHE_ALIASING_POSSIBLE + base += (tsb_paddr & 8192); +#endif + tte |= _PAGE_SZ8K; + page_sz = 8192; + break; + + case 8192 << 1: + tsb_reg = 0x1UL; + tte |= _PAGE_SZ64K; + page_sz = 64 * 1024; + break; + + case 8192 << 2: + tsb_reg = 0x2UL; + tte |= _PAGE_SZ64K; + page_sz = 64 * 1024; + break; + + case 8192 << 3: + tsb_reg = 0x3UL; + tte |= _PAGE_SZ64K; + page_sz = 64 * 1024; + break; + + case 8192 << 4: + tsb_reg = 0x4UL; + tte |= _PAGE_SZ512K; + page_sz = 512 * 1024; + break; + + case 8192 << 5: + tsb_reg = 0x5UL; + tte |= _PAGE_SZ512K; + page_sz = 512 * 1024; + break; + + case 8192 << 6: + tsb_reg = 0x6UL; + tte |= _PAGE_SZ512K; + page_sz = 512 * 1024; + break; + + case 8192 << 7: + tsb_reg = 0x7UL; + tte |= _PAGE_SZ4MB; + page_sz = 4 * 1024 * 1024; + break; + + default: + BUG(); + }; + + if (tlb_type == cheetah_plus) { + /* Physical mapping, no locked TLB entry for TSB. */ + tsb_reg |= tsb_paddr; + + mm->context.tsb_reg_val = tsb_reg; + mm->context.tsb_map_vaddr = 0; + mm->context.tsb_map_pte = 0; + } else { + tsb_reg |= base; + tsb_reg |= (tsb_paddr & (page_sz - 1UL)); + tte |= (tsb_paddr & ~(page_sz - 1UL)); + + mm->context.tsb_reg_val = tsb_reg; + mm->context.tsb_map_vaddr = base; + mm->context.tsb_map_pte = tte; + } + +} + +/* The page tables are locked against modifications while this + * runs. + * + * XXX do some prefetching... + */ +static void copy_tsb(struct tsb *old_tsb, unsigned long old_size, + struct tsb *new_tsb, unsigned long new_size) +{ + unsigned long old_nentries = old_size / sizeof(struct tsb); + unsigned long new_nentries = new_size / sizeof(struct tsb); + unsigned long i; + + for (i = 0; i < old_nentries; i++) { + register unsigned long tag asm("o4"); + register unsigned long pte asm("o5"); + unsigned long v, hash; + + if (tlb_type == cheetah_plus) { + __asm__ __volatile__( + "ldda [%2] %3, %0" + : "=r" (tag), "=r" (pte) + : "r" (__pa(&old_tsb[i])), + "i" (ASI_QUAD_LDD_PHYS)); + } else { + __asm__ __volatile__( + "ldda [%2] %3, %0" + : "=r" (tag), "=r" (pte) + : "r" (&old_tsb[i]), + "i" (ASI_NUCLEUS_QUAD_LDD)); + } + + if (!tag || (tag & (1UL << TSB_TAG_LOCK_BIT))) + continue; + + /* We only put base page size PTEs into the TSB, + * but that might change in the future. This code + * would need to be changed if we start putting larger + * page size PTEs into there. + */ + WARN_ON((pte & _PAGE_ALL_SZ_BITS) != _PAGE_SZBITS); + + /* The tag holds bits 22 to 63 of the virtual address + * and the context. Clear out the context, and shift + * up to make a virtual address. + */ + v = (tag & ((1UL << 42UL) - 1UL)) << 22UL; + + /* The implied bits of the tag (bits 13 to 21) are + * determined by the TSB entry index, so fill that in. + */ + v |= (i & (512UL - 1UL)) << 13UL; + + hash = tsb_hash(v, new_nentries); + if (tlb_type == cheetah_plus) { + __asm__ __volatile__( + "stxa %0, [%1] %2\n\t" + "stxa %3, [%4] %2" + : /* no outputs */ + : "r" (tag), + "r" (__pa(&new_tsb[hash].tag)), + "i" (ASI_PHYS_USE_EC), + "r" (pte), + "r" (__pa(&new_tsb[hash].pte))); + } else { + new_tsb[hash].tag = tag; + new_tsb[hash].pte = pte; + } + } +} + +/* When the RSS of an address space exceeds mm->context.tsb_rss_limit, + * update_mmu_cache() invokes this routine to try and grow the TSB. + * When we reach the maximum TSB size supported, we stick ~0UL into + * mm->context.tsb_rss_limit so the grow checks in update_mmu_cache() + * will not trigger any longer. + * + * The TSB can be anywhere from 8K to 1MB in size, in increasing powers + * of two. The TSB must be aligned to it's size, so f.e. a 512K TSB + * must be 512K aligned. + * + * The idea here is to grow the TSB when the RSS of the process approaches + * the number of entries that the current TSB can hold at once. Currently, + * we trigger when the RSS hits 3/4 of the TSB capacity. + */ +void tsb_grow(struct mm_struct *mm, unsigned long rss, gfp_t gfp_flags) +{ + unsigned long max_tsb_size = 1 * 1024 * 1024; + unsigned long size, old_size; + struct page *page; + struct tsb *old_tsb; + + if (max_tsb_size > (PAGE_SIZE << MAX_ORDER)) + max_tsb_size = (PAGE_SIZE << MAX_ORDER); + + for (size = PAGE_SIZE; size < max_tsb_size; size <<= 1UL) { + unsigned long n_entries = size / sizeof(struct tsb); + + n_entries = (n_entries * 3) / 4; + if (n_entries > rss) + break; + } + + page = alloc_pages(gfp_flags | __GFP_ZERO, get_order(size)); + if (unlikely(!page)) + return; + + if (size == max_tsb_size) + mm->context.tsb_rss_limit = ~0UL; + else + mm->context.tsb_rss_limit = + ((size / sizeof(struct tsb)) * 3) / 4; + + old_tsb = mm->context.tsb; + old_size = mm->context.tsb_nentries * sizeof(struct tsb); + + if (old_tsb) + copy_tsb(old_tsb, old_size, page_address(page), size); + + mm->context.tsb = page_address(page); + setup_tsb_params(mm, size); + + /* If old_tsb is NULL, we're being invoked for the first time + * from init_new_context(). + */ + if (old_tsb) { + /* Now force all other processors to reload the new + * TSB state. + */ + smp_tsb_sync(mm); + + /* Finally reload it on the local cpu. No further + * references will remain to the old TSB and we can + * thus free it up. + */ + tsb_context_switch(mm); + + free_pages((unsigned long) old_tsb, get_order(old_size)); + } +} + +int init_new_context(struct task_struct *tsk, struct mm_struct *mm) +{ + + mm->context.sparc64_ctx_val = 0UL; + + /* copy_mm() copies over the parent's mm_struct before calling + * us, so we need to zero out the TSB pointer or else tsb_grow() + * will be confused and think there is an older TSB to free up. + */ + mm->context.tsb = NULL; + tsb_grow(mm, 0, GFP_KERNEL); + + if (unlikely(!mm->context.tsb)) + return -ENOMEM; + + return 0; +} + +void destroy_context(struct mm_struct *mm) +{ + unsigned long size = mm->context.tsb_nentries * sizeof(struct tsb); + + free_pages((unsigned long) mm->context.tsb, get_order(size)); + + /* We can remove these later, but for now it's useful + * to catch any bogus post-destroy_context() references + * to the TSB. + */ + mm->context.tsb = NULL; + mm->context.tsb_reg_val = 0UL; + + spin_lock(&ctx_alloc_lock); + + if (CTX_VALID(mm->context)) { + unsigned long nr = CTX_NRBITS(mm->context); + mmu_context_bmap[nr>>6] &= ~(1UL << (nr & 63)); + } + + spin_unlock(&ctx_alloc_lock); +} diff --git a/arch/sparc64/mm/ultra.S b/arch/sparc64/mm/ultra.S index e4c9151..8c24493 100644 --- a/arch/sparc64/mm/ultra.S +++ b/arch/sparc64/mm/ultra.S @@ -15,6 +15,7 @@ #include #include #include +#include /* Basically, most of the Spitfire vs. Cheetah madness * has to do with the fact that Cheetah does not support @@ -29,16 +30,18 @@ .text .align 32 .globl __flush_tlb_mm -__flush_tlb_mm: /* %o0=(ctx & TAG_CONTEXT_BITS), %o1=SECONDARY_CONTEXT */ +__flush_tlb_mm: /* 18 insns */ + /* %o0=(ctx & TAG_CONTEXT_BITS), %o1=SECONDARY_CONTEXT */ ldxa [%o1] ASI_DMMU, %g2 cmp %g2, %o0 bne,pn %icc, __spitfire_flush_tlb_mm_slow mov 0x50, %g3 stxa %g0, [%g3] ASI_DMMU_DEMAP stxa %g0, [%g3] ASI_IMMU_DEMAP + sethi %hi(KERNBASE), %g3 + flush %g3 retl - flush %g6 - nop + nop nop nop nop @@ -51,7 +54,7 @@ __flush_tlb_mm: /* %o0=(ctx & TAG_CONTEX .align 32 .globl __flush_tlb_pending -__flush_tlb_pending: +__flush_tlb_pending: /* 26 insns */ /* %o0 = context, %o1 = nr, %o2 = vaddrs[] */ rdpr %pstate, %g7 sllx %o1, 3, %o1 @@ -72,7 +75,8 @@ __flush_tlb_pending: brnz,pt %o1, 1b nop stxa %g2, [%o4] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %o4 + flush %o4 retl wrpr %g7, 0x0, %pstate nop @@ -82,7 +86,8 @@ __flush_tlb_pending: .align 32 .globl __flush_tlb_kernel_range -__flush_tlb_kernel_range: /* %o0=start, %o1=end */ +__flush_tlb_kernel_range: /* 14 insns */ + /* %o0=start, %o1=end */ cmp %o0, %o1 be,pn %xcc, 2f sethi %hi(PAGE_SIZE), %o4 @@ -94,8 +99,11 @@ __flush_tlb_kernel_range: /* %o0=start, membar #Sync brnz,pt %o3, 1b sub %o3, %o4, %o3 -2: retl - flush %g6 +2: sethi %hi(KERNBASE), %o3 + flush %o3 + retl + nop + nop __spitfire_flush_tlb_mm_slow: rdpr %pstate, %g1 @@ -105,7 +113,8 @@ __spitfire_flush_tlb_mm_slow: stxa %g0, [%g3] ASI_IMMU_DEMAP flush %g6 stxa %g2, [%o1] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %o1 + flush %o1 retl wrpr %g1, 0, %pstate @@ -181,7 +190,7 @@ __flush_dcache_page: /* %o0=kaddr, %o1=f .previous /* Cheetah specific versions, patched at boot time. */ -__cheetah_flush_tlb_mm: /* 18 insns */ +__cheetah_flush_tlb_mm: /* 19 insns */ rdpr %pstate, %g7 andn %g7, PSTATE_IE, %g2 wrpr %g2, 0x0, %pstate @@ -196,12 +205,13 @@ __cheetah_flush_tlb_mm: /* 18 insns */ stxa %g0, [%g3] ASI_DMMU_DEMAP stxa %g0, [%g3] ASI_IMMU_DEMAP stxa %g2, [%o2] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %o2 + flush %o2 wrpr %g0, 0, %tl retl wrpr %g7, 0x0, %pstate -__cheetah_flush_tlb_pending: /* 26 insns */ +__cheetah_flush_tlb_pending: /* 27 insns */ /* %o0 = context, %o1 = nr, %o2 = vaddrs[] */ rdpr %pstate, %g7 sllx %o1, 3, %o1 @@ -225,7 +235,8 @@ __cheetah_flush_tlb_pending: /* 26 insns brnz,pt %o1, 1b nop stxa %g2, [%o4] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %o4 + flush %o4 wrpr %g0, 0, %tl retl wrpr %g7, 0x0, %pstate @@ -245,7 +256,63 @@ __cheetah_flush_dcache_page: /* 11 insns nop #endif /* DCACHE_ALIASING_POSSIBLE */ -cheetah_patch_one: + /* Hypervisor specific versions, patched at boot time. */ +__hypervisor_flush_tlb_mm: /* 8 insns */ + mov %o0, %o2 /* ARG2: mmu context */ + mov 0, %o0 /* ARG0: CPU lists unimplemented */ + mov 0, %o1 /* ARG1: CPU lists unimplemented */ + mov HV_MMU_ALL, %o3 /* ARG3: flags */ + mov HV_FAST_MMU_DEMAP_CTX, %o5 + ta HV_FAST_TRAP + retl + nop + +__hypervisor_flush_tlb_pending: /* 15 insns */ + /* %o0 = context, %o1 = nr, %o2 = vaddrs[] */ + sllx %o1, 3, %g1 + mov %o2, %g2 + mov %o0, %g3 +1: sub %g1, (1 << 3), %g1 + ldx [%g2 + %g1], %o0 /* ARG0: vaddr + IMMU-bit */ + mov %g3, %o1 /* ARG1: mmu context */ + mov HV_MMU_DMMU, %o2 + andcc %o0, 1, %g0 + movne %icc, HV_MMU_ALL, %o2 /* ARG2: flags */ + andn %o0, 1, %o0 + ta HV_MMU_UNMAP_ADDR_TRAP + brnz,pt %g1, 1b + nop + retl + nop + +__hypervisor_flush_tlb_kernel_range: /* 14 insns */ + /* %o0=start, %o1=end */ + cmp %o0, %o1 + be,pn %xcc, 2f + sethi %hi(PAGE_SIZE), %g3 + mov %o0, %g1 + sub %o1, %g1, %g2 + sub %g2, %g3, %g2 +1: add %g1, %g2, %o0 /* ARG0: virtual address */ + mov 0, %o1 /* ARG1: mmu context */ + mov HV_MMU_ALL, %o2 /* ARG2: flags */ + ta HV_MMU_UNMAP_ADDR_TRAP + brnz,pt %g2, 1b + sub %g2, %g3, %g2 +2: retl + nop + +#ifdef DCACHE_ALIASING_POSSIBLE + /* XXX Niagara and friends have an 8K cache, so no aliasing is + * XXX possible, but nothing explicit in the Hypervisor API + * XXX guarantees this. + */ +__hypervisor_flush_dcache_page: /* 2 insns */ + retl + nop +#endif + +tlb_patch_one: 1: lduw [%o1], %g1 stw %g1, [%o0] flush %o0 @@ -264,22 +331,22 @@ cheetah_patch_cachetlbops: or %o0, %lo(__flush_tlb_mm), %o0 sethi %hi(__cheetah_flush_tlb_mm), %o1 or %o1, %lo(__cheetah_flush_tlb_mm), %o1 - call cheetah_patch_one - mov 18, %o2 + call tlb_patch_one + mov 19, %o2 sethi %hi(__flush_tlb_pending), %o0 or %o0, %lo(__flush_tlb_pending), %o0 sethi %hi(__cheetah_flush_tlb_pending), %o1 or %o1, %lo(__cheetah_flush_tlb_pending), %o1 - call cheetah_patch_one - mov 26, %o2 + call tlb_patch_one + mov 27, %o2 #ifdef DCACHE_ALIASING_POSSIBLE sethi %hi(__flush_dcache_page), %o0 or %o0, %lo(__flush_dcache_page), %o0 sethi %hi(__cheetah_flush_dcache_page), %o1 or %o1, %lo(__cheetah_flush_dcache_page), %o1 - call cheetah_patch_one + call tlb_patch_one mov 11, %o2 #endif /* DCACHE_ALIASING_POSSIBLE */ @@ -295,16 +362,14 @@ cheetah_patch_cachetlbops: * %g1 address arg 1 (tlb page and range flushes) * %g7 address arg 2 (tlb range flush only) * - * %g6 ivector table, don't touch - * %g2 scratch 1 - * %g3 scratch 2 - * %g4 scratch 3 - * - * TODO: Make xcall TLB range flushes use the tricks above... -DaveM + * %g6 scratch 1 + * %g2 scratch 2 + * %g3 scratch 3 + * %g4 scratch 4 */ .align 32 .globl xcall_flush_tlb_mm -xcall_flush_tlb_mm: +xcall_flush_tlb_mm: /* 18 insns */ mov PRIMARY_CONTEXT, %g2 ldxa [%g2] ASI_DMMU, %g3 srlx %g3, CTX_PGSZ1_NUC_SHIFT, %g4 @@ -316,9 +381,16 @@ xcall_flush_tlb_mm: stxa %g0, [%g4] ASI_IMMU_DEMAP stxa %g3, [%g2] ASI_DMMU retry + nop + nop + nop + nop + nop + nop + nop .globl xcall_flush_tlb_pending -xcall_flush_tlb_pending: +xcall_flush_tlb_pending: /* 20 insns */ /* %g5=context, %g1=nr, %g7=vaddrs[] */ sllx %g1, 3, %g1 mov PRIMARY_CONTEXT, %g4 @@ -343,7 +415,7 @@ xcall_flush_tlb_pending: retry .globl xcall_flush_tlb_kernel_range -xcall_flush_tlb_kernel_range: +xcall_flush_tlb_kernel_range: /* 22 insns */ sethi %hi(PAGE_SIZE - 1), %g2 or %g2, %lo(PAGE_SIZE - 1), %g2 andn %g1, %g2, %g1 @@ -360,14 +432,27 @@ xcall_flush_tlb_kernel_range: retry nop nop + nop + nop + nop + nop + nop + nop /* This runs in a very controlled environment, so we do * not need to worry about BH races etc. */ .globl xcall_sync_tick xcall_sync_tick: - rdpr %pstate, %g2 + +661: rdpr %pstate, %g2 wrpr %g2, PSTATE_IG | PSTATE_AG, %pstate + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + rdpr %pil, %g2 wrpr %g0, 15, %pil sethi %hi(109f), %g7 @@ -390,8 +475,15 @@ xcall_sync_tick: */ .globl xcall_report_regs xcall_report_regs: - rdpr %pstate, %g2 + +661: rdpr %pstate, %g2 wrpr %g2, PSTATE_IG | PSTATE_AG, %pstate + .section .sun4v_2insn_patch, "ax" + .word 661b + nop + nop + .previous + rdpr %pil, %g2 wrpr %g0, 15, %pil sethi %hi(109f), %g7 @@ -453,62 +545,74 @@ xcall_flush_dcache_page_spitfire: /* %g1 nop nop - .data - -errata32_hwbug: - .xword 0 - - .text - - /* These two are not performance critical... */ - .globl xcall_flush_tlb_all_spitfire -xcall_flush_tlb_all_spitfire: - /* Spitfire Errata #32 workaround. */ - sethi %hi(errata32_hwbug), %g4 - stx %g0, [%g4 + %lo(errata32_hwbug)] - - clr %g2 - clr %g3 -1: ldxa [%g3] ASI_DTLB_DATA_ACCESS, %g4 - and %g4, _PAGE_L, %g5 - brnz,pn %g5, 2f - mov TLB_TAG_ACCESS, %g7 - - stxa %g0, [%g7] ASI_DMMU + .globl __hypervisor_xcall_flush_tlb_mm +__hypervisor_xcall_flush_tlb_mm: /* 18 insns */ + /* %g5=ctx, g1,g2,g3,g4,g7=scratch, %g6=unusable */ + mov %o0, %g2 + mov %o1, %g3 + mov %o2, %g4 + mov %o3, %g1 + mov %o5, %g7 + clr %o0 /* ARG0: CPU lists unimplemented */ + clr %o1 /* ARG1: CPU lists unimplemented */ + mov %g5, %o2 /* ARG2: mmu context */ + mov HV_MMU_ALL, %o3 /* ARG3: flags */ + mov HV_FAST_MMU_DEMAP_CTX, %o5 + ta HV_FAST_TRAP + mov %g2, %o0 + mov %g3, %o1 + mov %g4, %o2 + mov %g1, %o3 + mov %g7, %o5 membar #Sync - stxa %g0, [%g3] ASI_DTLB_DATA_ACCESS - membar #Sync - - /* Spitfire Errata #32 workaround. */ - sethi %hi(errata32_hwbug), %g4 - stx %g0, [%g4 + %lo(errata32_hwbug)] - -2: ldxa [%g3] ASI_ITLB_DATA_ACCESS, %g4 - and %g4, _PAGE_L, %g5 - brnz,pn %g5, 2f - mov TLB_TAG_ACCESS, %g7 + retry - stxa %g0, [%g7] ASI_IMMU - membar #Sync - stxa %g0, [%g3] ASI_ITLB_DATA_ACCESS + .globl __hypervisor_xcall_flush_tlb_pending +__hypervisor_xcall_flush_tlb_pending: /* 18 insns */ + /* %g5=ctx, %g1=nr, %g7=vaddrs[], %g2,%g3,%g4=scratch, %g6=unusable */ + sllx %g1, 3, %g1 + mov %o0, %g2 + mov %o1, %g3 + mov %o2, %g4 +1: sub %g1, (1 << 3), %g1 + ldx [%g7 + %g1], %o0 /* ARG0: virtual address */ + mov %g5, %o1 /* ARG1: mmu context */ + mov HV_MMU_DMMU, %o2 + andcc %o0, 1, %g0 + movne %icc, HV_MMU_ALL, %o2 /* ARG2: flags */ + ta HV_MMU_UNMAP_ADDR_TRAP + brnz,pt %g1, 1b + nop + mov %g2, %o0 + mov %g3, %o1 + mov %g4, %o2 membar #Sync - - /* Spitfire Errata #32 workaround. */ - sethi %hi(errata32_hwbug), %g4 - stx %g0, [%g4 + %lo(errata32_hwbug)] - -2: add %g2, 1, %g2 - cmp %g2, SPITFIRE_HIGHEST_LOCKED_TLBENT - ble,pt %icc, 1b - sll %g2, 3, %g3 - flush %g6 retry - .globl xcall_flush_tlb_all_cheetah -xcall_flush_tlb_all_cheetah: - mov 0x80, %g2 - stxa %g0, [%g2] ASI_DMMU_DEMAP - stxa %g0, [%g2] ASI_IMMU_DEMAP + .globl __hypervisor_xcall_flush_tlb_kernel_range +__hypervisor_xcall_flush_tlb_kernel_range: /* 22 insns */ + /* %g1=start, %g7=end, g2,g3,g4,g5=scratch, g6=unusable */ + sethi %hi(PAGE_SIZE - 1), %g2 + or %g2, %lo(PAGE_SIZE - 1), %g2 + andn %g1, %g2, %g1 + andn %g7, %g2, %g7 + sub %g7, %g1, %g3 + add %g2, 1, %g2 + sub %g3, %g2, %g3 + mov %o0, %g2 + mov %o1, %g4 + mov %o2, %g5 +1: add %g1, %g3, %o0 /* ARG0: virtual address */ + mov 0, %o1 /* ARG1: mmu context */ + mov HV_MMU_ALL, %o2 /* ARG2: flags */ + ta HV_MMU_UNMAP_ADDR_TRAP + sethi %hi(PAGE_SIZE), %o2 + brnz,pt %g3, 1b + sub %g3, %o2, %g3 + mov %g2, %o0 + mov %g4, %o1 + mov %g5, %o2 + membar #Sync retry /* These just get rescheduled to PIL vectors. */ @@ -528,3 +632,64 @@ xcall_capture: retry #endif /* CONFIG_SMP */ + + + .globl hypervisor_patch_cachetlbops +hypervisor_patch_cachetlbops: + save %sp, -128, %sp + + sethi %hi(__flush_tlb_mm), %o0 + or %o0, %lo(__flush_tlb_mm), %o0 + sethi %hi(__hypervisor_flush_tlb_mm), %o1 + or %o1, %lo(__hypervisor_flush_tlb_mm), %o1 + call tlb_patch_one + mov 8, %o2 + + sethi %hi(__flush_tlb_pending), %o0 + or %o0, %lo(__flush_tlb_pending), %o0 + sethi %hi(__hypervisor_flush_tlb_pending), %o1 + or %o1, %lo(__hypervisor_flush_tlb_pending), %o1 + call tlb_patch_one + mov 15, %o2 + + sethi %hi(__flush_tlb_kernel_range), %o0 + or %o0, %lo(__flush_tlb_kernel_range), %o0 + sethi %hi(__hypervisor_flush_tlb_kernel_range), %o1 + or %o1, %lo(__hypervisor_flush_tlb_kernel_range), %o1 + call tlb_patch_one + mov 14, %o2 + +#ifdef DCACHE_ALIASING_POSSIBLE + sethi %hi(__flush_dcache_page), %o0 + or %o0, %lo(__flush_dcache_page), %o0 + sethi %hi(__hypervisor_flush_dcache_page), %o1 + or %o1, %lo(__hypervisor_flush_dcache_page), %o1 + call tlb_patch_one + mov 2, %o2 +#endif /* DCACHE_ALIASING_POSSIBLE */ + +#ifdef CONFIG_SMP + sethi %hi(xcall_flush_tlb_mm), %o0 + or %o0, %lo(xcall_flush_tlb_mm), %o0 + sethi %hi(__hypervisor_xcall_flush_tlb_mm), %o1 + or %o1, %lo(__hypervisor_xcall_flush_tlb_mm), %o1 + call tlb_patch_one + mov 18, %o2 + + sethi %hi(xcall_flush_tlb_pending), %o0 + or %o0, %lo(xcall_flush_tlb_pending), %o0 + sethi %hi(__hypervisor_xcall_flush_tlb_pending), %o1 + or %o1, %lo(__hypervisor_xcall_flush_tlb_pending), %o1 + call tlb_patch_one + mov 18, %o2 + + sethi %hi(xcall_flush_tlb_kernel_range), %o0 + or %o0, %lo(xcall_flush_tlb_kernel_range), %o0 + sethi %hi(__hypervisor_xcall_flush_tlb_kernel_range), %o1 + or %o1, %lo(__hypervisor_xcall_flush_tlb_kernel_range), %o1 + call tlb_patch_one + mov 22, %o2 +#endif /* CONFIG_SMP */ + + ret + restore diff --git a/arch/sparc64/prom/cif.S b/arch/sparc64/prom/cif.S index 29d0ae7..5f27ad7 100644 --- a/arch/sparc64/prom/cif.S +++ b/arch/sparc64/prom/cif.S @@ -1,10 +1,12 @@ /* cif.S: PROM entry/exit assembler trampolines. * - * Copyright (C) 1996,1997 Jakub Jelinek (jj@sunsite.mff.cuni.cz) - * Copyright (C) 2005 David S. Miller + * Copyright (C) 1996, 1997 Jakub Jelinek (jj@sunsite.mff.cuni.cz) + * Copyright (C) 2005, 2006 David S. Miller */ #include +#include +#include .text .globl prom_cif_interface @@ -12,78 +14,16 @@ prom_cif_interface: sethi %hi(p1275buf), %o0 or %o0, %lo(p1275buf), %o0 ldx [%o0 + 0x010], %o1 ! prom_cif_stack - save %o1, -0x190, %sp + save %o1, -192, %sp ldx [%i0 + 0x008], %l2 ! prom_cif_handler - rdpr %pstate, %l4 - wrpr %g0, 0x15, %pstate ! save alternate globals - stx %g1, [%sp + 2047 + 0x0b0] - stx %g2, [%sp + 2047 + 0x0b8] - stx %g3, [%sp + 2047 + 0x0c0] - stx %g4, [%sp + 2047 + 0x0c8] - stx %g5, [%sp + 2047 + 0x0d0] - stx %g6, [%sp + 2047 + 0x0d8] - stx %g7, [%sp + 2047 + 0x0e0] - wrpr %g0, 0x814, %pstate ! save interrupt globals - stx %g1, [%sp + 2047 + 0x0e8] - stx %g2, [%sp + 2047 + 0x0f0] - stx %g3, [%sp + 2047 + 0x0f8] - stx %g4, [%sp + 2047 + 0x100] - stx %g5, [%sp + 2047 + 0x108] - stx %g6, [%sp + 2047 + 0x110] - stx %g7, [%sp + 2047 + 0x118] - wrpr %g0, 0x14, %pstate ! save normal globals - stx %g1, [%sp + 2047 + 0x120] - stx %g2, [%sp + 2047 + 0x128] - stx %g3, [%sp + 2047 + 0x130] - stx %g4, [%sp + 2047 + 0x138] - stx %g5, [%sp + 2047 + 0x140] - stx %g6, [%sp + 2047 + 0x148] - stx %g7, [%sp + 2047 + 0x150] - wrpr %g0, 0x414, %pstate ! save mmu globals - stx %g1, [%sp + 2047 + 0x158] - stx %g2, [%sp + 2047 + 0x160] - stx %g3, [%sp + 2047 + 0x168] - stx %g4, [%sp + 2047 + 0x170] - stx %g5, [%sp + 2047 + 0x178] - stx %g6, [%sp + 2047 + 0x180] - stx %g7, [%sp + 2047 + 0x188] - mov %g1, %l0 ! also save to locals, so we can handle - mov %g2, %l1 ! tlb faults later on, when accessing - mov %g3, %l3 ! the stack. - mov %g7, %l5 - wrpr %l4, PSTATE_IE, %pstate ! turn off interrupts + mov %g4, %l0 + mov %g5, %l1 + mov %g6, %l3 call %l2 add %i0, 0x018, %o0 ! prom_args - wrpr %g0, 0x414, %pstate ! restore mmu globals - mov %l0, %g1 - mov %l1, %g2 - mov %l3, %g3 - mov %l5, %g7 - wrpr %g0, 0x14, %pstate ! restore normal globals - ldx [%sp + 2047 + 0x120], %g1 - ldx [%sp + 2047 + 0x128], %g2 - ldx [%sp + 2047 + 0x130], %g3 - ldx [%sp + 2047 + 0x138], %g4 - ldx [%sp + 2047 + 0x140], %g5 - ldx [%sp + 2047 + 0x148], %g6 - ldx [%sp + 2047 + 0x150], %g7 - wrpr %g0, 0x814, %pstate ! restore interrupt globals - ldx [%sp + 2047 + 0x0e8], %g1 - ldx [%sp + 2047 + 0x0f0], %g2 - ldx [%sp + 2047 + 0x0f8], %g3 - ldx [%sp + 2047 + 0x100], %g4 - ldx [%sp + 2047 + 0x108], %g5 - ldx [%sp + 2047 + 0x110], %g6 - ldx [%sp + 2047 + 0x118], %g7 - wrpr %g0, 0x15, %pstate ! restore alternate globals - ldx [%sp + 2047 + 0x0b0], %g1 - ldx [%sp + 2047 + 0x0b8], %g2 - ldx [%sp + 2047 + 0x0c0], %g3 - ldx [%sp + 2047 + 0x0c8], %g4 - ldx [%sp + 2047 + 0x0d0], %g5 - ldx [%sp + 2047 + 0x0d8], %g6 - ldx [%sp + 2047 + 0x0e0], %g7 - wrpr %l4, 0, %pstate ! restore original pstate + mov %l0, %g4 + mov %l1, %g5 + mov %l3, %g6 ret restore @@ -91,135 +31,18 @@ prom_cif_interface: prom_cif_callback: sethi %hi(p1275buf), %o1 or %o1, %lo(p1275buf), %o1 - save %sp, -0x270, %sp - rdpr %pstate, %l4 - wrpr %g0, 0x15, %pstate ! save PROM alternate globals - stx %g1, [%sp + 2047 + 0x0b0] - stx %g2, [%sp + 2047 + 0x0b8] - stx %g3, [%sp + 2047 + 0x0c0] - stx %g4, [%sp + 2047 + 0x0c8] - stx %g5, [%sp + 2047 + 0x0d0] - stx %g6, [%sp + 2047 + 0x0d8] - stx %g7, [%sp + 2047 + 0x0e0] - ! restore Linux alternate globals - ldx [%sp + 2047 + 0x190], %g1 - ldx [%sp + 2047 + 0x198], %g2 - ldx [%sp + 2047 + 0x1a0], %g3 - ldx [%sp + 2047 + 0x1a8], %g4 - ldx [%sp + 2047 + 0x1b0], %g5 - ldx [%sp + 2047 + 0x1b8], %g6 - ldx [%sp + 2047 + 0x1c0], %g7 - wrpr %g0, 0x814, %pstate ! save PROM interrupt globals - stx %g1, [%sp + 2047 + 0x0e8] - stx %g2, [%sp + 2047 + 0x0f0] - stx %g3, [%sp + 2047 + 0x0f8] - stx %g4, [%sp + 2047 + 0x100] - stx %g5, [%sp + 2047 + 0x108] - stx %g6, [%sp + 2047 + 0x110] - stx %g7, [%sp + 2047 + 0x118] - ! restore Linux interrupt globals - ldx [%sp + 2047 + 0x1c8], %g1 - ldx [%sp + 2047 + 0x1d0], %g2 - ldx [%sp + 2047 + 0x1d8], %g3 - ldx [%sp + 2047 + 0x1e0], %g4 - ldx [%sp + 2047 + 0x1e8], %g5 - ldx [%sp + 2047 + 0x1f0], %g6 - ldx [%sp + 2047 + 0x1f8], %g7 - wrpr %g0, 0x14, %pstate ! save PROM normal globals - stx %g1, [%sp + 2047 + 0x120] - stx %g2, [%sp + 2047 + 0x128] - stx %g3, [%sp + 2047 + 0x130] - stx %g4, [%sp + 2047 + 0x138] - stx %g5, [%sp + 2047 + 0x140] - stx %g6, [%sp + 2047 + 0x148] - stx %g7, [%sp + 2047 + 0x150] - ! restore Linux normal globals - ldx [%sp + 2047 + 0x200], %g1 - ldx [%sp + 2047 + 0x208], %g2 - ldx [%sp + 2047 + 0x210], %g3 - ldx [%sp + 2047 + 0x218], %g4 - ldx [%sp + 2047 + 0x220], %g5 - ldx [%sp + 2047 + 0x228], %g6 - ldx [%sp + 2047 + 0x230], %g7 - wrpr %g0, 0x414, %pstate ! save PROM mmu globals - stx %g1, [%sp + 2047 + 0x158] - stx %g2, [%sp + 2047 + 0x160] - stx %g3, [%sp + 2047 + 0x168] - stx %g4, [%sp + 2047 + 0x170] - stx %g5, [%sp + 2047 + 0x178] - stx %g6, [%sp + 2047 + 0x180] - stx %g7, [%sp + 2047 + 0x188] - ! restore Linux mmu globals - ldx [%sp + 2047 + 0x238], %o0 - ldx [%sp + 2047 + 0x240], %o1 - ldx [%sp + 2047 + 0x248], %l2 - ldx [%sp + 2047 + 0x250], %l3 - ldx [%sp + 2047 + 0x258], %l5 - ldx [%sp + 2047 + 0x260], %l6 - ldx [%sp + 2047 + 0x268], %l7 - ! switch to Linux tba - sethi %hi(sparc64_ttable_tl0), %l1 - rdpr %tba, %l0 ! save PROM tba - mov %o0, %g1 - mov %o1, %g2 - mov %l2, %g3 - mov %l3, %g4 - mov %l5, %g5 - mov %l6, %g6 - mov %l7, %g7 - wrpr %l1, %tba ! install Linux tba - wrpr %l4, 0, %pstate ! restore PSTATE + save %sp, -192, %sp + TRAP_LOAD_THREAD_REG(%g6, %g1) + LOAD_PER_CPU_BASE(%g5, %g6, %g4, %g3, %o0) + ldx [%g6 + TI_TASK], %g4 call prom_world - mov %g0, %o0 + mov 0, %o0 ldx [%i1 + 0x000], %l2 call %l2 mov %i0, %o0 mov %o0, %l1 call prom_world - or %g0, 1, %o0 - wrpr %g0, 0x14, %pstate ! interrupts off - ! restore PROM mmu globals - ldx [%sp + 2047 + 0x158], %o0 - ldx [%sp + 2047 + 0x160], %o1 - ldx [%sp + 2047 + 0x168], %l2 - ldx [%sp + 2047 + 0x170], %l3 - ldx [%sp + 2047 + 0x178], %l5 - ldx [%sp + 2047 + 0x180], %l6 - ldx [%sp + 2047 + 0x188], %l7 - wrpr %g0, 0x414, %pstate ! restore PROM mmu globals - mov %o0, %g1 - mov %o1, %g2 - mov %l2, %g3 - mov %l3, %g4 - mov %l5, %g5 - mov %l6, %g6 - mov %l7, %g7 - wrpr %l0, %tba ! restore PROM tba - wrpr %g0, 0x14, %pstate ! restore PROM normal globals - ldx [%sp + 2047 + 0x120], %g1 - ldx [%sp + 2047 + 0x128], %g2 - ldx [%sp + 2047 + 0x130], %g3 - ldx [%sp + 2047 + 0x138], %g4 - ldx [%sp + 2047 + 0x140], %g5 - ldx [%sp + 2047 + 0x148], %g6 - ldx [%sp + 2047 + 0x150], %g7 - wrpr %g0, 0x814, %pstate ! restore PROM interrupt globals - ldx [%sp + 2047 + 0x0e8], %g1 - ldx [%sp + 2047 + 0x0f0], %g2 - ldx [%sp + 2047 + 0x0f8], %g3 - ldx [%sp + 2047 + 0x100], %g4 - ldx [%sp + 2047 + 0x108], %g5 - ldx [%sp + 2047 + 0x110], %g6 - ldx [%sp + 2047 + 0x118], %g7 - wrpr %g0, 0x15, %pstate ! restore PROM alternate globals - ldx [%sp + 2047 + 0x0b0], %g1 - ldx [%sp + 2047 + 0x0b8], %g2 - ldx [%sp + 2047 + 0x0c0], %g3 - ldx [%sp + 2047 + 0x0c8], %g4 - ldx [%sp + 2047 + 0x0d0], %g5 - ldx [%sp + 2047 + 0x0d8], %g6 - ldx [%sp + 2047 + 0x0e0], %g7 - wrpr %l4, 0, %pstate + mov 1, %o0 ret restore %l1, 0, %o0 diff --git a/include/asm-sparc64/asi.h b/include/asm-sparc64/asi.h index 5348556..662a211 100644 --- a/include/asm-sparc64/asi.h +++ b/include/asm-sparc64/asi.h @@ -25,14 +25,27 @@ /* SpitFire and later extended ASIs. The "(III)" marker designates * UltraSparc-III and later specific ASIs. The "(CMT)" marker designates - * Chip Multi Threading specific ASIs. + * Chip Multi Threading specific ASIs. "(NG)" designates Niagara specific + * ASIs, "(4V)" designates SUN4V specific ASIs. */ #define ASI_PHYS_USE_EC 0x14 /* PADDR, E-cachable */ #define ASI_PHYS_BYPASS_EC_E 0x15 /* PADDR, E-bit */ +#define ASI_BLK_AIUP_4V 0x16 /* (4V) Prim, user, block ld/st */ +#define ASI_BLK_AIUS_4V 0x17 /* (4V) Sec, user, block ld/st */ #define ASI_PHYS_USE_EC_L 0x1c /* PADDR, E-cachable, little endian*/ #define ASI_PHYS_BYPASS_EC_E_L 0x1d /* PADDR, E-bit, little endian */ +#define ASI_BLK_AIUP_L_4V 0x1e /* (4V) Prim, user, block, l-endian*/ +#define ASI_BLK_AIUS_L_4V 0x1f /* (4V) Sec, user, block, l-endian */ +#define ASI_SCRATCHPAD 0x20 /* (4V) Scratch Pad Registers */ +#define ASI_MMU 0x21 /* (4V) MMU Context Registers */ +#define ASI_BLK_INIT_QUAD_LDD_AIUS 0x23 /* (NG) init-store, twin load, + * secondary, user + */ #define ASI_NUCLEUS_QUAD_LDD 0x24 /* Cachable, qword load */ +#define ASI_QUEUE 0x25 /* (4V) Interrupt Queue Registers */ +#define ASI_QUAD_LDD_PHYS_4V 0x26 /* (4V) Physical, qword load */ #define ASI_NUCLEUS_QUAD_LDD_L 0x2c /* Cachable, qword load, l-endian */ +#define ASI_QUAD_LDD_PHYS_L_4V 0x2e /* (4V) Phys, qword load, l-endian */ #define ASI_PCACHE_DATA_STATUS 0x30 /* (III) PCache data stat RAM diag */ #define ASI_PCACHE_DATA 0x31 /* (III) PCache data RAM diag */ #define ASI_PCACHE_TAG 0x32 /* (III) PCache tag RAM diag */ @@ -137,6 +150,9 @@ #define ASI_FL16_SL 0xdb /* Secondary, 1 16-bit, fpu ld/st,L*/ #define ASI_BLK_COMMIT_P 0xe0 /* Primary, blk store commit */ #define ASI_BLK_COMMIT_S 0xe1 /* Secondary, blk store commit */ +#define ASI_BLK_INIT_QUAD_LDD_P 0xe2 /* (NG) init-store, twin load, + * primary, implicit + */ #define ASI_BLK_P 0xf0 /* Primary, blk ld/st */ #define ASI_BLK_S 0xf1 /* Secondary, blk ld/st */ #define ASI_BLK_PL 0xf8 /* Primary, blk ld/st, little */ diff --git a/include/asm-sparc64/cpudata.h b/include/asm-sparc64/cpudata.h index 74de79d..26b1dc9 100644 --- a/include/asm-sparc64/cpudata.h +++ b/include/asm-sparc64/cpudata.h @@ -1,12 +1,17 @@ /* cpudata.h: Per-cpu parameters. * - * Copyright (C) 2003, 2005 David S. Miller (davem@redhat.com) + * Copyright (C) 2003, 2005, 2006 David S. Miller (davem@davemloft.net) */ #ifndef _SPARC64_CPUDATA_H #define _SPARC64_CPUDATA_H +#include + +#ifndef __ASSEMBLY__ + #include +#include typedef struct { /* Dcache line 1 */ @@ -17,25 +22,181 @@ typedef struct { unsigned long clock_tick; /* %tick's per second */ unsigned long udelay_val; - /* Dcache line 2 */ - unsigned int pgcache_size; - unsigned int __pad1; - unsigned long *pte_cache[2]; - unsigned long *pgd_cache; - - /* Dcache line 3, rarely used */ + /* Dcache line 2, rarely used */ unsigned int dcache_size; unsigned int dcache_line_size; unsigned int icache_size; unsigned int icache_line_size; unsigned int ecache_size; unsigned int ecache_line_size; - unsigned int __pad2; unsigned int __pad3; + unsigned int __pad4; } cpuinfo_sparc; DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data); #define cpu_data(__cpu) per_cpu(__cpu_data, (__cpu)) #define local_cpu_data() __get_cpu_var(__cpu_data) +/* Trap handling code needs to get at a few critical values upon + * trap entry and to process TSB misses. These cannot be in the + * per_cpu() area as we really need to lock them into the TLB and + * thus make them part of the main kernel image. As a result we + * try to make this as small as possible. + * + * This is padded out and aligned to 64-bytes to avoid false sharing + * on SMP. + */ + +/* If you modify the size of this structure, please update + * TRAP_BLOCK_SZ_SHIFT below. + */ +struct thread_info; +struct trap_per_cpu { +/* D-cache line 1 */ + struct thread_info *thread; + unsigned long pgd_paddr; + unsigned long __pad1[2]; + +/* D-cache line 2 */ + unsigned long __pad2[4]; + +/* Dcache lines 3 and 4 */ + struct hv_fault_status fault_info; +} __attribute__((aligned(64))); +extern struct trap_per_cpu trap_block[NR_CPUS]; +extern void init_cur_cpu_trap(void); +extern void setup_tba(void); + +#ifdef CONFIG_SMP +struct cpuid_patch_entry { + unsigned int addr; + unsigned int cheetah_safari[4]; + unsigned int cheetah_jbus[4]; + unsigned int starfire[4]; + unsigned int sun4v[4]; +}; +extern struct cpuid_patch_entry __cpuid_patch, __cpuid_patch_end; +#endif + +struct sun4v_1insn_patch_entry { + unsigned int addr; + unsigned int insn; +}; +extern struct sun4v_1insn_patch_entry __sun4v_1insn_patch, + __sun4v_1insn_patch_end; + +struct sun4v_2insn_patch_entry { + unsigned int addr; + unsigned int insns[2]; +}; +extern struct sun4v_2insn_patch_entry __sun4v_2insn_patch, + __sun4v_2insn_patch_end; + +#endif /* !(__ASSEMBLY__) */ + +#define TRAP_PER_CPU_THREAD 0x00 +#define TRAP_PER_CPU_PGD_PADDR 0x08 +#define TRAP_PER_CPU_FAULT_INFO 0x20 + +#define TRAP_BLOCK_SZ_SHIFT 7 + +#include + +#ifdef CONFIG_SMP + +#define __GET_CPUID(REG) \ + /* Spitfire implementation (default). */ \ +661: ldxa [%g0] ASI_UPA_CONFIG, REG; \ + srlx REG, 17, REG; \ + and REG, 0x1f, REG; \ + nop; \ + .section .cpuid_patch, "ax"; \ + /* Instruction location. */ \ + .word 661b; \ + /* Cheetah Safari implementation. */ \ + ldxa [%g0] ASI_SAFARI_CONFIG, REG; \ + srlx REG, 17, REG; \ + and REG, 0x3ff, REG; \ + nop; \ + /* Cheetah JBUS implementation. */ \ + ldxa [%g0] ASI_JBUS_CONFIG, REG; \ + srlx REG, 17, REG; \ + and REG, 0x1f, REG; \ + nop; \ + /* Starfire implementation. */ \ + sethi %hi(0x1fff40000d0 >> 9), REG; \ + sllx REG, 9, REG; \ + or REG, 0xd0, REG; \ + lduwa [REG] ASI_PHYS_BYPASS_EC_E, REG;\ + /* sun4v implementation. */ \ + mov SCRATCHPAD_CPUID, REG; \ + nop; \ + ldxa [REG] ASI_SCRATCHPAD, REG; \ + nop; \ + .previous; + +/* Clobbers TMP, current address space PGD phys address into DEST. */ +#define TRAP_LOAD_PGD_PHYS(DEST, TMP) \ + __GET_CPUID(TMP) \ + sethi %hi(trap_block), DEST; \ + sllx TMP, TRAP_BLOCK_SZ_SHIFT, TMP; \ + or DEST, %lo(trap_block), DEST; \ + add DEST, TMP, DEST; \ + ldx [DEST + TRAP_PER_CPU_PGD_PADDR], DEST; + +/* Clobbers TMP, loads local processor's IRQ work area into DEST. */ +#define TRAP_LOAD_IRQ_WORK(DEST, TMP) \ + __GET_CPUID(TMP) \ + sethi %hi(__irq_work), DEST; \ + sllx TMP, 6, TMP; \ + or DEST, %lo(__irq_work), DEST; \ + add DEST, TMP, DEST; + +/* Clobbers TMP, loads DEST with current thread info pointer. */ +#define TRAP_LOAD_THREAD_REG(DEST, TMP) \ + __GET_CPUID(TMP) \ + sethi %hi(trap_block), DEST; \ + sllx TMP, TRAP_BLOCK_SZ_SHIFT, TMP; \ + or DEST, %lo(trap_block), DEST; \ + ldx [DEST + TMP], DEST; + +/* Given the current thread info pointer in THR, load the per-cpu + * area base of the current processor into DEST. REG1, REG2, and REG3 are + * clobbered. + * + * You absolutely cannot use DEST as a temporary in this code. The + * reason is that traps can happen during execution, and return from + * trap will load the fully resolved DEST per-cpu base. This can corrupt + * the calculations done by the macro mid-stream. + */ +#define LOAD_PER_CPU_BASE(DEST, THR, REG1, REG2, REG3) \ + ldub [THR + TI_CPU], REG1; \ + sethi %hi(__per_cpu_shift), REG3; \ + sethi %hi(__per_cpu_base), REG2; \ + ldx [REG3 + %lo(__per_cpu_shift)], REG3; \ + ldx [REG2 + %lo(__per_cpu_base)], REG2; \ + sllx REG1, REG3, REG3; \ + add REG3, REG2, DEST; + +#else + +/* Uniprocessor versions, we know the cpuid is zero. */ +#define TRAP_LOAD_PGD_PHYS(DEST, TMP) \ + sethi %hi(trap_block), DEST; \ + or DEST, %lo(trap_block), DEST; \ + ldx [DEST + TRAP_PER_CPU_PGD_PADDR], DEST; + +#define TRAP_LOAD_IRQ_WORK(DEST, TMP) \ + sethi %hi(__irq_work), DEST; \ + or DEST, %lo(__irq_work), DEST; + +#define TRAP_LOAD_THREAD_REG(DEST, TMP) \ + sethi %hi(trap_block), DEST; \ + ldx [DEST + %lo(trap_block)], DEST; + +/* No per-cpu areas on uniprocessor, so no need to load DEST. */ +#define LOAD_PER_CPU_BASE(DEST, THR, REG1, REG2, REG3) + +#endif /* !(CONFIG_SMP) */ + #endif /* _SPARC64_CPUDATA_H */ diff --git a/include/asm-sparc64/head.h b/include/asm-sparc64/head.h index 0abd3a6..ff76c09 100644 --- a/include/asm-sparc64/head.h +++ b/include/asm-sparc64/head.h @@ -4,12 +4,17 @@ #include + /* wrpr %g0, val, %gl */ +#define SET_GL(val) \ + .word 0xa1902000 | val + #define KERNBASE 0x400000 #define PTREGS_OFF (STACK_BIAS + STACKFRAME_SZ) #define __CHEETAH_ID 0x003e0014 #define __JALAPENO_ID 0x003e0016 +#define __SERRANO_ID 0x003e0022 #define CHEETAH_MANUF 0x003e #define CHEETAH_IMPL 0x0014 /* Ultra-III */ diff --git a/include/asm-sparc64/hypervisor.h b/include/asm-sparc64/hypervisor.h new file mode 100644 index 0000000..9c8e453 --- /dev/null +++ b/include/asm-sparc64/hypervisor.h @@ -0,0 +1,2072 @@ +#ifndef _SPARC64_HYPERVISOR_H +#define _SPARC64_HYPERVISOR_H + +/* Sun4v hypervisor interfaces and defines. + * + * Hypervisor calls are made via traps to software traps number 0x80 + * and above. Registers %o0 to %o5 serve as argument, status, and + * return value registers. + * + * There are two kinds of these traps. First there are the normal + * "fast traps" which use software trap 0x80 and encode the function + * to invoke by number in register %o5. Argument and return value + * handling is as follows: + * + * ----------------------------------------------- + * | %o5 | function number | undefined | + * | %o0 | argument 0 | return status | + * | %o1 | argument 1 | return value 1 | + * | %o2 | argument 2 | return value 2 | + * | %o3 | argument 3 | return value 3 | + * | %o4 | argument 4 | return value 4 | + * ----------------------------------------------- + * + * The second type are "hyper-fast traps" which encode the function + * number in the software trap number itself. So these use trap + * numbers > 0x80. The register usage for hyper-fast traps is as + * follows: + * + * ----------------------------------------------- + * | %o0 | argument 0 | return status | + * | %o1 | argument 1 | return value 1 | + * | %o2 | argument 2 | return value 2 | + * | %o3 | argument 3 | return value 3 | + * | %o4 | argument 4 | return value 4 | + * ----------------------------------------------- + * + * Registers providing explicit arguments to the hypervisor calls + * are volatile across the call. Upon return their values are + * undefined unless explicitly specified as containing a particular + * return value by the specific call. The return status is always + * returned in register %o0, zero indicates a successful execution of + * the hypervisor call and other values indicate an error status as + * defined below. So, for example, if a hyper-fast trap takes + * arguments 0, 1, and 2, then %o0, %o1, and %o2 are volatile across + * the call and %o3, %o4, and %o5 would be preserved. + * + * If the hypervisor trap is invalid, or the fast trap function number + * is invalid, HV_EBADTRAP will be returned in %o0. Also, all 64-bits + * of the argument and return values are significant. + */ + +/* Trap numbers. */ +#define HV_FAST_TRAP 0x80 +#define HV_MMU_MAP_ADDR_TRAP 0x83 +#define HV_MMU_UNMAP_ADDR_TRAP 0x84 +#define HV_TTRACE_ADDENTRY_TRAP 0x85 +#define HV_CORE_TRAP 0xff + +/* Error codes. */ +#define HV_EOK 0 /* Successful return */ +#define HV_ENOCPU 1 /* Invalid CPU id */ +#define HV_ENORADDR 2 /* Invalid real address */ +#define HV_ENOINTR 3 /* Invalid interrupt id */ +#define HV_EBADPGSZ 4 /* Invalid pagesize encoding */ +#define HV_EBADTSB 5 /* Invalid TSB description */ +#define HV_EINVAL 6 /* Invalid argument */ +#define HV_EBADTRAP 7 /* Invalid function number */ +#define HV_EBADALIGN 8 /* Invalid address alignment */ +#define HV_EWOULDBLOCK 9 /* Cannot complete w/o blocking */ +#define HV_ENOACCESS 10 /* No access to resource */ +#define HV_EIO 11 /* I/O error */ +#define HV_ECPUERROR 12 /* CPU in error state */ +#define HV_ENOTSUPPORTED 13 /* Function not supported */ +#define HV_ENOMAP 14 /* No mapping found */ +#define HV_ETOOMANY 15 /* Too many items specified */ + +/* mach_exit() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MACH_EXIT + * ARG0: exit code + * ERRORS: This service does not return. + * + * Stop all CPUs in the virtual domain and place them into the stopped + * state. The 64-bit exit code may be passed to a service entity as + * the domain's exit status. On systems without a service entity, the + * domain will undergo a reset, and the boot firmware will be + * reloaded. + * + * This function will never return to the guest that invokes it. + * + * Note: By convention an exit code of zero denotes a successful exit by + * the guest code. A non-zero exit code denotes a guest specific + * error indication. + * + */ +#define HV_FAST_MACH_EXIT 0x00 + +/* Domain services. */ + +/* mach_desc() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MACH_DESC + * ARG0: buffer + * ARG1: length + * RET0: status + * RET1: length + * ERRORS: HV_EBADALIGN Buffer is badly aligned + * HV_ENORADDR Buffer is to an illegal real address. + * HV_EINVAL Buffer length is too small for complete + * machine description. + * + * Copy the most current machine description into the buffer indicated + * by the real address in ARG0. The buffer provided must be 16 byte + * aligned. Upon success or HV_EINVAL, this service returns the + * actual size of the machine description in the RET1 return value. + * + * Note: A method of determining the appropriate buffer size for the + * machine description is to first call this service with a buffer + * length of 0 bytes. + */ +#define HV_FAST_MACH_DESC 0x01 + +/* mach_exit() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MACH_SIR + * ERRORS: This service does not return. + * + * Perform a software initiated reset of the virtual machine domain. + * All CPUs are captured as soon as possible, all hardware devices are + * returned to the entry default state, and the domain is restarted at + * the SIR (trap type 0x04) real trap table (RTBA) entry point on one + * of the CPUs. The single CPU restarted is selected as determined by + * platform specific policy. Memory is preserved across this + * operation. + */ +#define HV_FAST_MACH_SIR 0x02 + +/* mach_set_soft_state() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MACH_SET_SOFT_STATE + * ARG0: software state + * ARG1: software state description pointer + * RET0: status + * ERRORS: EINVAL software state not valid or software state + * description is not NULL terminated + * ENORADDR software state description pointer is not a + * valid real address + * EBADALIGNED software state description is not correctly + * aligned + * + * This allows the guest to report it's soft state to the hypervisor. There + * are two primary components to this state. The first part states whether + * the guest software is running or not. The second containts optional + * details specific to the software. + * + * The software state argument is defined below in HV_SOFT_STATE_*, and + * indicates whether the guest is operating normally or in a transitional + * state. + * + * The software state description argument is a real address of a data buffer + * of size 32-bytes aligned on a 32-byte boundary. It is treated as a NULL + * terminated 7-bit ASCII string of up to 31 characters not including the + * NULL termination. + */ +#define HV_FAST_MACH_SET_SOFT_STATE 0x03 +#define HV_SOFT_STATE_NORMAL 0x01 +#define HV_SOFT_STATE_TRANSITION 0x02 + +/* mach_get_soft_state() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MACH_GET_SOFT_STATE + * ARG0: software state description pointer + * RET0: status + * RET1: software state + * ERRORS: ENORADDR software state description pointer is not a + * valid real address + * EBADALIGNED software state description is not correctly + * aligned + * + * Retrieve the current value of the guest's software state. The rules + * for the software state pointer are the same as for mach_set_soft_state() + * above. + */ +#define HV_FAST_MACH_GET_SOFT_STATE 0x04 + +/* CPU services. + * + * CPUs represent devices that can execute software threads. A single + * chip that contains multiple cores or strands is represented as + * multiple CPUs with unique CPU identifiers. CPUs are exported to + * OBP via the machine description (and to the OS via the OBP device + * tree). CPUs are always in one of three states: stopped, running, + * or error. + * + * A CPU ID is a pre-assigned 16-bit value that uniquely identifies a + * CPU within a logical domain. Operations that are to be performed + * on multiple CPUs specify them via a CPU list. A CPU list is an + * array in real memory, of which each 16-bit word is a CPU ID. CPU + * lists are passed through the API as two arguments. The first is + * the number of entries (16-bit words) in the CPU list, and the + * second is the (real address) pointer to the CPU ID list. + */ + +/* cpu_start() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_START + * ARG0: CPU ID + * ARG1: PC + * ARG1: RTBA + * ARG1: target ARG0 + * RET0: status + * ERRORS: ENOCPU Invalid CPU ID + * EINVAL Target CPU ID is not in the stopped state + * ENORADDR Invalid PC or RTBA real address + * EBADALIGN Unaligned PC or unaligned RTBA + * EWOULDBLOCK Starting resources are not available + * + * Start CPU with given CPU ID with PC in %pc and with a real trap + * base address value of RTBA. The indicated CPU must be in the + * stopped state. The supplied RTBA must be aligned on a 256 byte + * boundary. On successful completion, the specified CPU will be in + * the running state and will be supplied with "target ARG0" in %o0 + * and RTBA in %tba. + */ +#define HV_FAST_CPU_START 0x10 + +/* cpu_stop() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_STOP + * ARG0: CPU ID + * RET0: status + * ERRORS: ENOCPU Invalid CPU ID + * EINVAL Target CPU ID is the current cpu + * EINVAL Target CPU ID is not in the running state + * EWOULDBLOCK Stopping resources are not available + * ENOTSUPPORTED Not supported on this platform + * + * The specified CPU is stopped. The indicated CPU must be in the + * running state. On completion, it will be in the stopped state. It + * is not legal to stop the current CPU. + * + * Note: As this service cannot be used to stop the current cpu, this service + * may not be used to stop the last running CPU in a domain. To stop + * and exit a running domain, a guest must use the mach_exit() service. + */ +#define HV_FAST_CPU_STOP 0x11 + +/* cpu_yield() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_YIELD + * RET0: status + * ERRORS: No possible error. + * + * Suspend execution on the current CPU. Execution will resume when + * an interrupt (device, %stick_compare, or cross-call) is targeted to + * the CPU. On some CPUs, this API may be used by the hypervisor to + * save power by disabling hardware strands. + */ +#define HV_FAST_CPU_YIELD 0x12 + + +/* cpu_qconf() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_QCONF + * ARG0: queue + * ARG1: base real address + * ARG2: number of entries + * RET0: status + * ERRORS: ENORADDR Invalid base real address + * EINVAL Invalid queue or number of entries is less + * than 2 or too large. + * EBADALIGN Base real address is not correctly aligned + * for size. + * + * Configure the given queue to be placed at the givem base real + * address, with the given number of entries. The number of entries + * must be a power of 2. The base real address must be aligned + * exactly to match the queue size. Each queue entry is 64 bytes + * long, so for example a 32 entry queue must be aligned on a 2048 + * byte real address boundary. + * + * The specified queue is unconfigured is number of entries is given as zero. + * + * For the current version of this API service, the argument queue is defined + * as follows: + * queue description + * ----- ------------------------- + * 0x3c cpu mondo queue + * 0x3d device mondo queue + * 0x3e resumable error queue + * 0x3f non-resumable error queue + * + * Note: The maximum number of entries for each queue for a specific cpu may + * be determined from the machine description. + */ +#define HV_FAST_CPU_QCONF 0x14 +#define HV_CPU_QUEUE_CPU_MONDO 0x3c +#define HV_CPU_QUEUE_DEVICE_MONDO 0x3d +#define HV_CPU_QUEUE_RES_ERROR 0x3e +#define HV_CPU_QUEUE_NONRES_ERROR 0x3f + +/* cpu_qinfo() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_QINFO + * ARG0: queue + * RET0: status + * RET1: base real address + * RET1: number of entries + * ERRORS: EINVAL Invalid queue + * + * Return the configuration info for the given queue. The base real + * address and number of entries of the defined queue are returned. + * The queue argument values are the same as for cpu_qconf() above. + * + * If the specified queue is a valid queue number, but no queue has + * been defined, the number of entries will be set to zero and the + * base real address returned is undefined. + */ +#define HV_FAST_CPU_QINFO 0x15 + +/* cpu_mondo_send() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_MONDO_SEND + * ARG0-1: CPU list + * ARG2: data real address + * RET0: status + * ERRORS: EBADALIGN Mondo data is not 64-byte aligned or CPU list + * is not 2-byte aligned. + * ENORADDR Invalid data mondo address, or invalid cpu list + * address. + * ENOCPU Invalid cpu in CPU list + * EWOULDBLOCK Some or all of the listed CPUs did not receive + * the mondo + * EINVAL CPU list includes caller's CPU ID + * + * Send a mondo interrupt to the CPUs in the given CPU list with the + * 64-bytes at the given data real address. The data must be 64-byte + * aligned. The mondo data will be delivered to the cpu_mondo queues + * of the recipient CPUs. + * + * In all cases, error or not, the CPUs in the CPU list to which the + * mondo has been successfully delivered will be indicated by having + * their entry in CPU list updated with the value 0xffff. + */ +#define HV_FAST_CPU_MONDO_SEND 0x42 + +/* cpu_myid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_MYID + * RET0: status + * RET1: CPU ID + * ERRORS: No errors defined. + * + * Return the hypervisor ID handle for the current CPU. Use by a + * virtual CPU to discover it's own identity. + */ +#define HV_FAST_CPU_MYID 0x16 + +/* cpu_state() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_STATE + * ARG0: CPU ID + * RET0: status + * RET1: state + * ERRORS: ENOCPU Invalid CPU ID + * + * Retrieve the current state of the CPU with the given CPU ID. + */ +#define HV_FAST_CPU_STATE 0x17 +#define HV_CPU_STATE_STOPPED 0x01 +#define HV_CPU_STATE_RUNNING 0x02 +#define HV_CPU_STATE_ERROR 0x03 + +/* cpu_set_rtba() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_SET_RTBA + * ARG0: RTBA + * RET0: status + * RET1: previous RTBA + * ERRORS: ENORADDR Invalid RTBA real address + * EBADALIGN RTBA is incorrectly aligned for a trap table + * + * Set the real trap base address of the local cpu to the given RTBA. + * The supplied RTBA must be aligned on a 256 byte boundary. Upon + * success the previous value of the RTBA is returned in RET1. + * + * Note: This service does not affect %tba + */ +#define HV_FAST_CPU_SET_RTBA 0x18 + +/* cpu_set_rtba() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CPU_GET_RTBA + * RET0: status + * RET1: previous RTBA + * ERRORS: No possible error. + * + * Returns the current value of RTBA in RET1. + */ +#define HV_FAST_CPU_GET_RTBA 0x19 + +/* MMU services. + * + * Layout of a TSB description for mmu_tsb_ctx{,non}0() calls. + */ +#ifndef __ASSEMBLY__ +struct hv_tsb_descr { + unsigned short pgsz_idx; + unsigned short assoc; + unsigned int num_ttes; /* in TTEs */ + unsigned int ctx_idx; + unsigned int pgsz_mask; + unsigned long tsb_base; + unsigned long resv; +}; +#endif +#define HV_TSB_DESCR_PGSZ_IDX_OFFSET 0x00 +#define HV_TSB_DESCR_ASSOC_OFFSET 0x02 +#define HV_TSB_DESCR_NUM_TTES_OFFSET 0x04 +#define HV_TSB_DESCR_CTX_IDX_OFFSET 0x08 +#define HV_TSB_DESCR_PGSZ_MASK_OFFSET 0x0c +#define HV_TSB_DESCR_TSB_BASE_OFFSET 0x10 +#define HV_TSB_DESCR_RESV_OFFSET 0x18 + +/* Page size bitmask. */ +#define HV_PGSZ_MASK_8K (1 << 0) +#define HV_PGSZ_MASK_64K (1 << 1) +#define HV_PGSZ_MASK_512K (1 << 2) +#define HV_PGSZ_MASK_4MB (1 << 3) +#define HV_PGSZ_MASK_32MB (1 << 4) +#define HV_PGSZ_MASK_256MB (1 << 5) +#define HV_PGSZ_MASK_2GB (1 << 6) +#define HV_PGSZ_MASK_16GB (1 << 7) + +/* Page size index. The value given in the TSB descriptor must correspond + * to the smallest page size specified in the pgsz_mask page size bitmask. + */ +#define HV_PGSZ_IDX_8K 0 +#define HV_PGSZ_IDX_64K 1 +#define HV_PGSZ_IDX_512K 2 +#define HV_PGSZ_IDX_4MB 3 +#define HV_PGSZ_IDX_32MB 4 +#define HV_PGSZ_IDX_256MB 5 +#define HV_PGSZ_IDX_2GB 6 +#define HV_PGSZ_IDX_16GB 7 + +/* MMU fault status area. + * + * MMU related faults have their status and fault address information + * placed into a memory region made available by privileged code. Each + * virtual processor must make a mmu_fault_area_conf() call to tell the + * hypervisor where that processor's fault status should be stored. + * + * The fault status block is a multiple of 64-bytes and must be aligned + * on a 64-byte boundary. + */ +#ifndef __ASSEMBLY__ +struct hv_fault_status { + unsigned long i_fault_type; + unsigned long i_fault_addr; + unsigned long i_fault_ctx; + unsigned long i_reserved[5]; + unsigned long d_fault_type; + unsigned long d_fault_addr; + unsigned long d_fault_ctx; + unsigned long d_reserved[5]; +}; +#endif +#define HV_FAULT_I_TYPE_OFFSET 0x00 +#define HV_FAULT_I_ADDR_OFFSET 0x08 +#define HV_FAULT_I_CTX_OFFSET 0x10 +#define HV_FAULT_D_TYPE_OFFSET 0x40 +#define HV_FAULT_D_ADDR_OFFSET 0x48 +#define HV_FAULT_D_CTX_OFFSET 0x50 + +#define HV_FAULT_TYPE_FAST_MISS 1 +#define HV_FAULT_TYPE_FAST_PROT 2 +#define HV_FAULT_TYPE_MMU_MISS 3 +#define HV_FAULT_TYPE_INV_RA 4 +#define HV_FAULT_TYPE_PRIV_VIOL 5 +#define HV_FAULT_TYPE_PROT_VIOL 6 +#define HV_FAULT_TYPE_NFO 7 +#define HV_FAULT_TYPE_NFO_SEFF 8 +#define HV_FAULT_TYPE_INV_VA 9 +#define HV_FAULT_TYPE_INV_ASI 10 +#define HV_FAULT_TYPE_NC_ATOMIC 11 +#define HV_FAULT_TYPE_PRIV_ACT 12 +#define HV_FAULT_TYPE_RESV1 13 +#define HV_FAULT_TYPE_UNALIGNED 14 +#define HV_FAULT_TYPE_INV_PGSZ 15 +/* Values 16 --> -2 are reserved. */ +#define HV_FAULT_TYPE_MULTIPLE -1 + +/* Flags argument for mmu_{map,unmap}_addr(), mmu_demap_{page,context,all}(), + * and mmu_{map,unmap}_perm_addr(). + */ +#define HV_MMU_DMMU 0x01 +#define HV_MMU_IMMU 0x02 +#define HV_MMU_ALL (HV_MMU_DMMU | HV_MMU_IMMU) + +/* mmu_map_addr() + * TRAP: HV_MMU_MAP_ADDR_TRAP + * ARG0: virtual address + * ARG1: mmu context + * ARG2: TTE + * ARG3: flags (HV_MMU_{IMMU,DMMU}) + * ERRORS: EINVAL Invalid virtual address, mmu context, or flags + * EBADPGSZ Invalid page size value + * ENORADDR Invalid real address in TTE + * + * Create a non-permanent mapping using the given TTE, virtual + * address, and mmu context. The flags argument determines which + * (data, or instruction, or both) TLB the mapping gets loaded into. + * + * The behavior is undefined if the valid bit is clear in the TTE. + * + * Note: This API call is for privileged code to specify temporary translation + * mappings without the need to create and manage a TSB. + */ + +/* mmu_unmap_addr() + * TRAP: HV_MMU_UNMAP_ADDR_TRAP + * ARG0: virtual address + * ARG1: mmu context + * ARG2: flags (HV_MMU_{IMMU,DMMU}) + * ERRORS: EINVAL Invalid virtual address, mmu context, or flags + * + * Demaps the given virtual address in the given mmu context on this + * CPU. This function is intended to be used to demap pages mapped + * with mmu_map_addr. This service is equivalent to invoking + * mmu_demap_page() with only the current CPU in the CPU list. The + * flags argument determines which (data, or instruction, or both) TLB + * the mapping gets unmapped from. + * + * Attempting to perform an unmap operation for a previously defined + * permanent mapping will have undefined results. + */ + +/* mmu_tsb_ctx0() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_TSB_CTX0 + * ARG0: number of TSB descriptions + * ARG1: TSB descriptions pointer + * RET0: status + * ERRORS: ENORADDR Invalid TSB descriptions pointer or + * TSB base within a descriptor + * EBADALIGN TSB descriptions pointer is not aligned + * to an 8-byte boundary, or TSB base + * within a descriptor is not aligned for + * the given TSB size + * EBADPGSZ Invalid page size in a TSB descriptor + * EBADTSB Invalid associativity or size in a TSB + * descriptor + * EINVAL Invalid number of TSB descriptions, or + * invalid context index in a TSB + * descriptor, or index page size not + * equal to smallest page size in page + * size bitmask field. + * + * Configures the TSBs for the current CPU for virtual addresses with + * context zero. The TSB descriptions pointer is a pointer to an + * array of the given number of TSB descriptions. + * + * Note: The maximum number of TSBs available to a virtual CPU is given by the + * mmu-max-#tsbs property of the cpu's corresponding "cpu" node in the + * machine description. + */ +#define HV_FAST_MMU_TSB_CTX0 0x20 + +/* mmu_tsb_ctxnon0() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_TSB_CTXNON0 + * ARG0: number of TSB descriptions + * ARG1: TSB descriptions pointer + * RET0: status + * ERRORS: Same as for mmu_tsb_ctx0() above. + * + * Configures the TSBs for the current CPU for virtual addresses with + * non-zero contexts. The TSB descriptions pointer is a pointer to an + * array of the given number of TSB descriptions. + * + * Note: A maximum of 16 TSBs may be specified in the TSB description list. + */ +#define HV_FAST_MMU_TSB_CTXNON0 0x21 + +/* mmu_demap_page() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_DEMAP_PAGE + * ARG0: reserved, must be zero + * ARG1: reserved, must be zero + * ARG2: virtual address + * ARG3: mmu context + * ARG4: flags (HV_MMU_{IMMU,DMMU}) + * RET0: status + * ERRORS: EINVAL Invalid virutal address, context, or + * flags value + * ENOTSUPPORTED ARG0 or ARG1 is non-zero + * + * Demaps any page mapping of the given virtual address in the given + * mmu context for the current virtual CPU. Any virtually tagged + * caches are guaranteed to be kept consistent. The flags argument + * determines which TLB (instruction, or data, or both) participate in + * the operation. + * + * ARG0 and ARG1 are both reserved and must be set to zero. + */ +#define HV_FAST_MMU_DEMAP_PAGE 0x22 + +/* mmu_demap_ctx() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_DEMAP_CTX + * ARG0: reserved, must be zero + * ARG1: reserved, must be zero + * ARG2: mmu context + * ARG3: flags (HV_MMU_{IMMU,DMMU}) + * RET0: status + * ERRORS: EINVAL Invalid context or flags value + * ENOTSUPPORTED ARG0 or ARG1 is non-zero + * + * Demaps all non-permanent virtual page mappings previously specified + * for the given context for the current virtual CPU. Any virtual + * tagged caches are guaranteed to be kept consistent. The flags + * argument determines which TLB (instruction, or data, or both) + * participate in the operation. + * + * ARG0 and ARG1 are both reserved and must be set to zero. + */ +#define HV_FAST_MMU_DEMAP_CTX 0x23 + +/* mmu_demap_all() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_DEMAP_ALL + * ARG0: reserved, must be zero + * ARG1: reserved, must be zero + * ARG2: flags (HV_MMU_{IMMU,DMMU}) + * RET0: status + * ERRORS: EINVAL Invalid flags value + * ENOTSUPPORTED ARG0 or ARG1 is non-zero + * + * Demaps all non-permanent virtual page mappings previously specified + * for the current virtual CPU. Any virtual tagged caches are + * guaranteed to be kept consistent. The flags argument determines + * which TLB (instruction, or data, or both) participate in the + * operation. + * + * ARG0 and ARG1 are both reserved and must be set to zero. + */ +#define HV_FAST_MMU_DEMAP_ALL 0x24 + +/* mmu_map_perm_addr() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_MAP_PERM_ADDR + * ARG0: virtual address + * ARG1: reserved, must be zero + * ARG2: TTE + * ARG3: flags (HV_MMU_{IMMU,DMMU}) + * RET0: status + * ERRORS: EINVAL Invalid virutal address or flags value + * EBADPGSZ Invalid page size value + * ENORADDR Invalid real address in TTE + * ETOOMANY Too many mappings (max of 8 reached) + * + * Create a permanent mapping using the given TTE and virtual address + * for context 0 on the calling virtual CPU. A maximum of 8 such + * permanent mappings may be specified by privileged code. Mappings + * may be removed with mmu_unmap_perm_addr(). + * + * The behavior is undefined if a TTE with the valid bit clear is given. + * + * Note: This call is used to specify address space mappings for which + * privileged code does not expect to receive misses. For example, + * this mechanism can be used to map kernel nucleus code and data. + */ +#define HV_FAST_MMU_MAP_PERM_ADDR 0x25 + +/* mmu_fault_area_conf() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_FAULT_AREA_CONF + * ARG0: real address + * RET0: status + * RET1: previous mmu fault area real address + * ERRORS: ENORADDR Invalid real address + * EBADALIGN Invalid alignment for fault area + * + * Configure the MMU fault status area for the calling CPU. A 64-byte + * aligned real address specifies where MMU fault status information + * is placed. The return value is the previously specified area, or 0 + * for the first invocation. Specifying a fault area at real address + * 0 is not allowed. + */ +#define HV_FAST_MMU_FAULT_AREA_CONF 0x26 + +/* mmu_enable() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_ENABLE + * ARG0: enable flag + * ARG1: return target address + * RET0: status + * ERRORS: ENORADDR Invalid real address when disabling + * translation. + * EBADALIGN The return target address is not + * aligned to an instruction. + * EINVAL The enable flag request the current + * operating mode (e.g. disable if already + * disabled) + * + * Enable or disable virtual address translation for the calling CPU + * within the virtual machine domain. If the enable flag is zero, + * translation is disabled, any non-zero value will enable + * translation. + * + * When this function returns, the newly selected translation mode + * will be active. If the mmu is being enabled, then the return + * target address is a virtual address else it is a real address. + * + * Upon successful completion, control will be returned to the given + * return target address (ie. the cpu will jump to that address). On + * failure, the previous mmu mode remains and the trap simply returns + * as normal with the appropriate error code in RET0. + */ +#define HV_FAST_MMU_ENABLE 0x27 + +/* mmu_unmap_perm_addr() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_UNMAP_PERM_ADDR + * ARG0: virtual address + * ARG1: reserved, must be zero + * ARG2: flags (HV_MMU_{IMMU,DMMU}) + * RET0: status + * ERRORS: EINVAL Invalid virutal address or flags value + * ENOMAP Specified mapping was not found + * + * Demaps any permanent page mapping (established via + * mmu_map_perm_addr()) at the given virtual address for context 0 on + * the current virtual CPU. Any virtual tagged caches are guaranteed + * to be kept consistent. + */ +#define HV_FAST_MMU_UNMAP_PERM_ADDR 0x28 + +/* mmu_tsb_ctx0_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_TSB_CTX0_INFO + * ARG0: max TSBs + * ARG1: buffer pointer + * RET0: status + * RET1: number of TSBs + * ERRORS: EINVAL Supplied buffer is too small + * EBADALIGN The buffer pointer is badly aligned + * ENORADDR Invalid real address for buffer pointer + * + * Return the TSB configuration as previous defined by mmu_tsb_ctx0() + * into the provided buffer. The size of the buffer is given in ARG1 + * in terms of the number of TSB description entries. + * + * Upon return, RET1 always contains the number of TSB descriptions + * previously configured. If zero TSBs were configured, EOK is + * returned with RET1 containing 0. + */ +#define HV_FAST_MMU_TSB_CTX0_INFO 0x29 + +/* mmu_tsb_ctxnon0_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_TSB_CTXNON0_INFO + * ARG0: max TSBs + * ARG1: buffer pointer + * RET0: status + * RET1: number of TSBs + * ERRORS: EINVAL Supplied buffer is too small + * EBADALIGN The buffer pointer is badly aligned + * ENORADDR Invalid real address for buffer pointer + * + * Return the TSB configuration as previous defined by + * mmu_tsb_ctxnon0() into the provided buffer. The size of the buffer + * is given in ARG1 in terms of the number of TSB description entries. + * + * Upon return, RET1 always contains the number of TSB descriptions + * previously configured. If zero TSBs were configured, EOK is + * returned with RET1 containing 0. + */ +#define HV_FAST_MMU_TSB_CTXNON0_INFO 0x2a + +/* mmu_fault_area_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMU_FAULT_AREA_INFO + * RET0: status + * RET1: fault area real address + * ERRORS: No errors defined. + * + * Return the currently defined MMU fault status area for the current + * CPU. The real address of the fault status area is returned in + * RET1, or 0 is returned in RET1 if no fault status area is defined. + * + * Note: mmu_fault_area_conf() may be called with the return value (RET1) + * from this service if there is a need to save and restore the fault + * area for a cpu. + */ +#define HV_FAST_MMU_FAULT_AREA_INFO 0x2b + +/* Cache and Memory services. */ + +/* mem_scrub() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MEM_SCRUB + * ARG0: real address + * ARG1: length + * RET0: status + * RET1: length scrubbed + * ERRORS: ENORADDR Invalid real address + * EBADALIGN Start address or length are not correctly + * aligned + * EINVAL Length is zero + * + * Zero the memory contents in the range real address to real address + * plus length minus 1. Also, valid ECC will be generated for that + * memory address range. Scrubbing is started at the given real + * address, but may not scrub the entire given length. The actual + * length scrubbed will be returned in RET1. + * + * The real address and length must be aligned on an 8K boundary, or + * contain the start address and length from a sun4v error report. + * + * Note: There are two uses for this function. The first use is to block clear + * and initialize memory and the second is to scrub an u ncorrectable + * error reported via a resumable or non-resumable trap. The second + * use requires the arguments to be equal to the real address and length + * provided in a sun4v memory error report. + */ +#define HV_FAST_MEM_SCRUB 0x31 + +/* mem_sync() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MEM_SYNC + * ARG0: real address + * ARG1: length + * RET0: status + * RET1: length synced + * ERRORS: ENORADDR Invalid real address + * EBADALIGN Start address or length are not correctly + * aligned + * EINVAL Length is zero + * + * Force the next access within the real address to real address plus + * length minus 1 to be fetches from main system memory. Less than + * the given length may be synced, the actual amount synced is + * returned in RET1. The real address and length must be aligned on + * an 8K boundary. + */ +#define HV_FAST_MEM_SYNC 0x32 + +/* Time of day services. + * + * The hypervisor maintains the time of day on a per-domain basis. + * Changing the time of day in one domain does not affect the time of + * day on any other domain. + * + * Time is described by a single unsigned 64-bit word which is the + * number of seconds since the UNIX Epoch (00:00:00 UTC, January 1, + * 1970). + */ + +/* tod_get() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TOD_GET + * RET0: status + * RET1: TOD + * ERRORS: EWOULDBLOCK TOD resource is temporarily unavailable + * ENOTSUPPORTED If TOD not supported on this platform + * + * Return the current time of day. May block if TOD access is + * temporarily not possible. + */ +#define HV_FAST_TOD_GET 0x50 + +/* tod_set() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TOD_SET + * ARG0: TOD + * RET0: status + * ERRORS: EWOULDBLOCK TOD resource is temporarily unavailable + * ENOTSUPPORTED If TOD not supported on this platform + * + * The current time of day is set to the value specified in ARG0. May + * block if TOD access is temporarily not possible. + */ +#define HV_FAST_TOD_SET 0x51 + +/* Console services */ + +/* con_getchar() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CONS_GETCHAR + * RET0: status + * RET1: character + * ERRORS: EWOULDBLOCK No character available. + * + * Returns a character from the console device. If no character is + * available then an EWOULDBLOCK error is returned. If a character is + * available, then the returned status is EOK and the character value + * is in RET1. + * + * A virtual BREAK is represented by the 64-bit value -1. + * + * A virtual HUP signal is represented by the 64-bit value -2. + */ +#define HV_FAST_CONS_GETCHAR 0x60 + +/* con_putchar() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_CONS_PUTCHAR + * ARG0: character + * RET0: status + * ERRORS: EINVAL Illegal character + * EWOULDBLOCK Output buffer currentl full, would block + * + * Send a character to the console device. Only character values + * between 0 and 255 may be used. Values outside this range are + * invalid except for the 64-bit value -1 which is used to send a + * virtual BREAK. + */ +#define HV_FAST_CONS_PUTCHAR 0x61 + +/* Trap trace services. + * + * The hypervisor provides a trap tracing capability for privileged + * code running on each virtual CPU. Privileged code provides a + * round-robin trap trace queue within which the hypervisor writes + * 64-byte entries detailing hyperprivileged traps taken n behalf of + * privileged code. This is provided as a debugging capability for + * privileged code. + * + * The trap trace control structure is 64-bytes long and placed at the + * start (offset 0) of the trap trace buffer, and is described as + * follows: + */ +#ifndef __ASSEMBLY__ +struct hv_trap_trace_control { + unsigned long head_offset; + unsigned long tail_offset; + unsigned long __reserved[0x30 / sizeof(unsigned long)]; +}; +#endif +#define HV_TRAP_TRACE_CTRL_HEAD_OFFSET 0x00 +#define HV_TRAP_TRACE_CTRL_TAIL_OFFSET 0x08 + +/* The head offset is the offset of the most recently completed entry + * in the trap-trace buffer. The tail offset is the offset of the + * next entry to be written. The control structure is owned and + * modified by the hypervisor. A guest may not modify the control + * structure contents. Attempts to do so will result in undefined + * behavior for the guest. + * + * Each trap trace buffer entry is layed out as follows: + */ +#ifndef __ASSEMBLY__ +struct hv_trap_trace_entry { + unsigned char type; /* Hypervisor or guest entry? */ + unsigned char hpstate; /* Hyper-privileged state */ + unsigned char tl; /* Trap level */ + unsigned char gl; /* Global register level */ + unsigned short tt; /* Trap type */ + unsigned short tag; /* Extended trap identifier */ + unsigned long tstate; /* Trap state */ + unsigned long tick; /* Tick */ + unsigned long tpc; /* Trap PC */ + unsigned long f1; /* Entry specific */ + unsigned long f2; /* Entry specific */ + unsigned long f3; /* Entry specific */ + unsigned long f4; /* Entry specific */ +}; +#endif +#define HV_TRAP_TRACE_ENTRY_TYPE 0x00 +#define HV_TRAP_TRACE_ENTRY_HPSTATE 0x01 +#define HV_TRAP_TRACE_ENTRY_TL 0x02 +#define HV_TRAP_TRACE_ENTRY_GL 0x03 +#define HV_TRAP_TRACE_ENTRY_TT 0x04 +#define HV_TRAP_TRACE_ENTRY_TAG 0x06 +#define HV_TRAP_TRACE_ENTRY_TSTATE 0x08 +#define HV_TRAP_TRACE_ENTRY_TICK 0x10 +#define HV_TRAP_TRACE_ENTRY_TPC 0x18 +#define HV_TRAP_TRACE_ENTRY_F1 0x20 +#define HV_TRAP_TRACE_ENTRY_F2 0x28 +#define HV_TRAP_TRACE_ENTRY_F3 0x30 +#define HV_TRAP_TRACE_ENTRY_F4 0x38 + +/* The type field is encoded as follows. */ +#define HV_TRAP_TYPE_UNDEF 0x00 /* Entry content undefined */ +#define HV_TRAP_TYPE_HV 0x01 /* Hypervisor trap entry */ +#define HV_TRAP_TYPE_GUEST 0xff /* Added via ttrace_addentry() */ + +/* ttrace_buf_conf() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TTRACE_BUF_CONF + * ARG0: real address + * ARG1: number of entries + * RET0: status + * RET1: number of entries + * ERRORS: ENORADDR Invalid real address + * EINVAL Size is too small + * EBADALIGN Real address not aligned on 64-byte boundary + * + * Requests hypervisor trap tracing and declares a virtual CPU's trap + * trace buffer to the hypervisor. The real address supplies the real + * base address of the trap trace queue and must be 64-byte aligned. + * Specifying a value of 0 for the number of entries disables trap + * tracing for the calling virtual CPU. The buffer allocated must be + * sized for a power of two number of 64-byte trap trace entries plus + * an initial 64-byte control structure. + * + * This may be invoked any number of times so that a virtual CPU may + * relocate a trap trace buffer or create "snapshots" of information. + * + * If the real address is illegal or badly aligned, then trap tracing + * is disabled and an error is returned. + * + * Upon failure with EINVAL, this service call returns in RET1 the + * minimum number of buffer entries required. Upon other failures + * RET1 is undefined. + */ +#define HV_FAST_TTRACE_BUF_CONF 0x90 + +/* ttrace_buf_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TTRACE_BUF_INFO + * RET0: status + * RET1: real address + * RET2: size + * ERRORS: None defined. + * + * Returns the size and location of the previously declared trap-trace + * buffer. In the event that no buffer was previously defined, or the + * buffer is disabled, this call will return a size of zero bytes. + */ +#define HV_FAST_TTRACE_BUF_INFO 0x91 + +/* ttrace_enable() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TTRACE_ENABLE + * ARG0: enable + * RET0: status + * RET1: previous enable state + * ERRORS: EINVAL No trap trace buffer currently defined + * + * Enable or disable trap tracing, and return the previous enabled + * state in RET1. Future systems may define various flags for the + * enable argument (ARG0), for the moment a guest should pass + * "(uint64_t) -1" to enable, and "(uint64_t) 0" to disable all + * tracing - which will ensure future compatability. + */ +#define HV_FAST_TTRACE_ENABLE 0x92 + +/* ttrace_freeze() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_TTRACE_FREEZE + * ARG0: freeze + * RET0: status + * RET1: previous freeze state + * ERRORS: EINVAL No trap trace buffer currently defined + * + * Freeze or unfreeze trap tracing, returning the previous freeze + * state in RET1. A guest should pass a non-zero value to freeze and + * a zero value to unfreeze all tracing. The returned previous state + * is 0 for not frozen and 1 for frozen. + */ +#define HV_FAST_TTRACE_FREEZE 0x93 + +/* ttrace_addentry() + * TRAP: HV_TTRACE_ADDENTRY_TRAP + * ARG0: tag (16-bits) + * ARG1: data word 0 + * ARG2: data word 1 + * ARG3: data word 2 + * ARG4: data word 3 + * RET0: status + * ERRORS: EINVAL No trap trace buffer currently defined + * + * Add an entry to the trap trace buffer. Upon return only ARG0/RET0 + * is modified - none of the other registers holding arguments are + * volatile across this hypervisor service. + */ + +/* Core dump services. + * + * Since the hypervisor viraulizes and thus obscures a lot of the + * physical machine layout and state, traditional OS crash dumps can + * be difficult to diagnose especially when the problem is a + * configuration error of some sort. + * + * The dump services provide an opaque buffer into which the + * hypervisor can place it's internal state in order to assist in + * debugging such situations. The contents are opaque and extremely + * platform and hypervisor implementation specific. The guest, during + * a core dump, requests that the hypervisor update any information in + * the dump buffer in preparation to being dumped as part of the + * domain's memory image. + */ + +/* dump_buf_update() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_DUMP_BUF_UPDATE + * ARG0: real address + * ARG1: size + * RET0: status + * RET1: required size of dump buffer + * ERRORS: ENORADDR Invalid real address + * EBADALIGN Real address is not aligned on a 64-byte + * boundary + * EINVAL Size is non-zero but less than minimum size + * required + * ENOTSUPPORTED Operation not supported on current logical + * domain + * + * Declare a domain dump buffer to the hypervisor. The real address + * provided for the domain dump buffer must be 64-byte aligned. The + * size specifies the size of the dump buffer and may be larger than + * the minimum size specified in the machine description. The + * hypervisor will fill the dump buffer with opaque data. + * + * Note: A guest may elect to include dump buffer contents as part of a crash + * dump to assist with debugging. This function may be called any number + * of times so that a guest may relocate a dump buffer, or create + * "snapshots" of any dump-buffer information. Each call to + * dump_buf_update() atomically declares the new dump buffer to the + * hypervisor. + * + * A specified size of 0 unconfigures the dump buffer. If the real + * address is illegal or badly aligned, then any currently active dump + * buffer is disabled and an error is returned. + * + * In the event that the call fails with EINVAL, RET1 contains the + * minimum size requires by the hypervisor for a valid dump buffer. + */ +#define HV_FAST_DUMP_BUF_UPDATE 0x94 + +/* dump_buf_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_DUMP_BUF_INFO + * RET0: status + * RET1: real address of current dump buffer + * RET2: size of current dump buffer + * ERRORS: No errors defined. + * + * Return the currently configures dump buffer description. A + * returned size of 0 bytes indicates an undefined dump buffer. In + * this case the return address in RET1 is undefined. + */ +#define HV_FAST_DUMP_BUF_INFO 0x95 + +/* Device interrupt services. + * + * Device interrupts are allocated to system bus bridges by the hypervisor, + * and described to OBP in the machine description. OBP then describes + * these interrupts to the OS via properties in the device tree. + * + * Terminology: + * + * cpuid Unique opaque value which represents a target cpu. + * + * devhandle Device handle. It uniquely identifies a device, and + * consistes of the lower 28-bits of the hi-cell of the + * first entry of the device's "reg" property in the + * OBP device tree. + * + * devino Device interrupt number. Specifies the relative + * interrupt number within the device. The unique + * combination of devhandle and devino are used to + * identify a specific device interrupt. + * + * Note: The devino value is the same as the values in the + * "interrupts" property or "interrupt-map" property + * in the OBP device tree for that device. + * + * sysino System interrupt number. A 64-bit unsigned interger + * representing a unique interrupt within a virtual + * machine. + * + * intr_state A flag representing the interrupt state for a given + * sysino. The state values are defined below. + * + * intr_enabled A flag representing the 'enabled' state for a given + * sysino. The enable values are defined below. + */ + +#define HV_INTR_STATE_IDLE 0 /* Nothing pending */ +#define HV_INTR_STATE_RECEIVED 1 /* Interrupt received by hardware */ +#define HV_INTR_STATE_DELIVERED 2 /* Interrupt delivered to queue */ + +#define HV_INTR_DISABLED 0 /* sysino not enabled */ +#define HV_INTR_ENABLED 1 /* sysino enabled */ + +/* intr_devino_to_sysino() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_DEVINO2SYSINO + * ARG0: devhandle + * ARG1: devino + * RET0: status + * RET1: sysino + * ERRORS: EINVAL Invalid devhandle/devino + * + * Converts a device specific interrupt number of the given + * devhandle/devino into a system specific ino (sysino). + */ +#define HV_FAST_INTR_DEVINO2SYSINO 0xa0 + +/* intr_getenabled() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_GETENABLED + * ARG0: sysino + * RET0: status + * RET1: intr_enabled (HV_INTR_{DISABLED,ENABLED}) + * ERRORS: EINVAL Invalid sysino + * + * Returns interrupt enabled state in RET1 for the interrupt defined + * by the given sysino. + */ +#define HV_FAST_INTR_GETENABLED 0xa1 + +/* intr_setenabled() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_SETENABLED + * ARG0: sysino + * ARG1: intr_enabled (HV_INTR_{DISABLED,ENABLED}) + * RET0: status + * ERRORS: EINVAL Invalid sysino or intr_enabled value + * + * Set the 'enabled' state of the interrupt sysino. + */ +#define HV_FAST_INTR_SETENABLED 0xa2 + +/* intr_getstate() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_GETSTATE + * ARG0: sysino + * RET0: status + * RET1: intr_state (HV_INTR_STATE_*) + * ERRORS: EINVAL Invalid sysino + * + * Returns current state of the interrupt defined by the given sysino. + */ +#define HV_FAST_INTR_GETSTATE 0xa3 + +/* intr_setstate() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_SETSTATE + * ARG0: sysino + * ARG1: intr_state (HV_INTR_STATE_*) + * RET0: status + * ERRORS: EINVAL Invalid sysino or intr_state value + * + * Sets the current state of the interrupt described by the given sysino + * value. + * + * Note: Setting the state to HV_INTR_STATE_IDLE clears any pending + * interrupt for sysino. + */ +#define HV_FAST_INTR_SETSTATE 0xa4 + +/* intr_gettarget() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_GETTARGET + * ARG0: sysino + * RET0: status + * RET1: cpuid + * ERRORS: EINVAL Invalid sysino + * + * Returns CPU that is the current target of the interrupt defined by + * the given sysino. The CPU value returned is undefined if the target + * has not been set via intr_settarget(). + */ +#define HV_FAST_INTR_GETTARGET 0xa5 + +/* intr_settarget() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_INTR_SETTARGET + * ARG0: sysino + * ARG1: cpuid + * RET0: status + * ERRORS: EINVAL Invalid sysino + * ENOCPU Invalid cpuid + * + * Set the target CPU for the interrupt defined by the given sysino. + */ +#define HV_FAST_INTR_SETTARGET 0xa6 + +/* PCI IO services. + * + * See the terminology descriptions in the device interrupt services + * section above as those apply here too. Here are terminology + * definitions specific to these PCI IO services: + * + * tsbnum TSB number. Indentifies which io-tsb is used. + * For this version of the specification, tsbnum + * must be zero. + * + * tsbindex TSB index. Identifies which entry in the TSB + * is used. The first entry is zero. + * + * tsbid A 64-bit aligned data structure which contains + * a tsbnum and a tsbindex. Bits 63:32 contain the + * tsbnum and bits 31:00 contain the tsbindex. + * + * io_attributes IO attributes for IOMMU mappings. One of more + * of the attritbute bits are stores in a 64-bit + * value. The values are defined below. + * + * r_addr 64-bit real address + * + * pci_device PCI device address. A PCI device address identifies + * a specific device on a specific PCI bus segment. + * A PCI device address ia a 32-bit unsigned integer + * with the following format: + * + * 00000000.bbbbbbbb.dddddfff.00000000 + * + * Use the HV_PCI_DEVICE_BUILD() macro to construct + * such values. + * + * pci_config_offset + * PCI configureation space offset. For conventional + * PCI a value between 0 and 255. For extended + * configuration space, a value between 0 and 4095. + * + * Note: For PCI configuration space accesses, the offset + * must be aligned to the access size. + * + * error_flag A return value which specifies if the action succeeded + * or failed. 0 means no error, non-0 means some error + * occurred while performing the service. + * + * io_sync_direction + * Direction definition for pci_dma_sync(), defined + * below in HV_PCI_SYNC_*. + * + * io_page_list A list of io_page_addresses, an io_page_address is + * a real address. + * + * io_page_list_p A pointer to an io_page_list. + * + * "size based byte swap" - Some functions do size based byte swapping + * which allows sw to access pointers and + * counters in native form when the processor + * operates in a different endianness than the + * IO bus. Size-based byte swapping converts a + * multi-byte field between big-endian and + * little-endian format. + */ + +#define HV_PCI_MAP_ATTR_READ 0x01 +#define HV_PCI_MAP_ATTR_WRITE 0x02 + +#define HV_PCI_DEVICE_BUILD(b,d,f) \ + ((((b) & 0xff) << 16) | \ + (((d) & 0x1f) << 11) | \ + (((f) & 0x07) << 8)) + +#define HV_PCI_SYNC_FOR_DEVICE 0x01 +#define HV_PCI_SYNC_FOR_CPU 0x02 + +/* pci_iommu_map() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_IOMMU_MAP + * ARG0: devhandle + * ARG1: tsbid + * ARG2: #ttes + * ARG3: io_attributes + * ARG4: io_page_list_p + * RET0: status + * RET1: #ttes mapped + * ERRORS: EINVAL Invalid devhandle/tsbnum/tsbindex/io_attributes + * EBADALIGN Improperly aligned real address + * ENORADDR Invalid real address + * + * Create IOMMU mappings in the sun4v device defined by the given + * devhandle. The mappings are created in the TSB defined by the + * tsbnum component of the given tsbid. The first mapping is created + * in the TSB i ndex defined by the tsbindex component of the given tsbid. + * The call creates up to #ttes mappings, the first one at tsbnum, tsbindex, + * the second at tsbnum, tsbindex + 1, etc. + * + * All mappings are created with the attributes defined by the io_attributes + * argument. The page mapping addresses are described in the io_page_list + * defined by the given io_page_list_p, which is a pointer to the io_page_list. + * The first entry in the io_page_list is the address for the first iotte, the + * 2nd for the 2nd iotte, and so on. + * + * Each io_page_address in the io_page_list must be appropriately aligned. + * #ttes must be greater than zero. For this version of the spec, the tsbnum + * component of the given tsbid must be zero. + * + * Returns the actual number of mappings creates, which may be less than + * or equal to the argument #ttes. If the function returns a value which + * is less than the #ttes, the caller may continus to call the function with + * an updated tsbid, #ttes, io_page_list_p arguments until all pages are + * mapped. + * + * Note: This function does not imply an iotte cache flush. The guest must + * demap an entry before re-mapping it. + */ +#define HV_FAST_PCI_IOMMU_MAP 0xb0 + +/* pci_iommu_demap() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_IOMMU_DEMAP + * ARG0: devhandle + * ARG1: tsbid + * ARG2: #ttes + * RET0: status + * RET1: #ttes demapped + * ERRORS: EINVAL Invalid devhandle/tsbnum/tsbindex + * + * Demap and flush IOMMU mappings in the device defined by the given + * devhandle. Demaps up to #ttes entries in the TSB defined by the tsbnum + * component of the given tsbid, starting at the TSB index defined by the + * tsbindex component of the given tsbid. + * + * For this version of the spec, the tsbnum of the given tsbid must be zero. + * #ttes must be greater than zero. + * + * Returns the actual number of ttes demapped, which may be less than or equal + * to the argument #ttes. If #ttes demapped is less than #ttes, the caller + * may continue to call this function with updated tsbid and #ttes arguments + * until all pages are demapped. + * + * Note: Entries do not have to be mapped to be demapped. A demap of an + * unmapped page will flush the entry from the tte cache. + */ +#define HV_FAST_PCI_IOMMU_DEMAP 0xb1 + +/* pci_iommu_getmap() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_IOMMU_GETMAP + * ARG0: devhandle + * ARG1: tsbid + * RET0: status + * RET1: io_attributes + * RET2: real address + * ERRORS: EINVAL Invalid devhandle/tsbnum/tsbindex + * ENOMAP Mapping is not valid, no translation exists + * + * Read and return the mapping in the device described by the given devhandle + * and tsbid. If successful, the io_attributes shall be returned in RET1 + * and the page address of the mapping shall be returned in RET2. + * + * For this version of the spec, the tsbnum component of the given tsbid + * must be zero. + */ +#define HV_FAST_PCI_IOMMU_GETMAP 0xb2 + +/* pci_iommu_getbypass() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_IOMMU_GETBYPASS + * ARG0: devhandle + * ARG1: real address + * ARG2: io_attributes + * RET0: status + * RET1: io_addr + * ERRORS: EINVAL Invalid devhandle/io_attributes + * ENORADDR Invalid real address + * ENOTSUPPORTED Function not supported in this implementation. + * + * Create a "special" mapping in the device described by the given devhandle, + * for the given real address and attributes. Return the IO address in RET1 + * if successful. + */ +#define HV_FAST_PCI_IOMMU_GETBYPASS 0xb3 + +/* pci_config_get() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_CONFIG_GET + * ARG0: devhandle + * ARG1: pci_device + * ARG2: pci_config_offset + * ARG3: size + * RET0: status + * RET1: error_flag + * RET2: data + * ERRORS: EINVAL Invalid devhandle/pci_device/offset/size + * EBADALIGN pci_config_offset not size aligned + * ENOACCESS Access to this offset is not permitted + * + * Read PCI configuration space for the adapter described by the given + * devhandle. Read size (1, 2, or 4) bytes of data from the given + * pci_device, at pci_config_offset from the beginning of the device's + * configuration space. If there was no error, RET1 is set to zero and + * RET2 is set to the data read. Insignificant bits in RET2 are not + * guarenteed to have any specific value and therefore must be ignored. + * + * The data returned in RET2 is size based byte swapped. + * + * If an error occurs during the read, set RET1 to a non-zero value. The + * given pci_config_offset must be 'size' aligned. + */ +#define HV_FAST_PCI_CONFIG_GET 0xb4 + +/* pci_config_put() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_CONFIG_PUT + * ARG0: devhandle + * ARG1: pci_device + * ARG2: pci_config_offset + * ARG3: size + * ARG4: data + * RET0: status + * RET1: error_flag + * ERRORS: EINVAL Invalid devhandle/pci_device/offset/size + * EBADALIGN pci_config_offset not size aligned + * ENOACCESS Access to this offset is not permitted + * + * Write PCI configuration space for the adapter described by the given + * devhandle. Write size (1, 2, or 4) bytes of data in a single operation, + * at pci_config_offset from the beginning of the device's configuration + * space. The data argument contains the data to be written to configuration + * space. Prior to writing, the data is size based byte swapped. + * + * If an error occurs during the write access, do not generate an error + * report, do set RET1 to a non-zero value. Otherwise RET1 is zero. + * The given pci_config_offset must be 'size' aligned. + * + * This function is permitted to read from offset zero in the configuration + * space described by the given pci_device if necessary to ensure that the + * write access to config space completes. + */ +#define HV_FAST_PCI_CONFIG_PUT 0xb5 + +/* pci_peek() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_PEEK + * ARG0: devhandle + * ARG1: real address + * ARG2: size + * RET0: status + * RET1: error_flag + * RET2: data + * ERRORS: EINVAL Invalid devhandle or size + * EBADALIGN Improperly aligned real address + * ENORADDR Bad real address + * ENOACCESS Guest access prohibited + * + * Attempt to read the IO address given by the given devhandle, real address, + * and size. Size must be 1, 2, 4, or 8. The read is performed as a single + * access operation using the given size. If an error occurs when reading + * from the given location, do not generate an error report, but return a + * non-zero value in RET1. If the read was successful, return zero in RET1 + * and return the actual data read in RET2. The data returned is size based + * byte swapped. + * + * Non-significant bits in RET2 are not guarenteed to have any specific value + * and therefore must be ignored. If RET1 is returned as non-zero, the data + * value is not guarenteed to have any specific value and should be ignored. + * + * The caller must have permission to read from the given devhandle, real + * address, which must be an IO address. The argument real address must be a + * size aligned address. + * + * The hypervisor implementation of this function must block access to any + * IO address that the guest does not have explicit permission to access. + */ +#define HV_FAST_PCI_PEEK 0xb6 + +/* pci_poke() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_POKE + * ARG0: devhandle + * ARG1: real address + * ARG2: size + * ARG3: data + * ARG4: pci_device + * RET0: status + * RET1: error_flag + * ERRORS: EINVAL Invalid devhandle, size, or pci_device + * EBADALIGN Improperly aligned real address + * ENORADDR Bad real address + * ENOACCESS Guest access prohibited + * ENOTSUPPORTED Function is not supported by implementation + * + * Attempt to write data to the IO address given by the given devhandle, + * real address, and size. Size must be 1, 2, 4, or 8. The write is + * performed as a single access operation using the given size. Prior to + * writing the data is size based swapped. + * + * If an error occurs when writing to the given location, do not generate an + * error report, but return a non-zero value in RET1. If the write was + * successful, return zero in RET1. + * + * pci_device describes the configuration address of the device being + * written to. The implementation may safely read from offset 0 with + * the configuration space of the device described by devhandle and + * pci_device in order to guarantee that the write portion of the operation + * completes + * + * Any error that occurs due to the read shall be reported using the normal + * error reporting mechanisms .. the read error is not suppressed. + * + * The caller must have permission to write to the given devhandle, real + * address, which must be an IO address. The argument real address must be a + * size aligned address. The caller must have permission to read from + * the given devhandle, pci_device cofiguration space offset 0. + * + * The hypervisor implementation of this function must block access to any + * IO address that the guest does not have explicit permission to access. + */ +#define HV_FAST_PCI_POKE 0xb7 + +/* pci_dma_sync() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_DMA_SYNC + * ARG0: devhandle + * ARG1: real address + * ARG2: size + * ARG3: io_sync_direction + * RET0: status + * RET1: #synced + * ERRORS: EINVAL Invalid devhandle or io_sync_direction + * ENORADDR Bad real address + * + * Synchronize a memory region described by the given real address and size, + * for the device defined by the given devhandle using the direction(s) + * defined by the given io_sync_direction. The argument size is the size of + * the memory region in bytes. + * + * Return the actual number of bytes synchronized in the return value #synced, + * which may be less than or equal to the argument size. If the return + * value #synced is less than size, the caller must continue to call this + * function with updated real address and size arguments until the entire + * memory region is synchronized. + */ +#define HV_FAST_PCI_DMA_SYNC 0xb8 + +/* PCI MSI services. */ + +#define HV_MSITYPE_MSI32 0x00 +#define HV_MSITYPE_MSI64 0x01 + +#define HV_MSIQSTATE_IDLE 0x00 +#define HV_MSIQSTATE_ERROR 0x01 + +#define HV_MSIQ_INVALID 0x00 +#define HV_MSIQ_VALID 0x01 + +#define HV_MSISTATE_IDLE 0x00 +#define HV_MSISTATE_DELIVERED 0x01 + +#define HV_MSIVALID_INVALID 0x00 +#define HV_MSIVALID_VALID 0x01 + +#define HV_PCIE_MSGTYPE_PME_MSG 0x18 +#define HV_PCIE_MSGTYPE_PME_ACK_MSG 0x1b +#define HV_PCIE_MSGTYPE_CORR_MSG 0x30 +#define HV_PCIE_MSGTYPE_NONFATAL_MSG 0x31 +#define HV_PCIE_MSGTYPE_FATAL_MSG 0x33 + +#define HV_MSG_INVALID 0x00 +#define HV_MSG_VALID 0x01 + +/* pci_msiq_conf() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_CONF + * ARG0: devhandle + * ARG1: msiqid + * ARG2: real address + * ARG3: number of entries + * RET0: status + * ERRORS: EINVAL Invalid devhandle, msiqid or nentries + * EBADALIGN Improperly aligned real address + * ENORADDR Bad real address + * + * Configure the MSI queue given by the devhandle and msiqid arguments, + * and to be placed at the given real address and be of the given + * number of entries. The real address must be aligned exactly to match + * the queue size. Each queue entry is 64-bytes long, so f.e. a 32 entry + * queue must be aligned on a 2048 byte real address boundary. The MSI-EQ + * Head and Tail are initialized so that the MSI-EQ is 'empty'. + * + * Implementation Note: Certain implementations have fixed sized queues. In + * that case, number of entries must contain the correct + * value. + */ +#define HV_FAST_PCI_MSIQ_CONF 0xc0 + +/* pci_msiq_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_INFO + * ARG0: devhandle + * ARG1: msiqid + * RET0: status + * RET1: real address + * RET2: number of entries + * ERRORS: EINVAL Invalid devhandle or msiqid + * + * Return the configuration information for the MSI queue described + * by the given devhandle and msiqid. The base address of the queue + * is returned in ARG1 and the number of entries is returned in ARG2. + * If the queue is unconfigured, the real address is undefined and the + * number of entries will be returned as zero. + */ +#define HV_FAST_PCI_MSIQ_INFO 0xc1 + +/* pci_msiq_getvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_GETVALID + * ARG0: devhandle + * ARG1: msiqid + * RET0: status + * RET1: msiqvalid (HV_MSIQ_VALID or HV_MSIQ_INVALID) + * ERRORS: EINVAL Invalid devhandle or msiqid + * + * Get the valid state of the MSI-EQ described by the given devhandle and + * msiqid. + */ +#define HV_FAST_PCI_MSIQ_GETVALID 0xc2 + +/* pci_msiq_setvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_SETVALID + * ARG0: devhandle + * ARG1: msiqid + * ARG2: msiqvalid (HV_MSIQ_VALID or HV_MSIQ_INVALID) + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msiqid or msiqvalid + * value or MSI EQ is uninitialized + * + * Set the valid state of the MSI-EQ described by the given devhandle and + * msiqid to the given msiqvalid. + */ +#define HV_FAST_PCI_MSIQ_SETVALID 0xc3 + +/* pci_msiq_getstate() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_GETSTATE + * ARG0: devhandle + * ARG1: msiqid + * RET0: status + * RET1: msiqstate (HV_MSIQSTATE_IDLE or HV_MSIQSTATE_ERROR) + * ERRORS: EINVAL Invalid devhandle or msiqid + * + * Get the state of the MSI-EQ described by the given devhandle and + * msiqid. + */ +#define HV_FAST_PCI_MSIQ_GETSTATE 0xc4 + +/* pci_msiq_getvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_GETVALID + * ARG0: devhandle + * ARG1: msiqid + * ARG2: msiqstate (HV_MSIQSTATE_IDLE or HV_MSIQSTATE_ERROR) + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msiqid or msiqstate + * value or MSI EQ is uninitialized + * + * Set the state of the MSI-EQ described by the given devhandle and + * msiqid to the given msiqvalid. + */ +#define HV_FAST_PCI_MSIQ_SETSTATE 0xc5 + +/* pci_msiq_gethead() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_GETHEAD + * ARG0: devhandle + * ARG1: msiqid + * RET0: status + * RET1: msiqhead + * ERRORS: EINVAL Invalid devhandle or msiqid + * + * Get the current MSI EQ queue head for the MSI-EQ described by the + * given devhandle and msiqid. + */ +#define HV_FAST_PCI_MSIQ_GETHEAD 0xc6 + +/* pci_msiq_sethead() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_SETHEAD + * ARG0: devhandle + * ARG1: msiqid + * ARG2: msiqhead + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msiqid or msiqhead, + * or MSI EQ is uninitialized + * + * Set the current MSI EQ queue head for the MSI-EQ described by the + * given devhandle and msiqid. + */ +#define HV_FAST_PCI_MSIQ_SETHEAD 0xc7 + +/* pci_msiq_gettail() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSIQ_GETTAIL + * ARG0: devhandle + * ARG1: msiqid + * RET0: status + * RET1: msiqtail + * ERRORS: EINVAL Invalid devhandle or msiqid + * + * Get the current MSI EQ queue tail for the MSI-EQ described by the + * given devhandle and msiqid. + */ +#define HV_FAST_PCI_MSIQ_GETTAIL 0xc8 + +/* pci_msi_getvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_GETVALID + * ARG0: devhandle + * ARG1: msinum + * RET0: status + * RET1: msivalidstate + * ERRORS: EINVAL Invalid devhandle or msinum + * + * Get the current valid/enabled state for the MSI defined by the + * given devhandle and msinum. + */ +#define HV_FAST_PCI_MSI_GETVALID 0xc9 + +/* pci_msi_setvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_SETVALID + * ARG0: devhandle + * ARG1: msinum + * ARG2: msivalidstate + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msinum or msivalidstate + * + * Set the current valid/enabled state for the MSI defined by the + * given devhandle and msinum. + */ +#define HV_FAST_PCI_MSI_SETVALID 0xca + +/* pci_msi_getmsiq() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_GETMSIQ + * ARG0: devhandle + * ARG1: msinum + * RET0: status + * RET1: msiqid + * ERRORS: EINVAL Invalid devhandle or msinum or MSI is unbound + * + * Get the MSI EQ that the MSI defined by the given devhandle and + * msinum is bound to. + */ +#define HV_FAST_PCI_MSI_GETMSIQ 0xcb + +/* pci_msi_setmsiq() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_SETMSIQ + * ARG0: devhandle + * ARG1: msinum + * ARG2: msitype + * ARG3: msiqid + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msinum or msiqid + * + * Set the MSI EQ that the MSI defined by the given devhandle and + * msinum is bound to. + */ +#define HV_FAST_PCI_MSI_SETMSIQ 0xcc + +/* pci_msi_getstate() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_GETSTATE + * ARG0: devhandle + * ARG1: msinum + * RET0: status + * RET1: msistate + * ERRORS: EINVAL Invalid devhandle or msinum + * + * Get the state of the MSI defined by the given devhandle and msinum. + * If not initialized, return HV_MSISTATE_IDLE. + */ +#define HV_FAST_PCI_MSI_GETSTATE 0xcd + +/* pci_msi_setstate() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSI_SETSTATE + * ARG0: devhandle + * ARG1: msinum + * ARG2: msistate + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msinum or msistate + * + * Set the state of the MSI defined by the given devhandle and msinum. + */ +#define HV_FAST_PCI_MSI_SETSTATE 0xce + +/* pci_msg_getmsiq() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSG_GETMSIQ + * ARG0: devhandle + * ARG1: msgtype + * RET0: status + * RET1: msiqid + * ERRORS: EINVAL Invalid devhandle or msgtype + * + * Get the MSI EQ of the MSG defined by the given devhandle and msgtype. + */ +#define HV_FAST_PCI_MSG_GETMSIQ 0xd0 + +/* pci_msg_setmsiq() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSG_SETMSIQ + * ARG0: devhandle + * ARG1: msgtype + * ARG2: msiqid + * RET0: status + * ERRORS: EINVAL Invalid devhandle, msgtype, or msiqid + * + * Set the MSI EQ of the MSG defined by the given devhandle and msgtype. + */ +#define HV_FAST_PCI_MSG_SETMSIQ 0xd1 + +/* pci_msg_getvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSG_GETVALID + * ARG0: devhandle + * ARG1: msgtype + * RET0: status + * RET1: msgvalidstate + * ERRORS: EINVAL Invalid devhandle or msgtype + * + * Get the valid/enabled state of the MSG defined by the given + * devhandle and msgtype. + */ +#define HV_FAST_PCI_MSG_GETVALID 0xd2 + +/* pci_msg_setvalid() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_PCI_MSG_SETVALID + * ARG0: devhandle + * ARG1: msgtype + * ARG2: msgvalidstate + * RET0: status + * ERRORS: EINVAL Invalid devhandle or msgtype or msgvalidstate + * + * Set the valid/enabled state of the MSG defined by the given + * devhandle and msgtype. + */ +#define HV_FAST_PCI_MSG_SETVALID 0xd3 + +/* Performance counter services. */ + +#define HV_PERF_JBUS_PERF_CTRL_REG 0x00 +#define HV_PERF_JBUS_PERF_CNT_REG 0x01 +#define HV_PERF_DRAM_PERF_CTRL_REG_0 0x02 +#define HV_PERF_DRAM_PERF_CNT_REG_0 0x03 +#define HV_PERF_DRAM_PERF_CTRL_REG_1 0x04 +#define HV_PERF_DRAM_PERF_CNT_REG_1 0x05 +#define HV_PERF_DRAM_PERF_CTRL_REG_2 0x06 +#define HV_PERF_DRAM_PERF_CNT_REG_2 0x07 +#define HV_PERF_DRAM_PERF_CTRL_REG_3 0x08 +#define HV_PERF_DRAM_PERF_CNT_REG_3 0x09 + +/* get_perfreg() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_GET_PERFREG + * ARG0: performance reg number + * RET0: status + * RET1: performance reg value + * ERRORS: EINVAL Invalid performance register number + * ENOACCESS No access allowed to performance counters + * + * Read the value of the given DRAM/JBUS performance counter/control register. + */ +#define HV_FAST_GET_PERFREG 0x100 + +/* set_perfreg() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_SET_PERFREG + * ARG0: performance reg number + * ARG1: performance reg value + * RET0: status + * ERRORS: EINVAL Invalid performance register number + * ENOACCESS No access allowed to performance counters + * + * Write the given performance reg value to the given DRAM/JBUS + * performance counter/control register. + */ +#define HV_FAST_SET_PERFREG 0x101 + +/* MMU statistics services. + * + * The hypervisor maintains MMU statistics and privileged code provides + * a buffer where these statistics can be collected. It is continually + * updated once configured. The layout is as follows: + */ +#ifndef __ASSEMBLY__ +struct hv_mmu_statistics { + unsigned long immu_tsb_hits_ctx0_8k_tte; + unsigned long immu_tsb_ticks_ctx0_8k_tte; + unsigned long immu_tsb_hits_ctx0_64k_tte; + unsigned long immu_tsb_ticks_ctx0_64k_tte; + unsigned long __reserved1[2]; + unsigned long immu_tsb_hits_ctx0_4mb_tte; + unsigned long immu_tsb_ticks_ctx0_4mb_tte; + unsigned long __reserved2[2]; + unsigned long immu_tsb_hits_ctx0_256mb_tte; + unsigned long immu_tsb_ticks_ctx0_256mb_tte; + unsigned long __reserved3[4]; + unsigned long immu_tsb_hits_ctxnon0_8k_tte; + unsigned long immu_tsb_ticks_ctxnon0_8k_tte; + unsigned long immu_tsb_hits_ctxnon0_64k_tte; + unsigned long immu_tsb_ticks_ctxnon0_64k_tte; + unsigned long __reserved4[2]; + unsigned long immu_tsb_hits_ctxnon0_4mb_tte; + unsigned long immu_tsb_ticks_ctxnon0_4mb_tte; + unsigned long __reserved5[2]; + unsigned long immu_tsb_hits_ctxnon0_256mb_tte; + unsigned long immu_tsb_ticks_ctxnon0_256mb_tte; + unsigned long __reserved6[4]; + unsigned long dmmu_tsb_hits_ctx0_8k_tte; + unsigned long dmmu_tsb_ticks_ctx0_8k_tte; + unsigned long dmmu_tsb_hits_ctx0_64k_tte; + unsigned long dmmu_tsb_ticks_ctx0_64k_tte; + unsigned long __reserved7[2]; + unsigned long dmmu_tsb_hits_ctx0_4mb_tte; + unsigned long dmmu_tsb_ticks_ctx0_4mb_tte; + unsigned long __reserved8[2]; + unsigned long dmmu_tsb_hits_ctx0_256mb_tte; + unsigned long dmmu_tsb_ticks_ctx0_256mb_tte; + unsigned long __reserved9[4]; + unsigned long dmmu_tsb_hits_ctxnon0_8k_tte; + unsigned long dmmu_tsb_ticks_ctxnon0_8k_tte; + unsigned long dmmu_tsb_hits_ctxnon0_64k_tte; + unsigned long dmmu_tsb_ticks_ctxnon0_64k_tte; + unsigned long __reserved10[2]; + unsigned long dmmu_tsb_hits_ctxnon0_4mb_tte; + unsigned long dmmu_tsb_ticks_ctxnon0_4mb_tte; + unsigned long __reserved11[2]; + unsigned long dmmu_tsb_hits_ctxnon0_256mb_tte; + unsigned long dmmu_tsb_ticks_ctxnon0_256mb_tte; + unsigned long __reserved12[4]; +}; +#endif + +/* mmustat_conf() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMUSTAT_CONF + * ARG0: real address + * RET0: status + * RET1: real address + * ERRORS: ENORADDR Invalid real address + * EBADALIGN Real address not aligned on 64-byte boundary + * EBADTRAP API not supported on this processor + * + * Enable MMU statistic gathering using the buffer at the given real + * address on the current virtual CPU. The new buffer real address + * is given in ARG1, and the previously specified buffer real address + * is returned in RET1, or is returned as zero for the first invocation. + * + * If the passed in real address argument is zero, this will disable + * MMU statistic collection on the current virtual CPU. If an error is + * returned then no statistics are collected. + * + * The buffer contents should be initialized to all zeros before being + * given to the hypervisor or else the statistics will be meaningless. + */ +#define HV_FAST_MMUSTAT_CONF 0x102 + +/* mmustat_info() + * TRAP: HV_FAST_TRAP + * FUNCTION: HV_FAST_MMUSTAT_INFO + * RET0: status + * RET1: real address + * ERRORS: EBADTRAP API not supported on this processor + * + * Return the current state and real address of the currently configured + * MMU statistics buffer on the current virtual CPU. + */ +#define HV_FAST_MMUSTAT_INFO 0x103 + +/* Function numbers for HV_CORE_TRAP. */ +#define HV_CORE_VER 0x00 +#define HV_CORE_PUTCHAR 0x01 +#define HV_CORE_EXIT 0x02 + +#endif /* !(_SPARC64_HYPERVISOR_H) */ diff --git a/include/asm-sparc64/intr_queue.h b/include/asm-sparc64/intr_queue.h new file mode 100644 index 0000000..206077d --- /dev/null +++ b/include/asm-sparc64/intr_queue.h @@ -0,0 +1,15 @@ +#ifndef _SPARC64_INTR_QUEUE_H +#define _SPARC64_INTR_QUEUE_H + +/* Sun4v interrupt queue registers, accessed via ASI_QUEUE. */ + +#define INTRQ_CPU_MONDO_HEAD 0x3c0 /* CPU mondo head */ +#define INTRQ_CPU_MONDO_TAIL 0x3c8 /* CPU mondo tail */ +#define INTRQ_DEVICE_MONDO_HEAD 0x3d0 /* Device mondo head */ +#define INTRQ_DEVICE_MONDO_TAIL 0x3d8 /* Device mondo tail */ +#define INTRQ_RESUM_MONDO_HEAD 0x3e0 /* Resumable error mondo head */ +#define INTRQ_RESUM_MONDO_TAIL 0x3e8 /* Resumable error mondo tail */ +#define INTRQ_NONRESUM_MONDO_HEAD 0x3f0 /* Non-resumable error mondo head */ +#define INTRQ_NONRESUM_MONDO_TAIL 0x3f8 /* Non-resumable error mondo head */ + +#endif /* !(_SPARC64_INTR_QUEUE_H) */ diff --git a/include/asm-sparc64/mmu.h b/include/asm-sparc64/mmu.h index 8627eed..55e6227 100644 --- a/include/asm-sparc64/mmu.h +++ b/include/asm-sparc64/mmu.h @@ -90,8 +90,24 @@ #ifndef __ASSEMBLY__ +#define TSB_ENTRY_ALIGNMENT 16 + +struct tsb { + unsigned long tag; + unsigned long pte; +} __attribute__((aligned(TSB_ENTRY_ALIGNMENT))); + +extern void __tsb_insert(unsigned long ent, unsigned long tag, unsigned long pte); +extern void tsb_flush(unsigned long ent, unsigned long tag); + typedef struct { unsigned long sparc64_ctx_val; + struct tsb *tsb; + unsigned long tsb_rss_limit; + unsigned long tsb_nentries; + unsigned long tsb_reg_val; + unsigned long tsb_map_vaddr; + unsigned long tsb_map_pte; } mm_context_t; #endif /* !__ASSEMBLY__ */ diff --git a/include/asm-sparc64/mmu_context.h b/include/asm-sparc64/mmu_context.h index 57ee7b3..1d23267 100644 --- a/include/asm-sparc64/mmu_context.h +++ b/include/asm-sparc64/mmu_context.h @@ -19,59 +19,25 @@ extern unsigned long tlb_context_cache; extern unsigned long mmu_context_bmap[]; extern void get_new_mmu_context(struct mm_struct *mm); +extern int init_new_context(struct task_struct *tsk, struct mm_struct *mm); +extern void destroy_context(struct mm_struct *mm); -/* Initialize a new mmu context. This is invoked when a new - * address space instance (unique or shared) is instantiated. - * This just needs to set mm->context to an invalid context. - */ -#define init_new_context(__tsk, __mm) \ - (((__mm)->context.sparc64_ctx_val = 0UL), 0) - -/* Destroy a dead context. This occurs when mmput drops the - * mm_users count to zero, the mmaps have been released, and - * all the page tables have been flushed. Our job is to destroy - * any remaining processor-specific state, and in the sparc64 - * case this just means freeing up the mmu context ID held by - * this task if valid. - */ -#define destroy_context(__mm) \ -do { spin_lock(&ctx_alloc_lock); \ - if (CTX_VALID((__mm)->context)) { \ - unsigned long nr = CTX_NRBITS((__mm)->context); \ - mmu_context_bmap[nr>>6] &= ~(1UL << (nr & 63)); \ - } \ - spin_unlock(&ctx_alloc_lock); \ -} while(0) - -/* Reload the two core values used by TLB miss handler - * processing on sparc64. They are: - * 1) The physical address of mm->pgd, when full page - * table walks are necessary, this is where the - * search begins. - * 2) A "PGD cache". For 32-bit tasks only pgd[0] is - * ever used since that maps the entire low 4GB - * completely. To speed up TLB miss processing we - * make this value available to the handlers. This - * decreases the amount of memory traffic incurred. - */ -#define reload_tlbmiss_state(__tsk, __mm) \ -do { \ - register unsigned long paddr asm("o5"); \ - register unsigned long pgd_cache asm("o4"); \ - paddr = __pa((__mm)->pgd); \ - pgd_cache = 0UL; \ - if (task_thread_info(__tsk)->flags & _TIF_32BIT) \ - pgd_cache = get_pgd_cache((__mm)->pgd); \ - __asm__ __volatile__("wrpr %%g0, 0x494, %%pstate\n\t" \ - "mov %3, %%g4\n\t" \ - "mov %0, %%g7\n\t" \ - "stxa %1, [%%g4] %2\n\t" \ - "membar #Sync\n\t" \ - "wrpr %%g0, 0x096, %%pstate" \ - : /* no outputs */ \ - : "r" (paddr), "r" (pgd_cache),\ - "i" (ASI_DMMU), "i" (TSB_REG)); \ -} while(0) +extern void __tsb_context_switch(unsigned long pgd_pa, unsigned long tsb_reg, + unsigned long tsb_vaddr, unsigned long tsb_pte); + +static inline void tsb_context_switch(struct mm_struct *mm) +{ + __tsb_context_switch(__pa(mm->pgd), mm->context.tsb_reg_val, + mm->context.tsb_map_vaddr, + mm->context.tsb_map_pte); +} + +extern void tsb_grow(struct mm_struct *mm, unsigned long mm_rss, gfp_t gfp_flags); +#ifdef CONFIG_SMP +extern void smp_tsb_sync(struct mm_struct *mm); +#else +#define smp_tsb_sync(__mm) do { } while (0) +#endif /* Set MMU context in the actual hardware. */ #define load_secondary_context(__mm) \ @@ -101,7 +67,7 @@ static inline void switch_mm(struct mm_s if (!ctx_valid || (old_mm != mm)) { load_secondary_context(mm); - reload_tlbmiss_state(tsk, mm); + tsb_context_switch(mm); } /* Even if (mm == old_mm) we _must_ check @@ -139,7 +105,7 @@ static inline void activate_mm(struct mm load_secondary_context(mm); __flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT); - reload_tlbmiss_state(current, mm); + tsb_context_switch(mm); } #endif /* !(__ASSEMBLY__) */ diff --git a/include/asm-sparc64/pgalloc.h b/include/asm-sparc64/pgalloc.h index a96067c..12e4a27 100644 --- a/include/asm-sparc64/pgalloc.h +++ b/include/asm-sparc64/pgalloc.h @@ -6,6 +6,7 @@ #include #include #include +#include #include #include @@ -13,172 +14,59 @@ #include /* Page table allocation/freeing. */ -#ifdef CONFIG_SMP -/* Sliiiicck */ -#define pgt_quicklists local_cpu_data() -#else -extern struct pgtable_cache_struct { - unsigned long *pgd_cache; - unsigned long *pte_cache[2]; - unsigned int pgcache_size; -} pgt_quicklists; -#endif -#define pgd_quicklist (pgt_quicklists.pgd_cache) -#define pmd_quicklist ((unsigned long *)0) -#define pte_quicklist (pgt_quicklists.pte_cache) -#define pgtable_cache_size (pgt_quicklists.pgcache_size) - -static __inline__ void free_pgd_fast(pgd_t *pgd) -{ - preempt_disable(); - *(unsigned long *)pgd = (unsigned long) pgd_quicklist; - pgd_quicklist = (unsigned long *) pgd; - pgtable_cache_size++; - preempt_enable(); -} - -static __inline__ pgd_t *get_pgd_fast(void) -{ - unsigned long *ret; - - preempt_disable(); - if((ret = pgd_quicklist) != NULL) { - pgd_quicklist = (unsigned long *)(*ret); - ret[0] = 0; - pgtable_cache_size--; - preempt_enable(); - } else { - preempt_enable(); - ret = (unsigned long *) __get_free_page(GFP_KERNEL|__GFP_REPEAT); - if(ret) - memset(ret, 0, PAGE_SIZE); - } - return (pgd_t *)ret; -} - -static __inline__ void free_pgd_slow(pgd_t *pgd) -{ - free_page((unsigned long)pgd); -} - -#ifdef DCACHE_ALIASING_POSSIBLE -#define VPTE_COLOR(address) (((address) >> (PAGE_SHIFT + 10)) & 1UL) -#define DCACHE_COLOR(address) (((address) >> PAGE_SHIFT) & 1UL) -#else -#define VPTE_COLOR(address) 0 -#define DCACHE_COLOR(address) 0 -#endif +extern kmem_cache_t *pgtable_cache; -#define pud_populate(MM, PUD, PMD) pud_set(PUD, PMD) - -static __inline__ pmd_t *pmd_alloc_one_fast(struct mm_struct *mm, unsigned long address) -{ - unsigned long *ret; - int color = 0; - - preempt_disable(); - if (pte_quicklist[color] == NULL) - color = 1; - - if((ret = (unsigned long *)pte_quicklist[color]) != NULL) { - pte_quicklist[color] = (unsigned long *)(*ret); - ret[0] = 0; - pgtable_cache_size--; - } - preempt_enable(); - - return (pmd_t *)ret; -} - -static __inline__ pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address) +static inline pgd_t *pgd_alloc(struct mm_struct *mm) { - pmd_t *pmd; - - pmd = pmd_alloc_one_fast(mm, address); - if (!pmd) { - pmd = (pmd_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT); - if (pmd) - memset(pmd, 0, PAGE_SIZE); - } - return pmd; + return kmem_cache_alloc(pgtable_cache, GFP_KERNEL); } -static __inline__ void free_pmd_fast(pmd_t *pmd) +static inline void pgd_free(pgd_t *pgd) { - unsigned long color = DCACHE_COLOR((unsigned long)pmd); - - preempt_disable(); - *(unsigned long *)pmd = (unsigned long) pte_quicklist[color]; - pte_quicklist[color] = (unsigned long *) pmd; - pgtable_cache_size++; - preempt_enable(); + kmem_cache_free(pgtable_cache, pgd); } -static __inline__ void free_pmd_slow(pmd_t *pmd) -{ - free_page((unsigned long)pmd); -} - -#define pmd_populate_kernel(MM, PMD, PTE) pmd_set(PMD, PTE) -#define pmd_populate(MM,PMD,PTE_PAGE) \ - pmd_populate_kernel(MM,PMD,page_address(PTE_PAGE)) - -extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address); +#define pud_populate(MM, PUD, PMD) pud_set(PUD, PMD) -static inline struct page * -pte_alloc_one(struct mm_struct *mm, unsigned long addr) +static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr) { - pte_t *pte = pte_alloc_one_kernel(mm, addr); - - if (pte) - return virt_to_page(pte); - - return NULL; + return kmem_cache_alloc(pgtable_cache, + GFP_KERNEL|__GFP_REPEAT); } -static __inline__ pte_t *pte_alloc_one_fast(struct mm_struct *mm, unsigned long address) +static inline void pmd_free(pmd_t *pmd) { - unsigned long color = VPTE_COLOR(address); - unsigned long *ret; - - preempt_disable(); - if((ret = (unsigned long *)pte_quicklist[color]) != NULL) { - pte_quicklist[color] = (unsigned long *)(*ret); - ret[0] = 0; - pgtable_cache_size--; - } - preempt_enable(); - return (pte_t *)ret; + kmem_cache_free(pgtable_cache, pmd); } -static __inline__ void free_pte_fast(pte_t *pte) +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm, + unsigned long address) { - unsigned long color = DCACHE_COLOR((unsigned long)pte); - - preempt_disable(); - *(unsigned long *)pte = (unsigned long) pte_quicklist[color]; - pte_quicklist[color] = (unsigned long *) pte; - pgtable_cache_size++; - preempt_enable(); + return kmem_cache_alloc(pgtable_cache, + GFP_KERNEL|__GFP_REPEAT); } -static __inline__ void free_pte_slow(pte_t *pte) +static inline struct page *pte_alloc_one(struct mm_struct *mm, + unsigned long address) { - free_page((unsigned long)pte); + return virt_to_page(pte_alloc_one_kernel(mm, address)); } - + static inline void pte_free_kernel(pte_t *pte) { - free_pte_fast(pte); + kmem_cache_free(pgtable_cache, pte); } static inline void pte_free(struct page *ptepage) { - free_pte_fast(page_address(ptepage)); + pte_free_kernel(page_address(ptepage)); } -#define pmd_free(pmd) free_pmd_fast(pmd) -#define pgd_free(pgd) free_pgd_fast(pgd) -#define pgd_alloc(mm) get_pgd_fast() + +#define pmd_populate_kernel(MM, PMD, PTE) pmd_set(PMD, PTE) +#define pmd_populate(MM,PMD,PTE_PAGE) \ + pmd_populate_kernel(MM,PMD,page_address(PTE_PAGE)) + +#define check_pgt_cache() do { } while (0) #endif /* _SPARC64_PGALLOC_H */ diff --git a/include/asm-sparc64/pgtable.h b/include/asm-sparc64/pgtable.h index f0a9b44..2b2ecd6 100644 --- a/include/asm-sparc64/pgtable.h +++ b/include/asm-sparc64/pgtable.h @@ -25,7 +25,8 @@ #include /* The kernel image occupies 0x4000000 to 0x1000000 (4MB --> 32MB). - * The page copy blockops can use 0x2000000 to 0x10000000. + * The page copy blockops can use 0x2000000 to 0x4000000. + * The TSB is mapped in the 0x4000000 to 0x6000000 range. * The PROM resides in an area spanning 0xf0000000 to 0x100000000. * The vmalloc area spans 0x100000000 to 0x200000000. * Since modules need to be in the lowest 32-bits of the address space, @@ -34,6 +35,7 @@ * 0x400000000. */ #define TLBTEMP_BASE _AC(0x0000000002000000,UL) +#define TSBMAP_BASE _AC(0x0000000004000000,UL) #define MODULES_VADDR _AC(0x0000000010000000,UL) #define MODULES_LEN _AC(0x00000000e0000000,UL) #define MODULES_END _AC(0x00000000f0000000,UL) @@ -114,6 +116,10 @@ #define _PAGE_W _AC(0x0000000000000002,UL) /* Writable */ #define _PAGE_G _AC(0x0000000000000001,UL) /* Global */ +#define _PAGE_ALL_SZ_BITS \ + (_PAGE_SZ4MB | _PAGE_SZ512K | _PAGE_SZ64K | \ + _PAGE_SZ8K | _PAGE_SZ32MB | _PAGE_SZ256MB) + /* Here are the SpitFire software bits we use in the TTE's. * * WARNING: If you are going to try and start using some @@ -296,11 +302,6 @@ static inline pte_t pte_modify(pte_t ori /* to find an entry in a kernel page-table-directory */ #define pgd_offset_k(address) pgd_offset(&init_mm, address) -/* extract the pgd cache used for optimizing the tlb miss - * slow path when executing 32-bit compat processes - */ -#define get_pgd_cache(pgd) ((unsigned long) pgd_val(*pgd) << 11) - /* Find an entry in the second-level page table.. */ #define pmd_offset(pudp, address) \ ((pmd_t *) pud_page(*(pudp)) + \ @@ -435,12 +436,7 @@ extern unsigned long get_fb_unmapped_are unsigned long); #define HAVE_ARCH_FB_UNMAPPED_AREA -/* - * No page table caches to initialise - */ -#define pgtable_cache_init() do { } while (0) - -extern void check_pgt_cache(void); +extern void pgtable_cache_init(void); #endif /* !(__ASSEMBLY__) */ diff --git a/include/asm-sparc64/processor.h b/include/asm-sparc64/processor.h index cd8d9b4..b3889f3 100644 --- a/include/asm-sparc64/processor.h +++ b/include/asm-sparc64/processor.h @@ -28,6 +28,8 @@ * User lives in his very own context, and cannot reference us. Note * that TASK_SIZE is a misnomer, it really gives maximum user virtual * address that the kernel will allocate out. + * + * XXX No longer using virtual page tables, kill this upper limit... */ #define VA_BITS 44 #ifndef __ASSEMBLY__ @@ -37,18 +39,6 @@ #endif #define TASK_SIZE ((unsigned long)-VPTE_SIZE) -/* - * The vpte base must be able to hold the entire vpte, half - * of which lives above, and half below, the base. And it - * is placed as close to the highest address range as possible. - */ -#define VPTE_BASE_SPITFIRE (-(VPTE_SIZE/2)) -#if 1 -#define VPTE_BASE_CHEETAH VPTE_BASE_SPITFIRE -#else -#define VPTE_BASE_CHEETAH 0xffe0000000000000 -#endif - #ifndef __ASSEMBLY__ typedef struct { diff --git a/include/asm-sparc64/pstate.h b/include/asm-sparc64/pstate.h index 29fb74a..1181d73 100644 --- a/include/asm-sparc64/pstate.h +++ b/include/asm-sparc64/pstate.h @@ -29,10 +29,11 @@ /* The V9 TSTATE Register (with SpitFire and Linux extensions). * * --------------------------------------------------------------- - * | Resv | CCR | ASI | %pil | PSTATE | Resv | CWP | + * | Resv | GL | CCR | ASI | %pil | PSTATE | Resv | CWP | * --------------------------------------------------------------- - * 63 40 39 32 31 24 23 20 19 8 7 5 4 0 + * 63 43 42 40 39 32 31 24 23 20 19 8 7 5 4 0 */ +#define TSTATE_GL _AC(0x0000070000000000,UL) /* Global reg level */ #define TSTATE_CCR _AC(0x000000ff00000000,UL) /* Condition Codes. */ #define TSTATE_XCC _AC(0x000000f000000000,UL) /* Condition Codes. */ #define TSTATE_XNEG _AC(0x0000008000000000,UL) /* %xcc Negative. */ diff --git a/include/asm-sparc64/scratchpad.h b/include/asm-sparc64/scratchpad.h new file mode 100644 index 0000000..5e8b01f --- /dev/null +++ b/include/asm-sparc64/scratchpad.h @@ -0,0 +1,14 @@ +#ifndef _SPARC64_SCRATCHPAD_H +#define _SPARC64_SCRATCHPAD_H + +/* Sun4v scratchpad registers, accessed via ASI_SCRATCHPAD. */ + +#define SCRATCHPAD_MMU_MISS 0x00 /* Shared with OBP - set by OBP */ +#define SCRATCHPAD_CPUID 0x08 /* Shared with OBP - set by hypervisor */ +#define SCRATCHPAD_UTSBREG1 0x10 +#define SCRATCHPAD_UTSBREG2 0x18 + /* 0x20 and 0x28, hypervisor only... */ +#define SCRATCHPAD_UNUSED1 0x30 +#define SCRATCHPAD_UNUSED2 0x38 /* Reserved for OBP */ + +#endif /* !(_SPARC64_SCRATCHPAD_H) */ diff --git a/include/asm-sparc64/smp.h b/include/asm-sparc64/smp.h index 110a2de..64c610c 100644 --- a/include/asm-sparc64/smp.h +++ b/include/asm-sparc64/smp.h @@ -37,33 +37,7 @@ extern cpumask_t phys_cpu_present_map; * General functions that each host system must provide. */ -static __inline__ int hard_smp_processor_id(void) -{ - if (tlb_type == cheetah || tlb_type == cheetah_plus) { - unsigned long cfg, ver; - __asm__ __volatile__("rdpr %%ver, %0" : "=r" (ver)); - if ((ver >> 32) == 0x003e0016) { - __asm__ __volatile__("ldxa [%%g0] %1, %0" - : "=r" (cfg) - : "i" (ASI_JBUS_CONFIG)); - return ((cfg >> 17) & 0x1f); - } else { - __asm__ __volatile__("ldxa [%%g0] %1, %0" - : "=r" (cfg) - : "i" (ASI_SAFARI_CONFIG)); - return ((cfg >> 17) & 0x3ff); - } - } else if (this_is_starfire != 0) { - return starfire_hard_smp_processor_id(); - } else { - unsigned long upaconfig; - __asm__ __volatile__("ldxa [%%g0] %1, %0" - : "=r" (upaconfig) - : "i" (ASI_UPA_CONFIG)); - return ((upaconfig >> 17) & 0x1f); - } -} - +extern int hard_smp_processor_id(void); #define raw_smp_processor_id() (current_thread_info()->cpu) #endif /* !(__ASSEMBLY__) */ diff --git a/include/asm-sparc64/spitfire.h b/include/asm-sparc64/spitfire.h index 962638c..23ad8a7 100644 --- a/include/asm-sparc64/spitfire.h +++ b/include/asm-sparc64/spitfire.h @@ -44,6 +44,7 @@ enum ultra_tlb_layout { spitfire = 0, cheetah = 1, cheetah_plus = 2, + hypervisor = 3, }; extern enum ultra_tlb_layout tlb_type; diff --git a/include/asm-sparc64/system.h b/include/asm-sparc64/system.h index af254e5..a18ec87 100644 --- a/include/asm-sparc64/system.h +++ b/include/asm-sparc64/system.h @@ -209,9 +209,10 @@ do { if (test_thread_flag(TIF_PERFCTR)) /* so that ASI is only written if it changes, think again. */ \ __asm__ __volatile__("wr %%g0, %0, %%asi" \ : : "r" (__thread_flag_byte_ptr(task_thread_info(next))[TI_FLAG_BYTE_CURRENT_DS]));\ + trap_block[current_thread_info()->cpu].thread = \ + task_thread_info(next); \ __asm__ __volatile__( \ "mov %%g4, %%g7\n\t" \ - "wrpr %%g0, 0x95, %%pstate\n\t" \ "stx %%i6, [%%sp + 2047 + 0x70]\n\t" \ "stx %%i7, [%%sp + 2047 + 0x78]\n\t" \ "rdpr %%wstate, %%o5\n\t" \ @@ -225,14 +226,10 @@ do { if (test_thread_flag(TIF_PERFCTR)) "ldx [%%g6 + %3], %%o6\n\t" \ "ldub [%%g6 + %2], %%o5\n\t" \ "ldub [%%g6 + %4], %%o7\n\t" \ - "mov %%g6, %%l2\n\t" \ "wrpr %%o5, 0x0, %%wstate\n\t" \ "ldx [%%sp + 2047 + 0x70], %%i6\n\t" \ "ldx [%%sp + 2047 + 0x78], %%i7\n\t" \ - "wrpr %%g0, 0x94, %%pstate\n\t" \ - "mov %%l2, %%g6\n\t" \ "ldx [%%g6 + %6], %%g4\n\t" \ - "wrpr %%g0, 0x96, %%pstate\n\t" \ "brz,pt %%o7, 1f\n\t" \ " mov %%g7, %0\n\t" \ "b,a ret_from_syscall\n\t" \ diff --git a/include/asm-sparc64/thread_info.h b/include/asm-sparc64/thread_info.h index ac9d068..2ebf7f2 100644 --- a/include/asm-sparc64/thread_info.h +++ b/include/asm-sparc64/thread_info.h @@ -64,8 +64,6 @@ struct thread_info { __u64 kernel_cntd0, kernel_cntd1; __u64 pcr_reg; - __u64 cee_stuff; - struct restart_block restart_block; struct pt_regs *kern_una_regs; @@ -104,10 +102,9 @@ struct thread_info { #define TI_KERN_CNTD0 0x00000480 #define TI_KERN_CNTD1 0x00000488 #define TI_PCR 0x00000490 -#define TI_CEE_STUFF 0x00000498 -#define TI_RESTART_BLOCK 0x000004a0 -#define TI_KUNA_REGS 0x000004c8 -#define TI_KUNA_INSN 0x000004d0 +#define TI_RESTART_BLOCK 0x00000498 +#define TI_KUNA_REGS 0x000004c0 +#define TI_KUNA_INSN 0x000004c8 #define TI_FPREGS 0x00000500 /* We embed this in the uppermost byte of thread_info->flags */ diff --git a/include/asm-sparc64/tlbflush.h b/include/asm-sparc64/tlbflush.h index 3ef9909..9ad5d9c 100644 --- a/include/asm-sparc64/tlbflush.h +++ b/include/asm-sparc64/tlbflush.h @@ -5,6 +5,11 @@ #include #include +/* TSB flush operations. */ +struct mmu_gather; +extern void flush_tsb_kernel_range(unsigned long start, unsigned long end); +extern void flush_tsb_user(struct mmu_gather *mp); + /* TLB flush operations. */ extern void flush_tlb_pending(void); @@ -14,28 +19,36 @@ extern void flush_tlb_pending(void); #define flush_tlb_page(vma,addr) flush_tlb_pending() #define flush_tlb_mm(mm) flush_tlb_pending() +/* Local cpu only. */ extern void __flush_tlb_all(void); + extern void __flush_tlb_page(unsigned long context, unsigned long page, unsigned long r); extern void __flush_tlb_kernel_range(unsigned long start, unsigned long end); #ifndef CONFIG_SMP -#define flush_tlb_all() __flush_tlb_all() #define flush_tlb_kernel_range(start,end) \ - __flush_tlb_kernel_range(start,end) +do { flush_tsb_kernel_range(start,end); \ + __flush_tlb_kernel_range(start,end); \ +} while (0) #else /* CONFIG_SMP */ -extern void smp_flush_tlb_all(void); extern void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end); -#define flush_tlb_all() smp_flush_tlb_all() #define flush_tlb_kernel_range(start, end) \ - smp_flush_tlb_kernel_range(start, end) +do { flush_tsb_kernel_range(start,end); \ + smp_flush_tlb_kernel_range(start, end); \ +} while (0) #endif /* ! CONFIG_SMP */ -extern void flush_tlb_pgtables(struct mm_struct *, unsigned long, unsigned long); +static inline void flush_tlb_pgtables(struct mm_struct *mm, unsigned long start, unsigned long end) +{ + /* We don't use virtual page tables for TLB miss processing + * any more. Nowadays we use the TSB. + */ +} #endif /* _SPARC64_TLBFLUSH_H */ diff --git a/include/asm-sparc64/tsb.h b/include/asm-sparc64/tsb.h new file mode 100644 index 0000000..7f3abc3 --- /dev/null +++ b/include/asm-sparc64/tsb.h @@ -0,0 +1,258 @@ +#ifndef _SPARC64_TSB_H +#define _SPARC64_TSB_H + +/* The sparc64 TSB is similar to the powerpc hashtables. It's a + * power-of-2 sized table of TAG/PTE pairs. The cpu precomputes + * pointers into this table for 8K and 64K page sizes, and also a + * comparison TAG based upon the virtual address and context which + * faults. + * + * TLB miss trap handler software does the actual lookup via something + * of the form: + * + * ldxa [%g0] ASI_{D,I}MMU_TSB_8KB_PTR, %g1 + * ldxa [%g0] ASI_{D,I}MMU, %g6 + * ldda [%g1] ASI_NUCLEUS_QUAD_LDD, %g4 + * cmp %g4, %g6 + * bne,pn %xcc, tsb_miss_{d,i}tlb + * mov FAULT_CODE_{D,I}TLB, %g3 + * stxa %g5, [%g0] ASI_{D,I}TLB_DATA_IN + * retry + * + * + * Each 16-byte slot of the TSB is the 8-byte tag and then the 8-byte + * PTE. The TAG is of the same layout as the TLB TAG TARGET mmu + * register which is: + * + * ------------------------------------------------- + * | - | CONTEXT | - | VADDR bits 63:22 | + * ------------------------------------------------- + * 63 61 60 48 47 42 41 0 + * + * Like the powerpc hashtables we need to use locking in order to + * synchronize while we update the entries. PTE updates need locking + * as well. + * + * We need to carefully choose a lock bits for the TSB entry. We + * choose to use bit 47 in the tag. Also, since we never map anything + * at page zero in context zero, we use zero as an invalid tag entry. + * When the lock bit is set, this forces a tag comparison failure. + */ + +#define TSB_TAG_LOCK_BIT 47 +#define TSB_TAG_LOCK_HIGH (1 << (TSB_TAG_LOCK_BIT - 32)) + +#define TSB_MEMBAR membar #StoreStore + +/* Some cpus support physical address quad loads. We want to use + * those if possible so we don't need to hard-lock the TSB mapping + * into the TLB. We encode some instruction patching in order to + * support this. + * + * The kernel TSB is locked into the TLB by virtue of being in the + * kernel image, so we don't play these games for swapper_tsb access. + */ +#ifndef __ASSEMBLY__ +struct tsb_ldquad_phys_patch_entry { + unsigned int addr; + unsigned int sun4u_insn; + unsigned int sun4v_insn; +}; +extern struct tsb_ldquad_phys_patch_entry __tsb_ldquad_phys_patch, + __tsb_ldquad_phys_patch_end; + +struct tsb_phys_patch_entry { + unsigned int addr; + unsigned int insn; +}; +extern struct tsb_phys_patch_entry __tsb_phys_patch, __tsb_phys_patch_end; +#endif +#define TSB_LOAD_QUAD(TSB, REG) \ +661: ldda [TSB] ASI_NUCLEUS_QUAD_LDD, REG; \ + .section .tsb_ldquad_phys_patch, "ax"; \ + .word 661b; \ + ldda [TSB] ASI_QUAD_LDD_PHYS, REG; \ + ldda [TSB] ASI_QUAD_LDD_PHYS_4V, REG; \ + .previous + +#define TSB_LOAD_TAG_HIGH(TSB, REG) \ +661: lduwa [TSB] ASI_N, REG; \ + .section .tsb_phys_patch, "ax"; \ + .word 661b; \ + lduwa [TSB] ASI_PHYS_USE_EC, REG; \ + .previous + +#define TSB_LOAD_TAG(TSB, REG) \ +661: ldxa [TSB] ASI_N, REG; \ + .section .tsb_phys_patch, "ax"; \ + .word 661b; \ + ldxa [TSB] ASI_PHYS_USE_EC, REG; \ + .previous + +#define TSB_CAS_TAG_HIGH(TSB, REG1, REG2) \ +661: casa [TSB] ASI_N, REG1, REG2; \ + .section .tsb_phys_patch, "ax"; \ + .word 661b; \ + casa [TSB] ASI_PHYS_USE_EC, REG1, REG2; \ + .previous + +#define TSB_CAS_TAG(TSB, REG1, REG2) \ +661: casxa [TSB] ASI_N, REG1, REG2; \ + .section .tsb_phys_patch, "ax"; \ + .word 661b; \ + casxa [TSB] ASI_PHYS_USE_EC, REG1, REG2; \ + .previous + +#define TSB_STORE(ADDR, VAL) \ +661: stxa VAL, [ADDR] ASI_N; \ + .section .tsb_phys_patch, "ax"; \ + .word 661b; \ + stxa VAL, [ADDR] ASI_PHYS_USE_EC; \ + .previous + +#define TSB_LOCK_TAG(TSB, REG1, REG2) \ +99: TSB_LOAD_TAG_HIGH(TSB, REG1); \ + sethi %hi(TSB_TAG_LOCK_HIGH), REG2;\ + andcc REG1, REG2, %g0; \ + bne,pn %icc, 99b; \ + nop; \ + TSB_CAS_TAG_HIGH(TSB, REG1, REG2); \ + cmp REG1, REG2; \ + bne,pn %icc, 99b; \ + nop; \ + TSB_MEMBAR + +#define TSB_WRITE(TSB, TTE, TAG) \ + add TSB, 0x8, TSB; \ + TSB_STORE(TSB, TTE); \ + sub TSB, 0x8, TSB; \ + TSB_MEMBAR; \ + TSB_STORE(TSB, TAG); + +#define KTSB_LOAD_QUAD(TSB, REG) \ + ldda [TSB] ASI_NUCLEUS_QUAD_LDD, REG; + +#define KTSB_STORE(ADDR, VAL) \ + stxa VAL, [ADDR] ASI_N; + +#define KTSB_LOCK_TAG(TSB, REG1, REG2) \ +99: lduwa [TSB] ASI_N, REG1; \ + sethi %hi(TSB_TAG_LOCK_HIGH), REG2;\ + andcc REG1, REG2, %g0; \ + bne,pn %icc, 99b; \ + nop; \ + casa [TSB] ASI_N, REG1, REG2;\ + cmp REG1, REG2; \ + bne,pn %icc, 99b; \ + nop; \ + TSB_MEMBAR + +#define KTSB_WRITE(TSB, TTE, TAG) \ + add TSB, 0x8, TSB; \ + stxa TTE, [TSB] ASI_N; \ + sub TSB, 0x8, TSB; \ + TSB_MEMBAR; \ + stxa TAG, [TSB] ASI_N; + + /* Do a kernel page table walk. Leaves physical PTE pointer in + * REG1. Jumps to FAIL_LABEL on early page table walk termination. + * VADDR will not be clobbered, but REG2 will. + */ +#define KERN_PGTABLE_WALK(VADDR, REG1, REG2, FAIL_LABEL) \ + sethi %hi(swapper_pg_dir), REG1; \ + or REG1, %lo(swapper_pg_dir), REG1; \ + sllx VADDR, 64 - (PGDIR_SHIFT + PGDIR_BITS), REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + andn REG2, 0x3, REG2; \ + lduw [REG1 + REG2], REG1; \ + brz,pn REG1, FAIL_LABEL; \ + sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + sllx REG1, 11, REG1; \ + andn REG2, 0x3, REG2; \ + lduwa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \ + brz,pn REG1, FAIL_LABEL; \ + sllx VADDR, 64 - PMD_SHIFT, REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + sllx REG1, 11, REG1; \ + andn REG2, 0x7, REG2; \ + add REG1, REG2, REG1; + + /* Do a user page table walk in MMU globals. Leaves physical PTE + * pointer in REG1. Jumps to FAIL_LABEL on early page table walk + * termination. Physical base of page tables is in PHYS_PGD which + * will not be modified. + * + * VADDR will not be clobbered, but REG1 and REG2 will. + */ +#define USER_PGTABLE_WALK_TL1(VADDR, PHYS_PGD, REG1, REG2, FAIL_LABEL) \ + sllx VADDR, 64 - (PGDIR_SHIFT + PGDIR_BITS), REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + andn REG2, 0x3, REG2; \ + lduwa [PHYS_PGD + REG2] ASI_PHYS_USE_EC, REG1; \ + brz,pn REG1, FAIL_LABEL; \ + sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + sllx REG1, 11, REG1; \ + andn REG2, 0x3, REG2; \ + lduwa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \ + brz,pn REG1, FAIL_LABEL; \ + sllx VADDR, 64 - PMD_SHIFT, REG2; \ + srlx REG2, 64 - PAGE_SHIFT, REG2; \ + sllx REG1, 11, REG1; \ + andn REG2, 0x7, REG2; \ + add REG1, REG2, REG1; + +/* Lookup a OBP mapping on VADDR in the prom_trans[] table at TL>0. + * If no entry is found, FAIL_LABEL will be branched to. On success + * the resulting PTE value will be left in REG1. VADDR is preserved + * by this routine. + */ +#define OBP_TRANS_LOOKUP(VADDR, REG1, REG2, REG3, FAIL_LABEL) \ + sethi %hi(prom_trans), REG1; \ + or REG1, %lo(prom_trans), REG1; \ +97: ldx [REG1 + 0x00], REG2; \ + brz,pn REG2, FAIL_LABEL; \ + nop; \ + ldx [REG1 + 0x08], REG3; \ + add REG2, REG3, REG3; \ + cmp REG2, VADDR; \ + bgu,pt %xcc, 98f; \ + cmp VADDR, REG3; \ + bgeu,pt %xcc, 98f; \ + ldx [REG1 + 0x10], REG3; \ + sub VADDR, REG2, REG2; \ + ba,pt %xcc, 99f; \ + add REG3, REG2, REG1; \ +98: ba,pt %xcc, 97b; \ + add REG1, (3 * 8), REG1; \ +99: + + /* We use a 32K TSB for the whole kernel, this allows to + * handle about 16MB of modules and vmalloc mappings without + * incurring many hash conflicts. + */ +#define KERNEL_TSB_SIZE_BYTES (32 * 1024) +#define KERNEL_TSB_NENTRIES \ + (KERNEL_TSB_SIZE_BYTES / 16) + + /* Do a kernel TSB lookup at tl>0 on VADDR+TAG, branch to OK_LABEL + * on TSB hit. REG1, REG2, REG3, and REG4 are used as temporaries + * and the found TTE will be left in REG1. REG3 and REG4 must + * be an even/odd pair of registers. + * + * VADDR and TAG will be preserved and not clobbered by this macro. + */ +#define KERN_TSB_LOOKUP_TL1(VADDR, TAG, REG1, REG2, REG3, REG4, OK_LABEL) \ + sethi %hi(swapper_tsb), REG1; \ + or REG1, %lo(swapper_tsb), REG1; \ + srlx VADDR, PAGE_SHIFT, REG2; \ + and REG2, (KERNEL_TSB_NENTRIES - 1), REG2; \ + sllx REG2, 4, REG2; \ + add REG1, REG2, REG2; \ + KTSB_LOAD_QUAD(REG2, REG3); \ + cmp REG3, TAG; \ + be,a,pt %xcc, OK_LABEL; \ + mov REG4, REG1; + +#endif /* !(_SPARC64_TSB_H) */ diff --git a/include/asm-sparc64/ttable.h b/include/asm-sparc64/ttable.h index 2784f80..f912f52 100644 --- a/include/asm-sparc64/ttable.h +++ b/include/asm-sparc64/ttable.h @@ -93,7 +93,7 @@ #define SYSCALL_TRAP(routine, systbl) \ sethi %hi(109f), %g7; \ - ba,pt %xcc, scetrap; \ + ba,pt %xcc, etrap; \ 109: or %g7, %lo(109b), %g7; \ sethi %hi(systbl), %l7; \ ba,pt %xcc, routine; \ @@ -109,14 +109,14 @@ nop;nop;nop; #define TRAP_UTRAP(handler,lvl) \ - ldx [%g6 + TI_UTRAPS], %g1; \ - sethi %hi(109f), %g7; \ - brz,pn %g1, utrap; \ - or %g7, %lo(109f), %g7; \ - ba,pt %xcc, utrap; \ -109: ldx [%g1 + handler*8], %g1; \ - ba,pt %xcc, utrap_ill; \ - mov lvl, %o1; + mov handler, %g3; \ + ba,pt %xcc, utrap_trap; \ + mov lvl, %g4; \ + nop; \ + nop; \ + nop; \ + nop; \ + nop; #ifdef CONFIG_SUNOS_EMUL #define SUNOS_SYSCALL_TRAP SYSCALL_TRAP(linux_sparc_syscall32, sunos_sys_table) @@ -136,8 +136,6 @@ #else #define SOLARIS_SYSCALL_TRAP TRAP(solaris_syscall) #endif -/* FIXME: Write these actually */ -#define NETBSD_SYSCALL_TRAP TRAP(netbsd_syscall) #define BREAKPOINT_TRAP TRAP(breakpoint_trap) #define TRAP_IRQ(routine, level) \ @@ -221,6 +219,31 @@ saved; retry; nop; nop; nop; nop; nop; nop; \ nop; nop; nop; nop; nop; nop; nop; nop; +#define SPILL_0_NORMAL_ETRAP \ +etrap_kernel_spill: \ + stx %l0, [%sp + STACK_BIAS + 0x00]; \ + stx %l1, [%sp + STACK_BIAS + 0x08]; \ + stx %l2, [%sp + STACK_BIAS + 0x10]; \ + stx %l3, [%sp + STACK_BIAS + 0x18]; \ + stx %l4, [%sp + STACK_BIAS + 0x20]; \ + stx %l5, [%sp + STACK_BIAS + 0x28]; \ + stx %l6, [%sp + STACK_BIAS + 0x30]; \ + stx %l7, [%sp + STACK_BIAS + 0x38]; \ + stx %i0, [%sp + STACK_BIAS + 0x40]; \ + stx %i1, [%sp + STACK_BIAS + 0x48]; \ + stx %i2, [%sp + STACK_BIAS + 0x50]; \ + stx %i3, [%sp + STACK_BIAS + 0x58]; \ + stx %i4, [%sp + STACK_BIAS + 0x60]; \ + stx %i5, [%sp + STACK_BIAS + 0x68]; \ + stx %i6, [%sp + STACK_BIAS + 0x70]; \ + stx %i7, [%sp + STACK_BIAS + 0x78]; \ + saved; \ + sub %g1, 2, %g1; \ + ba,pt %xcc, etrap_save; \ + wrpr %g1, %cwp; \ + nop; nop; nop; nop; nop; nop; nop; nop; \ + nop; nop; nop; nop; + /* Normal 64bit spill */ #define SPILL_1_GENERIC(ASI) \ add %sp, STACK_BIAS + 0x00, %g1; \ @@ -254,6 +277,67 @@ b,a,pt %xcc, spill_fixup_mna; \ b,a,pt %xcc, spill_fixup; +#define SPILL_1_GENERIC_ETRAP \ +etrap_user_spill_64bit: \ + stxa %l0, [%sp + STACK_BIAS + 0x00] %asi; \ + stxa %l1, [%sp + STACK_BIAS + 0x08] %asi; \ + stxa %l2, [%sp + STACK_BIAS + 0x10] %asi; \ + stxa %l3, [%sp + STACK_BIAS + 0x18] %asi; \ + stxa %l4, [%sp + STACK_BIAS + 0x20] %asi; \ + stxa %l5, [%sp + STACK_BIAS + 0x28] %asi; \ + stxa %l6, [%sp + STACK_BIAS + 0x30] %asi; \ + stxa %l7, [%sp + STACK_BIAS + 0x38] %asi; \ + stxa %i0, [%sp + STACK_BIAS + 0x40] %asi; \ + stxa %i1, [%sp + STACK_BIAS + 0x48] %asi; \ + stxa %i2, [%sp + STACK_BIAS + 0x50] %asi; \ + stxa %i3, [%sp + STACK_BIAS + 0x58] %asi; \ + stxa %i4, [%sp + STACK_BIAS + 0x60] %asi; \ + stxa %i5, [%sp + STACK_BIAS + 0x68] %asi; \ + stxa %i6, [%sp + STACK_BIAS + 0x70] %asi; \ + stxa %i7, [%sp + STACK_BIAS + 0x78] %asi; \ + saved; \ + sub %g1, 2, %g1; \ + ba,pt %xcc, etrap_save; \ + wrpr %g1, %cwp; \ + nop; nop; nop; nop; nop; \ + nop; nop; nop; nop; \ + ba,a,pt %xcc, etrap_spill_fixup_64bit; \ + ba,a,pt %xcc, etrap_spill_fixup_64bit; \ + ba,a,pt %xcc, etrap_spill_fixup_64bit; + +#define SPILL_1_GENERIC_ETRAP_FIXUP \ +etrap_spill_fixup_64bit: \ + ldub [%g6 + TI_WSAVED], %g1; \ + sll %g1, 3, %g3; \ + add %g6, %g3, %g3; \ + stx %sp, [%g3 + TI_RWIN_SPTRS]; \ + sll %g1, 7, %g3; \ + add %g6, %g3, %g3; \ + stx %l0, [%g3 + TI_REG_WINDOW + 0x00]; \ + stx %l1, [%g3 + TI_REG_WINDOW + 0x08]; \ + stx %l2, [%g3 + TI_REG_WINDOW + 0x10]; \ + stx %l3, [%g3 + TI_REG_WINDOW + 0x18]; \ + stx %l4, [%g3 + TI_REG_WINDOW + 0x20]; \ + stx %l5, [%g3 + TI_REG_WINDOW + 0x28]; \ + stx %l6, [%g3 + TI_REG_WINDOW + 0x30]; \ + stx %l7, [%g3 + TI_REG_WINDOW + 0x38]; \ + stx %i0, [%g3 + TI_REG_WINDOW + 0x40]; \ + stx %i1, [%g3 + TI_REG_WINDOW + 0x48]; \ + stx %i2, [%g3 + TI_REG_WINDOW + 0x50]; \ + stx %i3, [%g3 + TI_REG_WINDOW + 0x58]; \ + stx %i4, [%g3 + TI_REG_WINDOW + 0x60]; \ + stx %i5, [%g3 + TI_REG_WINDOW + 0x68]; \ + stx %i6, [%g3 + TI_REG_WINDOW + 0x70]; \ + stx %i7, [%g3 + TI_REG_WINDOW + 0x78]; \ + add %g1, 1, %g1; \ + stb %g1, [%g6 + TI_WSAVED]; \ + saved; \ + rdpr %cwp, %g1; \ + sub %g1, 2, %g1; \ + ba,pt %xcc, etrap_save; \ + wrpr %g1, %cwp; \ + nop; nop; nop + /* Normal 32bit spill */ #define SPILL_2_GENERIC(ASI) \ srl %sp, 0, %sp; \ @@ -287,6 +371,68 @@ b,a,pt %xcc, spill_fixup_mna; \ b,a,pt %xcc, spill_fixup; +#define SPILL_2_GENERIC_ETRAP \ +etrap_user_spill_32bit: \ + srl %sp, 0, %sp; \ + stwa %l0, [%sp + 0x00] %asi; \ + stwa %l1, [%sp + 0x04] %asi; \ + stwa %l2, [%sp + 0x08] %asi; \ + stwa %l3, [%sp + 0x0c] %asi; \ + stwa %l4, [%sp + 0x10] %asi; \ + stwa %l5, [%sp + 0x14] %asi; \ + stwa %l6, [%sp + 0x18] %asi; \ + stwa %l7, [%sp + 0x1c] %asi; \ + stwa %i0, [%sp + 0x20] %asi; \ + stwa %i1, [%sp + 0x24] %asi; \ + stwa %i2, [%sp + 0x28] %asi; \ + stwa %i3, [%sp + 0x2c] %asi; \ + stwa %i4, [%sp + 0x30] %asi; \ + stwa %i5, [%sp + 0x34] %asi; \ + stwa %i6, [%sp + 0x38] %asi; \ + stwa %i7, [%sp + 0x3c] %asi; \ + saved; \ + sub %g1, 2, %g1; \ + ba,pt %xcc, etrap_save; \ + wrpr %g1, %cwp; \ + nop; nop; nop; nop; \ + nop; nop; nop; nop; \ + ba,a,pt %xcc, etrap_spill_fixup_32bit; \ + ba,a,pt %xcc, etrap_spill_fixup_32bit; \ + ba,a,pt %xcc, etrap_spill_fixup_32bit; + +#define SPILL_2_GENERIC_ETRAP_FIXUP \ +etrap_spill_fixup_32bit: \ + ldub [%g6 + TI_WSAVED], %g1; \ + sll %g1, 3, %g3; \ + add %g6, %g3, %g3; \ + stx %sp, [%g3 + TI_RWIN_SPTRS]; \ + sll %g1, 7, %g3; \ + add %g6, %g3, %g3; \ + stw %l0, [%g3 + TI_REG_WINDOW + 0x00]; \ + stw %l1, [%g3 + TI_REG_WINDOW + 0x04]; \ + stw %l2, [%g3 + TI_REG_WINDOW + 0x08]; \ + stw %l3, [%g3 + TI_REG_WINDOW + 0x0c]; \ + stw %l4, [%g3 + TI_REG_WINDOW + 0x10]; \ + stw %l5, [%g3 + TI_REG_WINDOW + 0x14]; \ + stw %l6, [%g3 + TI_REG_WINDOW + 0x18]; \ + stw %l7, [%g3 + TI_REG_WINDOW + 0x1c]; \ + stw %i0, [%g3 + TI_REG_WINDOW + 0x20]; \ + stw %i1, [%g3 + TI_REG_WINDOW + 0x24]; \ + stw %i2, [%g3 + TI_REG_WINDOW + 0x28]; \ + stw %i3, [%g3 + TI_REG_WINDOW + 0x2c]; \ + stw %i4, [%g3 + TI_REG_WINDOW + 0x30]; \ + stw %i5, [%g3 + TI_REG_WINDOW + 0x34]; \ + stw %i6, [%g3 + TI_REG_WINDOW + 0x38]; \ + stw %i7, [%g3 + TI_REG_WINDOW + 0x3c]; \ + add %g1, 1, %g1; \ + stb %g1, [%g6 + TI_WSAVED]; \ + saved; \ + rdpr %cwp, %g1; \ + sub %g1, 2, %g1; \ + ba,pt %xcc, etrap_save; \ + wrpr %g1, %cwp; \ + nop; nop; nop + #define SPILL_1_NORMAL SPILL_1_GENERIC(ASI_AIUP) #define SPILL_2_NORMAL SPILL_2_GENERIC(ASI_AIUP) #define SPILL_3_NORMAL SPILL_0_NORMAL @@ -325,6 +471,35 @@ restored; retry; nop; nop; nop; nop; nop; nop; \ nop; nop; nop; nop; nop; nop; nop; nop; +#define FILL_0_NORMAL_RTRAP \ +kern_rtt_fill: \ + rdpr %cwp, %g1; \ + sub %g1, 1, %g1; \ + wrpr %g1, %cwp; \ + ldx [%sp + STACK_BIAS + 0x00], %l0; \ + ldx [%sp + STACK_BIAS + 0x08], %l1; \ + ldx [%sp + STACK_BIAS + 0x10], %l2; \ + ldx [%sp + STACK_BIAS + 0x18], %l3; \ + ldx [%sp + STACK_BIAS + 0x20], %l4; \ + ldx [%sp + STACK_BIAS + 0x28], %l5; \ + ldx [%sp + STACK_BIAS + 0x30], %l6; \ + ldx [%sp + STACK_BIAS + 0x38], %l7; \ + ldx [%sp + STACK_BIAS + 0x40], %i0; \ + ldx [%sp + STACK_BIAS + 0x48], %i1; \ + ldx [%sp + STACK_BIAS + 0x50], %i2; \ + ldx [%sp + STACK_BIAS + 0x58], %i3; \ + ldx [%sp + STACK_BIAS + 0x60], %i4; \ + ldx [%sp + STACK_BIAS + 0x68], %i5; \ + ldx [%sp + STACK_BIAS + 0x70], %i6; \ + ldx [%sp + STACK_BIAS + 0x78], %i7; \ + restored; \ + add %g1, 1, %g1; \ + ba,pt %xcc, kern_rtt_restore; \ + wrpr %g1, %cwp; \ + nop; nop; nop; nop; nop; \ + nop; nop; nop; nop; + + /* Normal 64bit fill */ #define FILL_1_GENERIC(ASI) \ add %sp, STACK_BIAS + 0x00, %g1; \ @@ -356,6 +531,33 @@ b,a,pt %xcc, fill_fixup_mna; \ b,a,pt %xcc, fill_fixup; +#define FILL_1_GENERIC_RTRAP \ +user_rtt_fill_64bit: \ + ldxa [%sp + STACK_BIAS + 0x00] %asi, %l0; \ + ldxa [%sp + STACK_BIAS + 0x08] %asi, %l1; \ + ldxa [%sp + STACK_BIAS + 0x10] %asi, %l2; \ + ldxa [%sp + STACK_BIAS + 0x18] %asi, %l3; \ + ldxa [%sp + STACK_BIAS + 0x20] %asi, %l4; \ + ldxa [%sp + STACK_BIAS + 0x28] %asi, %l5; \ + ldxa [%sp + STACK_BIAS + 0x30] %asi, %l6; \ + ldxa [%sp + STACK_BIAS + 0x38] %asi, %l7; \ + ldxa [%sp + STACK_BIAS + 0x40] %asi, %i0; \ + ldxa [%sp + STACK_BIAS + 0x48] %asi, %i1; \ + ldxa [%sp + STACK_BIAS + 0x50] %asi, %i2; \ + ldxa [%sp + STACK_BIAS + 0x58] %asi, %i3; \ + ldxa [%sp + STACK_BIAS + 0x60] %asi, %i4; \ + ldxa [%sp + STACK_BIAS + 0x68] %asi, %i5; \ + ldxa [%sp + STACK_BIAS + 0x70] %asi, %i6; \ + ldxa [%sp + STACK_BIAS + 0x78] %asi, %i7; \ + ba,pt %xcc, user_rtt_pre_restore; \ + restored; \ + nop; nop; nop; nop; nop; nop; \ + nop; nop; nop; nop; nop; \ + ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup; + + /* Normal 32bit fill */ #define FILL_2_GENERIC(ASI) \ srl %sp, 0, %sp; \ @@ -387,6 +589,34 @@ b,a,pt %xcc, fill_fixup_mna; \ b,a,pt %xcc, fill_fixup; +#define FILL_2_GENERIC_RTRAP \ +user_rtt_fill_32bit: \ + srl %sp, 0, %sp; \ + lduwa [%sp + 0x00] %asi, %l0; \ + lduwa [%sp + 0x04] %asi, %l1; \ + lduwa [%sp + 0x08] %asi, %l2; \ + lduwa [%sp + 0x0c] %asi, %l3; \ + lduwa [%sp + 0x10] %asi, %l4; \ + lduwa [%sp + 0x14] %asi, %l5; \ + lduwa [%sp + 0x18] %asi, %l6; \ + lduwa [%sp + 0x1c] %asi, %l7; \ + lduwa [%sp + 0x20] %asi, %i0; \ + lduwa [%sp + 0x24] %asi, %i1; \ + lduwa [%sp + 0x28] %asi, %i2; \ + lduwa [%sp + 0x2c] %asi, %i3; \ + lduwa [%sp + 0x30] %asi, %i4; \ + lduwa [%sp + 0x34] %asi, %i5; \ + lduwa [%sp + 0x38] %asi, %i6; \ + lduwa [%sp + 0x3c] %asi, %i7; \ + ba,pt %xcc, user_rtt_pre_restore; \ + restored; \ + nop; nop; nop; nop; nop; \ + nop; nop; nop; nop; nop; \ + ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup; + + #define FILL_1_NORMAL FILL_1_GENERIC(ASI_AIUP) #define FILL_2_NORMAL FILL_2_GENERIC(ASI_AIUP) #define FILL_3_NORMAL FILL_0_NORMAL