commit a205752d1ad2d37d6597aaae5a56fc396a770868
Merge: 39bc89f... e900a7d...
Author: Linus Torvalds
Date:   Fri Apr 27 10:47:29 2007 -0700

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
      selinux: preserve boolean values across policy reloads
      selinux: change numbering of boolean directory inodes in selinuxfs
      selinux: remove unused enumeration constant from selinuxfs
      selinux: explicitly number all selinuxfs inodes
      selinux: export initial SID contexts via selinuxfs
      selinux: remove userland security class and permission definitions
      SELinux: move security_skb_extlbl_sid() out of the security server
      MAINTAINERS: update selinux entry
      SELinux: rename selinux_netlabel.h to netlabel.h
      SELinux: extract the NetLabel SELinux support from the security server
      NetLabel: convert a BUG_ON in the CIPSO code to a runtime check
      NetLabel: cleanup and document CIPSO constants

commit 39bc89fd4019b164002adaacef92c4140e37955a
Author: Ingo Molnar
Date:   Wed Apr 25 20:50:03 2007 -0700

    make SysRq-T show all tasks again

    show_state() (SysRq-T) developed the buggy habit of not showing
    TASK_RUNNING tasks. This was due to the mistaken belief that
    state_filter == -1 would be a pass-through filter - while in reality
    it did not let TASK_RUNNING == 0 p->state values through.

    Fix this by restoring the original '!state_filter means all tasks'
    special-case I had in the original version. Test-built and
    test-booted on i686, SysRq-T now works as intended.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 20f09390b2da2432309afe8aaa0bd64ec64c4584
Author: Daniel Walker
Date:   Thu Apr 26 09:46:05 2007 -0700

    seqlocks: trivial remove weird whitespace

    Signed-off-by: Daniel Walker
    Signed-off-by: Linus Torvalds

commit b928ed56182b8ea59bd43f2d5b865f13a54d5719
Merge: ea6db58... d468a03...
Author: Linus Torvalds
Date:   Fri Apr 27 10:42:35 2007 -0700

    Merge branch 'for-linus' of git://git.infradead.org/ubi-2.6

    * 'for-linus' of git://git.infradead.org/ubi-2.6:
      UBI: remove unused variable
      UBI: add me to MAINTAINERS
      JFFS2: add UBI support
      UBI: Unsorted Block Images

commit ea6db58f3ea55f413c882095d2afaea8137f4f8c
Merge: c58b8e4... 8341897...
Author: Linus Torvalds
Date:   Fri Apr 27 10:29:56 2007 -0700

    Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

    * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (27 commits)
      ocfs2: Cache extent records
      ocfs2: Remember rw lock level during direct io
      ocfs2: Fix up i_blocks calculation to know about holes
      ocfs2: Fix extent lookup to return true size of holes
      ocfs2: Read from an unwritten extent returns zeros
      ocfs2: make room for unwritten extents flag
      ocfs2: Use own splice write actor
      ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate()
      [PATCH] Turn do_sync_file_range() into do_sync_mapping_range()
      ocfs2: zero tail of sparse files on truncate
      ocfs2: Teach ocfs2_get_block() about holes
      ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write()
      ocfs2: teach ocfs2_file_aio_write() about sparse files
      ocfs2: Turn off shared writeable mmap for local files systems with holes.
      ocfs2: abstract out allocation locking
      ocfs2: teach extend/truncate about sparse files
      ocfs2: temporarily remove extent map caching
      ocfs2: sparse b-tree support
      ocfs2: small cleanup of ocfs2_request_delete()
      ocfs2: remove unused code
      ...

commit c58b8e4a25a1ba347a0e5d21984c97bd296f1691
Merge: afc2e82... f50393f...
Author: Linus Torvalds
Date:   Fri Apr 27 10:14:53 2007 -0700

    Merge branch 'e1000-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

    * 'e1000-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
      e1000: FIX: Stop raw interrupts disabled nag from RT
      e1000: FIX: firmware handover bits
      e1000: FIX: be ready for incoming irq at pci_request_irq

commit afc2e82c0851317931a9bfdb98271253371825c6
Merge: 0278ef8... 1912ffb...
Author: Linus Torvalds
Date:   Fri Apr 27 09:39:27 2007 -0700

    Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

    * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits)
      IB: Set class_dev->dev in core for nice device symlink
      IB/ehca: Implement modify_port
      IB/umad: Clarify documentation of transaction ID
      IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
      IB/mad: Change SMI to use enums rather than magic return codes
      IB/umad: Implement GRH handling for sent/received MADs
      IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
      IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
      IB/ucm: Simplify ib_ucm_event()
      RDMA/ucma: Simplify ucma_get_event()
      IB/mthca: Simplify CQ cleaning in mthca_free_qp()
      IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
      IB/mthca: Update HCA firmware revisions
      IB/ipath: Fix WC format drift between user and kernel space
      IB/ipath: Check that a UD work request's address handle is valid
      IB/ipath: Remove duplicate stuff from ipath_verbs.h
      IB/ipath: Check reserved memory keys
      IB/ipath: Fix unit selection when all CPU affinity bits set
      IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
      IB/ipath: Disable IB link earlier in shutdown sequence
      ...

commit 0278ef8b484a71917bd4f03a763285cdaac10954
Merge: 15c5403... cd9ad58...
Author: Linus Torvalds
Date:   Fri Apr 27 09:29:04 2007 -0700

    Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

    * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (67 commits)
      [SCSI] SUNESP: Complete driver rewrite to version 2.0
      [SPARC64]: Convert PCI over to generic struct iommu/strbuf.
      [SPARC]: device_node name constification fallout
      [SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
      [SPARC64]: Add generic iommu and strbuf structs to iommu.h
      [SPARC64]: Consolidate {sbus,pci}_iommu_arena.
      [SPARC]: Make device_node name and type const
      [SPARC64]: constify some paramaters of OF routines
      [TIGON3]: of_get_property() returns const.
      [SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
      [SPARC64]: Document and fix calculation of pages_avail.
      [SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
      [SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn().
      [SPARC64]: Add proper header file extern for cmdline_memory_size.
      [SPARC64]: Kill sparc_ultra_dump_{i,d}tlb()
      [SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c
      [SPARC64]: Give move verbose show_mem() output just like i386.
      [SPARC64]: Mark show_mem() printk's with KERN_INFO.
      [SPARC64]: Kill kvaddr_to_phys() and friends.
      [SPARC64]: Privatize sun4u_get_pte() and fix name.
      ...

commit 15c54033964a943de7b0763efd3bd0ede7326395
Merge: ad5da3c... 912a41a...
Author: Linus Torvalds
Date:   Fri Apr 27 09:26:46 2007 -0700

    Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

    * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (448 commits)
      [IPV4] nl_fib_lookup: Initialise res.r before fib_res_put(&res)
      [IPV6]: Fix thinko in ipv6_rthdr_rcv() changes.
      [IPV4]: Add multipath cached to feature-removal-schedule.txt
      [WIRELESS] cfg80211: Clarify locking comment.
      [WIRELESS] cfg80211: Fix locking in wiphy_new.
      [WEXT] net_device: Don't include wext bits if not required.
      [WEXT]: Misc code cleanups.
      [WEXT]: Reduce inline abuse.
      [WEXT]: Move EXPORT_SYMBOL statements where they belong.
      [WEXT]: Cleanup early ioctl call path.
      [WEXT]: Remove options.
      [WEXT]: Remove dead debug code.
      [WEXT]: Clean up how wext is called.
      [WEXT]: Move to net/wireless
      [AFS]: Eliminate cmpxchg() usage in vlocation code.
      [RXRPC]: Fix pointers passed to bitops.
      [RXRPC]: Remove bogus atomic_* overrides.
      [AFS]: Fix u64 printing in debug logging.
      [AFS]: Add "directory write" support.
      [AFS]: Implement the CB.InitCallBackState3 operation.
      ...

commit ad5da3cf39a5b11a198929be1f2644e17ecd767e
Merge: da8ac5e... 14cf232...
Author: Linus Torvalds
Date:   Fri Apr 27 09:20:51 2007 -0700

    Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

    * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (22 commits)
      [MIPS] Don't force frame pointers for lockdep on MIPS
      [MIPS] update vr41xx Kconfig
      [MIPS] remove 2 select entries for VR41xx
      [MIPS] rename VR41XX to VR4100 series
      [MIPS] Use DEFINE_SPINLOCK instead of SPIN_LOCK_UNLOCKED.
      [MIPS] Replace old fashioned "__typeof" with "__typeof__".
      [MIPS] Remove unused _THREAD_SIZE_ORDER from asm-offset.c.
      [MIPS] Change PCI host bridge setup/resources
      [MIPS] Register PCI host bridge resource earlier
      [MIPS] Remove pnx8550-v2pci_defconfig
      [MIPS] Add bcm1480 ZBus trace support, fix wait related bugs
      [MIPS] Updated Sibyte headers
      [MIPS] Remove unused argument from kunmap_coherent().
      [MIPS] Malta: Delete unused prototype of mips_timer_interrupt.
      [MIPS] Select ZONE_DMA only if GENERIC_ISA_DMA selected
      [MIPS] MIPS Tech: Get rid of volatile in core code.
      [MIPS] IP22: Get rid of volatile in IP22 core code.
      [MIPS] JMR3927 cleanup
      [MIPS] merge GT64111 PCI routines and GT64120 PCI_0 routines
      [MIPS] Cobalt: Split PCI codes from setup.c
      ...

commit da8ac5e0fab11d0e84be4e49aaaa828c52d17097
Merge: 32f15dc... cb629a0...
Author: Linus Torvalds
Date:   Fri Apr 27 09:15:31 2007 -0700

    Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

    * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (38 commits)
      [S390] SPIN_LOCK_UNLOCKED cleanup in drivers/s390
      [S390] Clean up smp code in preparation for some larger changes.
      [S390] Remove debugging junk.
      [S390] Switch etr from tasklet to workqueue.
      [S390] split page_test_and_clear_dirty.
      [S390] Processor degradation notification.
      [S390] vtime: cleanup per_cpu usage.
      [S390] crypto: cleanup.
      [S390] sclp: fix coding style.
      [S390] vmlogrdr: stop IUCV connection in vmlogrdr_release.
      [S390] sclp: initialize early.
      [S390] ctc: kmalloc->kzalloc/casting cleanups.
      [S390] zfcpdump support.
      [S390] dasd: Add ipldev parameter.
      [S390] dasd: Add sysfs attribute status and generate uevents.
      [S390] Improved kernel stack overflow checking.
      [S390] Get rid of console setup functions.
      [S390] No execute support cleanup.
      [S390] Minor fault path optimization.
      [S390] Use generic bug.
      ...

commit 32f15dc5e6252f03aa2e04a2b140827a8297f21f
Merge: 07db59b... 8224ca1...
Author: Linus Torvalds
Date:   Fri Apr 27 09:14:46 2007 -0700

    Merge branch 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32

    * 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32: (21 commits)
      [AVR32] Fix compile error with gcc 4.1
      avr32: remove unneeded cast in atomic.h
      AVR32: Remove useless config option "GENERIC_BUST_SPINLOCK".
      [AVR32] Optimize the TLB miss handler
      [AVR32] Board code for ATNGW100
      [AVR32] Use memcpy/memset in memcpy_{from,to}_io and memset_io
      [AVR32] Get rid of board_setup_fbmem()
      [AVR32] Reserve framebuffer memory in early_parse_fbmem()
      [AVR32] Simplify early handling of memory regions
      [AVR32] Move setup_bootmem() from mm/init.c to kernel/setup.c
      [AVR32] Make I/O access macros work with external devices
      [AVR32] Fix NMI handler
      [AVR32] Clean up exception handling code
      [AVR32] Clean up cpu identification and add features bitmap
      [AVR32] Clean up asm/sysreg.h
      [AVR32] Don't enable clocks with no users
      [AVR32] Put cpu in sleep 0 when idle.
      [AVR32] Change system timer from count-compare to Timer/Counter 0
      [AVR32] Add mach-specific Kconfig
      [AVR32] Add nwait and tdf parameters to SMC configuration
      ...

commit 07db59bd6b0f279c31044cba6787344f63be87ea
Author: Linus Torvalds
Date:   Fri Apr 27 09:10:47 2007 -0700

    Change default dirty-writeback limits

    Do this really early in the 2.6.22-rc series, so that we'll get
    feedback. And don't change by half measures. Just cut the default
    dirty limit to a quarter of what it was, and see if anybody even
    notices.

    Signed-off-by: Linus Torvalds

commit 14cf232ab161ce87ca538af3daad5f717c20d487
Author: Franck Bui-Huu
Date:   Thu Apr 26 00:20:15 2007 -0700

    [MIPS] Don't force frame pointers for lockdep on MIPS

    Stacktrace support on MIPS doesn't use frame pointers. Since this
    option considerably increases the size of the kernel code, force
    lockdep to not use it.

    Signed-off-by: Franck Bui-Huu
    Signed-off-by: Ralf Baechle

commit c4be17370b76bc6c23a239ca512fe360785d2369
Author: Yoichi Yuasa
Date:   Thu Apr 26 19:53:59 2007 +0900

    [MIPS] update vr41xx Kconfig

    This patch has updated vr41xx/Kconfig.

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit 678f4e34a66eb364f4f8e7dc82746c3df14dc1e8
Author: Yoichi Yuasa
Date:   Thu Apr 26 19:51:31 2007 +0900

    [MIPS] remove 2 select entries for VR41xx

    This patch has removed 2 select entries for VR41xx.
    These entries are selected in arch/mips/vr41xx/Kconfig.

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit 74142d65b23b46587ea329202e957c901d9a57a1
Author: Yoichi Yuasa
Date:   Thu Apr 26 19:45:09 2007 +0900

    [MIPS] rename VR41XX to VR4100 series

    This patch has renamed VR41XX to VR4100 series. That's better.

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit 820c229f6c82bb91c0dbfbce90f7e6eb9639c7ab
Author: Milind Arun Choudhary
Date:   Thu Apr 19 15:05:16 2007 +0530

    [MIPS] Use DEFINE_SPINLOCK instead of SPIN_LOCK_UNLOCKED.

    Signed-off-by: Milind Arun Choudhary
    Signed-off-by: Ralf Baechle

commit a4c9bb7d22aa61ec585984f46df185669e557e3d
Author: Robert P. J. Day
Date:   Tue Apr 10 06:23:27 2007 -0400

    [MIPS] Replace old fashioned "__typeof" with "__typeof__".

    [Robert's original log message said this was a bug but it isn't,
    it's just very old fashioned syntax that is not (no longer?)
    documented in the gcc documentation. So for the sake of uniformity
    I'm applying his patch but with a modified log message. -- Ralf]

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Ralf Baechle

commit 05bc284a719b778243f51e23c88fe6cefe6b219b
Author: Ralf Baechle
Date:   Thu Apr 26 15:46:28 2007 +0100

    [MIPS] Remove unused _THREAD_SIZE_ORDER from asm-offset.c.

    Signed-off-by: Ralf Baechle

commit bea771751c116a690054581902b4144fe5a4520e
Author: Thomas Bogendoerfer
Date:   Sun Apr 8 13:34:57 2007 +0200

    [MIPS] Change PCI host bridge setup/resources

    PCI host bridge setup for SNI RM machines with PCI is quite broken,
    now that Linux does its resource setup on its own. It will use IO
    addresses, which are needed by the EISA config detection and assigns
    PCI memory addresses, which overlap with ISA legacy addresses (video
    ram). Below is a patch which changes how the PCI memory addresses
    are used and sets the minimum IO address to give enough IO space for
    8 EISA slots. This patch needs the other PCI resource change I've
    posted.

    Signed-off-by: Thomas Bogendoerfer
    Signed-off-by: Ralf Baechle

commit 639702bd725b3cc1a9bd442a7822c83849d66e91
Author: Thomas Bogendoerfer
Date:   Sun Apr 8 13:28:44 2007 +0200

    [MIPS] Register PCI host bridge resource earlier

    PCI based SNI RM machines have their EISA bus behind an Intel
    PCI/EISA bridge. So the PCI IO range must start at 0x0000. Changing
    that will break the PCI bus, because i8259.c already has registered
    its IO addresses before the PCI bus gets initialized.

    Below is a patch, which will register the PCI host bridge resources
    inside register_pci_controller(). It also changes i8259.c to use
    insert_region(), because request_resource() will fail, if the IO
    space of the PIT hanging off the PCI host bridge (maybe passing the
    resource parent to init_i8259_irqs() is a cleaner fix for that).

    Signed-off-by: Thomas Bogendoerfer
    Signed-off-by: Ralf Baechle

commit 3c5e370600c2dda8a4f59f841f323df04e6ce7b2
Author: Yoichi Yuasa
Date:   Wed Apr 4 16:40:52 2007 +0900

    [MIPS] Remove pnx8550-v2pci_defconfig

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit d619f38fdacb5cec0c841798bbadeaf903868852
Author: Mark Mason
Date:   Thu Mar 29 11:39:56 2007 -0700

    [MIPS] Add bcm1480 ZBus trace support, fix wait related bugs

    Make ZBus tracing generic - moving it to a common directory under
    arch/mips/sibyte, add bcm1480 support and fix some wait related bugs
    (thanks to Ralf for assistance on that).

    Signed-off-by: Mark Mason
    Signed-off-by: Ralf Baechle

commit 8deab1144b553548fb2f1b51affdd36dcd652aaa
Author: Mark Mason
Date:   Wed Mar 28 14:40:25 2007 -0700

    [MIPS] Updated Sibyte headers

    This is an update to the earlier patch for the sibyte headers, and
    supersedes the previous patch. Changes were necessary to get the
    tbprof driver working on the bcm1480.

    Patch to update Sibyte header files to match master versions
    maintained at Broadcom. This patch also corrects some whitespace
    problems, and (hopefully) shouldn't introduce any new ones.

    Signed-off-by: Mark Mason
    Signed-off-by: Ralf Baechle

commit eacb9d61919db56482dcea7ec943c9508175dc16
Author: Ralf Baechle
Date:   Thu Apr 26 15:46:25 2007 +0100

    [MIPS] Remove unused argument from kunmap_coherent().

    Signed-off-by: Ralf Baechle

commit e3cf10e93f42171224e88b06c226d450f3b3a55f
Author: Ralf Baechle
Date:   Thu Apr 26 15:46:25 2007 +0100

    [MIPS] Malta: Delete unused prototype of mips_timer_interrupt.

    Signed-off-by: Ralf Baechle

commit 05502339332564ffd545be9ca37b208296a2eaad
Author: Atsushi Nemoto
Date:   Wed Mar 21 00:36:02 2007 +0900

    [MIPS] Select ZONE_DMA only if GENERIC_ISA_DMA selected

    Signed-off-by: Atsushi Nemoto
    Signed-off-by: Ralf Baechle

commit f197465384bf7ef1af184c2ed1a4e268911a91e3
Author: Ralf Baechle
Date:   Thu Apr 26 15:46:24 2007 +0100

    [MIPS] MIPS Tech: Get rid of volatile in core code.

    Signed-off-by: Ralf Baechle

commit 78709b9df35346965b214e0e548412748d147776
Author: Ralf Baechle
Date:   Thu Apr 26 15:46:24 2007 +0100

    [MIPS] IP22: Get rid of volatile in IP22 core code.

    Signed-off-by: Ralf Baechle

commit 2127435e57a15f1fea8d6969e264eeb05b28ba4b
Author: Atsushi Nemoto
Date:   Thu Mar 15 00:58:28 2007 +0900

    [MIPS] JMR3927 cleanup

    * Kill dead codes
    * Rearrange irq chip handlers
    * Minimize defconfig

    Signed-off-by: Atsushi Nemoto
    Signed-off-by: Ralf Baechle

commit 252161eccd1a44f32a506d0fedb424d4ff84e4dc
Author: Yoichi Yuasa
Date:   Wed Mar 14 21:51:26 2007 +0900

    [MIPS] merge GT64111 PCI routines and GT64120 PCI_0 routines

    This patch has merged GT64111 PCI routines and GT64120 PCI_0
    routines. GT64111 PCI is almost the same as GT64120's PCI_0. This
    patch doesn't change GT64120 PCI routines.

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit 2a9effc67804102d6d5182eb0116520588ae2256
Author: Yoichi Yuasa
Date:   Mon Mar 5 19:10:03 2007 +0900

    [MIPS] Cobalt: Split PCI codes from setup.c

    It's removed #ifdef CONFIG_PCI/#endif from cobalt setup.c.

    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

commit cc50b67dcd84c6215232c0e1c95e24786e555782
Author: Yoichi Yuasa
Date:   Tue Mar 6 21:34:44 2007 +0900

    [MIPS] Cobalt: clean up include files

    Signed-off-by: Yoichi Yuasa

commit 7f5a7716dc0b380fd3c85ca5a5841969555feaa7
Author: Ralf Baechle
Date:   Wed Apr 25 15:08:57 2007 +0100

    [MIPS] Fix AP/SP to work in the reality of modern kernels.

    Signed-off-by: Ralf Baechle

commit cb629a01bb5bca951287e761c590a5686c6ca416
Author: Milind Arun Choudhary
Date:   Fri Apr 27 16:02:01 2007 +0200

    [S390] SPIN_LOCK_UNLOCKED cleanup in drivers/s390

    SPIN_LOCK_UNLOCKED cleanup, use __SPIN_LOCK_UNLOCKED instead.

    Signed-off-by: Milind Arun Choudhary
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Martin Schwidefsky

commit 39ce010d38bf6703b49f59eb73bef030b1d659f2
Author: Heiko Carstens
Date:   Fri Apr 27 16:02:00 2007 +0200

    [S390] Clean up smp code in preparation for some larger changes.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit 9ff6f4577e69801a43c0d58606a80040aecbc4bc
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:59 2007 +0200

    [S390] Remove debugging junk.

    arch/s390/appldata/appldata_base.c has some confusing debugging code
    left over to allow compiling it as a module. In practice, it cannot
    be configured as module and there is no need to keep that code.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

commit ecdcc0234b27472b561378ac59e2beeea06ec6ff
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:58 2007 +0200

    [S390] Switch etr from tasklet to workqueue.

    The clock synchronization of the ETR code requires an
    smp_call_function to synchronize all cpus. Calling smp_call_function
    from a tasklet is illegal. Replace the tasklet with a job on the
    global workqueue. ETR work is rare and can be postponed to be done
    by a kernel thread.

    Signed-off-by: Martin Schwidefsky

commit 6c210482ae4a9a5bb9377ad250feaacec3faa3cd
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:57 2007 +0200

    [S390] split page_test_and_clear_dirty.

    The page_test_and_clear_dirty primitive really consists of two
    operations, page_test_dirty and the page_clear_dirty. The
    combination of the two is not an atomic operation, so it makes more
    sense to have two separate operations instead of one. In addition to
    the improved readability of the s390 version of SetPageUptodate, it
    now avoids the page_test_dirty operation which is an
    insert-storage-key-extended (iske) instruction which is an expensive
    operation.

    Signed-off-by: Martin Schwidefsky

commit 2fc2d1e9ffcde78af7ab63ed640d9a4901797de2
Author: Heiko Carstens
Date:   Fri Apr 27 16:01:56 2007 +0200

    [S390] Processor degradation notification.

    Generate uevents for all cpus if cpu capability changes. This can
    happen e.g. because the cpus are overheating. The cpu capability can
    be read via /sys/devices/system/cpu/cpuN/capability.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit db77aa5f3d01fe6a6cc629dbd37936b1fdd129ba
Author: Jan Glauber
Date:   Fri Apr 27 16:01:55 2007 +0200

    [S390] vtime: cleanup per_cpu usage.

    Replace per_cpu(... , smp_processor_id()) with __get_cpu_var()

    Signed-off-by: Jan Glauber
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit 131a395c18af43d824841642038e5cc0d48f0bd2
Author: Jan Glauber
Date:   Fri Apr 27 16:01:54 2007 +0200

    [S390] crypto: cleanup.

    Cleanup code and remove obsolete documentation.

    Signed-off-by: Jan Glauber
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit 6d4740c89c187ee8f5ac7355c4eeffda26493d1f
Author: Stefan Haberland
Date:   Fri Apr 27 16:01:53 2007 +0200

    [S390] sclp: fix coding style.

    Use only capital letters for defines.

    Cc: Peter Oberparleiter
    Signed-off-by: Stefan Haberland
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit 66b494a7178cbd84d8fc0e5f1e92d81fb6ec9f6e
Author: Ursula Braun
Date:   Fri Apr 27 16:01:52 2007 +0200

    [S390] vmlogrdr: stop IUCV connection in vmlogrdr_release.

    Reopen of /dev/account failed. The IUCV path has to be terminated in
    vmlogrdr_release.

    Signed-off-by: Ursula Braun
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit b3d00c3b9278876b84a808bc513048b145fdef90
Author: Peter Oberparleiter
Date:   Fri Apr 27 16:01:51 2007 +0200

    [S390] sclp: initialize early.

    Add explicit sclp initialization for those sclp users that do not
    register with the interface.

    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit b7127dfeed3252a76aa31f016aac5fba53d99711
Author: Ahmed S. Darwish
Date:   Fri Apr 27 16:01:50 2007 +0200

    [S390] ctc: kmalloc->kzalloc/casting cleanups.

    A patch for the CTC / ESCON network driver. Switch from kmalloc to
    kzalloc when appropriate, remove some unnecessary kmalloc casts too.
    Since I have no s390 machine, I didn't compile it but I examined it
    carefully.

    Signed-off-by: "Ahmed S. Darwish"
    Cc: Frank Pavlic
    Cc: Ursula Braun
    Signed-off-by: Andrew Morton
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit 411ed3225733dbd83b4cbaaa992ef80d6ec1534e
Author: Michael Holzheu
Date:   Fri Apr 27 16:01:49 2007 +0200

    [S390] zfcpdump support.

    s390 machines provide hardware support for creating Linux dumps on
    SCSI disks. For creating a dump a special purpose dump Linux is
    used. The first 32 MB of memory are saved by the hardware before the
    dump Linux is booted. Via an SCLP interface, the saved memory can be
    accessed from Linux. This patch exports memory and registers of the
    crashed Linux to userspace via a debugfs file. For more information
    refer to Documentation/s390/zfcpdump.txt, which is included in this
    patch.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit 7039d3a11c4b4b59f9ef933b4b0a28304bdd07d1
Author: Peter Oberparleiter
Date:   Fri Apr 27 16:01:48 2007 +0200

    [S390] dasd: Add ipldev parameter.

    Specifying 'ipldev' in the dasd= kernel parameter will automatically
    activate the boot device for use by the dasd driver.

    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit 4dfd5c4593e69e9d399dd9e01d184dc534408f7e
Author: Horst Hummel
Date:   Fri Apr 27 16:01:47 2007 +0200

    [S390] dasd: Add sysfs attribute status and generate uevents.

    This patch adds a sysfs-attribute 'status' to make the DASD
    device-status accessible from user-space. In addition, the DASD
    driver generates an uevent(CHANGE) for the ccw-device on each
    device-status change. This enables user-space applications (e.g.
    udev) to do related processing.

    Signed-off-by: Horst Hummel
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit be7962856d299a0f231ac36f89f4a89cbecfe0ff
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:46 2007 +0200

    [S390] Improved kernel stack overflow checking.

    Recent cvs versions of gcc have support for an improved stack
    overflow checking that calculates the size of the guard size for
    each function. If the compiler accepts -mstack-size without
    -mstack-guard then the new stack check is available. We always want
    to use the new stack checker.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit 60691d3c2c0fe9ecc264741ff41f283fef579b8a
Author: Heiko Carstens
Date:   Fri Apr 27 16:01:45 2007 +0200

    [S390] Get rid of console setup functions.

    We get this:

    Section mismatch: reference to .init.text:con3270_consetup from .data
    between 'con3270' (at offset 0x45c8) and 'con3270_fn'
    Section mismatch: reference to .init.text:con3215_consetup from .data
    between 'con3215' (at offset 0x4678) and 'raw3215_ccw_driver'

    Since there is no difference between a non present console setup
    function and one that returns only 0 remove them.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit be5ec363e958982454ac9b3138b0e78c032e758d
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:44 2007 +0200

    [S390] No execute support cleanup.

    Simplify the signal_return function that checks for the two special
    system calls sigreturn and rt_sigreturn. No need to do a page table
    walk, a call to copy_from_user with page faults disabled will work
    as well.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit 10c1031f706bbe0690d84cdbccad15b11c6dc661
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:43 2007 +0200

    [S390] Minor fault path optimization.

    The minor fault path has grown a lot in terms of cycles. In
    particular the kprobes hook is very costly. Optimize the path to
    save a couple of cycles. If kprobes is enabled more than 300 cycles
    can be avoided if kprobes_running() is false.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit c0007f1a65762eaf55633d403b380130ec60adad
Author: Heiko Carstens
Date:   Fri Apr 27 16:01:42 2007 +0200

    [S390] Use generic bug.

    Generic bug implementation for s390. Will increase the value of the
    console output on BUG() statements since registers r0-r5,r14 will
    not be clobbered by a printk() call that was previously done before
    the illegal instruction of BUG() was hit. Also implements an
    architecture specific WARN_ON(). Output of that could be increased
    but requires common code change.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit bb11e3bdbac08f773a89f3ca287024a956ee8a12
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:41 2007 +0200

    [S390] Improved oops output.

    This patch adds two improvements to the oops output. First it adds
    an additional line after the PSW which decodes the different fields
    of it. Second a disassembler is added that decodes the instructions
    surrounding the faulting PSW. The output of a test oops now looks
    like this:

    kernel BUG at init/main.c:419
    illegal operation: 0001 [#1]
    CPU: 0 Not tainted
    Process swapper (pid: 0, task: 0000000000464968, ksp: 00000000004be000)
    Krnl PSW : 0700000180000000 00000000000120b6 (rest_init+0x36/0x38)
               R:0 T:1 IO:1 EX:1 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 EA:3
    Krnl GPRS: 0000000000000003 00000000004ba017 0000000000000022 0000000000000001
               000000000003a5f6 0000000000000000 00000000004be6a8 0000000000000000
               0000000000000000 00000000004b8200 0000000000003a50 0000000000008000
               0000000000516368 000000000033d008 00000000000120b2 00000000004bdee0
    Krnl Code: 00000000000120a6: e3e0f0980024   stg     %r14,152(%r15)
               00000000000120ac: c0e500014296   brasl   %r14,3a5d8
               00000000000120b2: a7f40001       brc     15,120b4
              >00000000000120b6: 0707           bcr     0,%r7
               00000000000120b8: eb7ff0500024   stmg    %r7,%r15,80(%r15)
               00000000000120be: c0d000195825   larl    %r13,33d108
               00000000000120c4: a7f13f00       tmll    %r15,16128
               00000000000120c8: a7840001       brc     8,120ca
    Call Trace:
    ([<00000000000120b2>] rest_init+0x32/0x38)
     [<00000000004be614>] start_kernel+0x37c/0x410
     [<0000000000012020>] _ehead+0x20/0x80

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit 03ff9a235a0602724fc54916469b6e0939c62c9b
Author: Martin Schwidefsky
Date:   Fri Apr 27 16:01:40 2007 +0200

    [S390] System call cleanup.

    Remove system call glue for sys_clone, sys_fork, sys_vfork,
    sys_execve, sys_sigreturn, sys_rt_sigreturn and sys_sigaltstack.
    Call do_execve from kernel_execve directly, move pt_regs to the
    right place and branch to sysc_return to start the user space
    program. This removes the last in-kernel system call.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

commit ef99516c9646802c3d38c3eb83de302e05b3c1b5
Author: Cornelia Huck
Date:   Fri Apr 27 16:01:39 2007 +0200

    [S390] cio: Unregister ccw devices directly.

    We used to unregister ccw devices not directly from the I/O
    subchannel remove function in order to avoid livelocks on the css
    bus semaphore. This semaphore is gone, and there is no reason to not
    unregister the ccw device directly (it is even better since it is
    more in keeping with the goal of immediate disconnect).

    Signed-off-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky

commit 8c4941c53b14e5a08ed2f270e9f087b410a9abcc
Author: Cornelia Huck
Date:   Fri Apr 27 16:01:38 2007 +0200

    [S390] cio: cm_enable memory leak.

    We allocate two pages when channel path measurements are enabled via
    cm_enable. We must not forget to free them again when channel path
    measurements are disabled again.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky

commit d76123eb357a4baa653714183df286c1bb99f707
Author: Cornelia Huck
Date:   Fri Apr 27 16:01:37 2007 +0200

    [S390] cio: ccwgroup register vs. unregister.

    Introduce a mutex for struct ccwgroup to prevent simultaneous
    register/unregister on the same ccwgroup device.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky

commit 82b7ac058f60e0c92f9237fbaf440671f437ecdf
Author: Cornelia Huck
Date:   Fri Apr 27 16:01:36 2007 +0200

    [S390] cio: Dont call css_update_ssd_info from interrupt context.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky

commit 7ad6a24970325294a22a08446d473384c15b928e
Author: Peter Oberparleiter
Date:   Fri Apr 27 16:01:35 2007 +0200

    [S390] cio: fix subchannel channel-path data usage

    Ensure that channel-path related subchannel data is only retrieved
    and used when it is valid and that it is updated when it may have
    changed.

    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Martin Schwidefsky

commit 83b3370c79b91b9be3f6540c3c914e689134b45f
Author: Peter Oberparleiter
Date:   Fri Apr 27 16:01:34 2007 +0200

    [S390] cio: replace subchannel evaluation queue with bitmap

    Use a bitmap for indicating which subchannels require evaluation
    instead of allocating memory for each evaluation request. This
    approach reduces memory consumption during recovery in case of
    massive evaluation request occurrence and removes the need for
    memory allocation failure handling.

    Cc: Heiko Carstens
    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Martin Schwidefsky

commit 387b734fc2b55f776b192c7afdfd892ba42347d4
Author: Stefan Bader
Date:   Fri Apr 27 16:01:33 2007 +0200

    [S390] cio: Re-start path verification after aborting internal I/O.

    Path verification triggered by changes to the available CHPIDs will
    be interrupted by another change but not re-started. This results in
    an invalid path mask. To solve this make sure to completely re-start
    path verification when changing the available paths.

    Signed-off-by: Stefan Bader
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit cfbe9bb2fb5de1da58d351432a9465c22d6d3ee5
Author: Cornelia Huck
Date:   Fri Apr 27 16:01:32 2007 +0200

    [S390] cio: Use add_uevent_var.

    Convert ccw_uevent to use add_uevent_var and adapt snprint_alias.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

commit e5854a5839fa426a7873f038080f63587de5f1f1
Author: Peter Oberparleiter
Date:   Fri Apr 27 16:01:31 2007 +0200

    [S390] cio: Channel-path configure function.
Add a new attribute to the channel-path sysfs directory through which channel-path configure operations can be triggered. Also listen for hardware events requesting channel-path configure operations and process them accordingly. Signed-off-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit f5ba6c863617c15d22cce5f8666ff4c832773025 Author: Cornelia Huck Date: Fri Apr 27 16:01:30 2007 +0200 [S390] cio: Clean up online_store. Detangle the online_store code and make it more readable. Signed-off-by: Cornelia Huck Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky commit c9182e0f42c5646e670c2166b6d6638052d574af Author: Peter Oberparleiter Date: Fri Apr 27 16:01:29 2007 +0200 [S390] cio: observe chpid valid flag Check validity flag of CHPID description data before continuing with channel-path initialization. Signed-off-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky commit e6b6e10ac1de116fc6d2288f185393014851cccf Author: Peter Oberparleiter Date: Fri Apr 27 16:01:28 2007 +0200 [S390] cio: Introduce separate files for channel-path related code. Signed-off-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit d120b2a4e60cc9e62e7cc5dcf049100af3745cc4 Author: Peter Oberparleiter Date: Fri Apr 27 16:01:27 2007 +0200 [S390] cio: Allow 0 and 1 as input for channel path status attribute. Channel path status can now be modified by writing '0' and '1' to the sysfs status attribute in addition to 'offline' and 'online' respectively. Signed-off-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit f86635fad14c4a6810cf0e08488fc9129a3b3b32 Author: Peter Oberparleiter Date: Fri Apr 27 16:01:26 2007 +0200 [S390] cio: Introduce struct chp_id. Introduce data type for channel-path IDs. 
Signed-off-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit 6fc321fd7dd91f0592f37503219196835314fbb7 Author: Heiko Carstens Date: Fri Apr 27 16:01:25 2007 +0200 [S390] cio/ipl: Clean interface between cio and ipl code. Clean interface between cio and ipl code, so Peter stops complaining. Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit 29c380f5f06d0c5a320b9bb6f8987065e7b81c91 Author: Heiko Carstens Date: Fri Apr 27 16:01:04 2007 +0200 [S390] memory detection: stop at first memory hole. If both sclp and diag memory detection don't work, stop at the first memory hole. Otherwise the code might loop forever... Signed-off-by: Martin Schwidefsky Signed-off-by: Heiko Carstens commit 8224ca195874525533665bbcd23b6da1e575aa4d Author: Haavard Skinnemoen Date: Fri Apr 27 14:21:47 2007 +0200 [AVR32] Fix compile error with gcc 4.1 gcc 4.1 doesn't seem to like const variables as inline assembly outputs. Drop support for reading 64-bit values using get_user() so that we can use an unsigned long to hold the result regardless of the actual size. This should be safe since many architectures, including i386, don't support reading 64-bit values with get_user(). Signed-off-by: Haavard Skinnemoen commit d468a030026017008286919aa6127b1190efb2c2 Author: Artem Bityutskiy Date: Fri Apr 27 15:11:44 2007 +0300 UBI: remove unused variable Signed-off-by: Artem Bityutskiy commit a4022b0d6005b117a985cec64559e048981a4244 Author: Mathieu Desnoyers Date: Tue Apr 10 18:23:09 2007 -0400 avr32: remove unneeded cast in atomic.h This int cast is superfluous since system.h cmpxchg already casts it in (typeof(*(ptr))). Signed-off-by: Mathieu Desnoyers Signed-off-by: Andrew Morton Signed-off-by: Haavard Skinnemoen commit 0277b378c3779e3c8a413afb7d4ee00fa24a5a26 Author: Robert P. J. Day Date: Thu Apr 26 08:53:38 2007 -0400 AVR32: Remove useless config option "GENERIC_BUST_SPINLOCK".
Remove the clearly useless config option GENERIC_BUST_SPINLOCK, which is not used anywhere in the tree. Signed-off-by: Robert P. J. Day Signed-off-by: Haavard Skinnemoen commit c0c3e81608fc300027f2131e351e67ab118cf24c Author: Haavard Skinnemoen Date: Wed Mar 14 13:59:13 2007 +0100 [AVR32] Optimize the TLB miss handler Reorder some instructions and change the register usage to reduce the number of pipeline stalls. Also use the bfextu and bfins instructions for bitfield manipulations instead of shifting and masking. This makes gzipping a 80MB file approximately 2% faster. Signed-off-by: Haavard Skinnemoen commit 9ca20a8366462c553c27216161c735937f9de108 Author: Haavard Skinnemoen Date: Thu Apr 12 17:26:57 2007 +0200 [AVR32] Board code for ATNGW100 Add board code and defconfig for the ATNGW100 Network Gateway kit. For more information about this board, see http://www.atmel.com/dyn/products/tools_card.asp?tool_id=4102 Signed-off-by: Haavard Skinnemoen commit 2c1a2a3441a754a9b5a8e7184071154f8a9bd61b Author: Haavard Skinnemoen Date: Wed Mar 7 10:40:44 2007 +0100 [AVR32] Use memcpy/memset in memcpy_{from,to}_io and memset_io Using readb/writeb to implement these breaks NOR flash support. I can't see any reason why regular memcpy and memset shouldn't work. Signed-off-by: Haavard Skinnemoen commit d80e2bb12606906fd0b5b5592f519852de8b0113 Author: Haavard Skinnemoen Date: Wed Mar 21 16:23:41 2007 +0100 [AVR32] Get rid of board_setup_fbmem() Since the core setup code takes care of both allocation and reservation of framebuffer memory, there's no need for this board- specific hook anymore. Replace it with two global variables, fbmem_start and fbmem_size, which can be used directly. 
Signed-off-by: Haavard Skinnemoen commit f9692b9501c339ec90647d8cd6ee5c106f072f9f Author: Haavard Skinnemoen Date: Wed Mar 21 16:16:50 2007 +0100 [AVR32] Reserve framebuffer memory in early_parse_fbmem() With the current strategy of using the bootmem allocator to allocate or reserve framebuffer memory, there's a slight chance that the requested area has been taken by the boot allocator bitmap before we get around to reserving it. By inserting the framebuffer region as a reserved region as early as possible, we improve our chances for success and we make the region visible as a reserved region in dmesg and /proc/iomem without any extra work. Signed-off-by: Haavard Skinnemoen commit d8011768e6bdd0d9de5cc7bdbd3077b4b4fab8c7 Author: Haavard Skinnemoen Date: Wed Mar 21 16:02:57 2007 +0100 [AVR32] Simplify early handling of memory regions Use struct resource to specify both physical memory regions and reserved regions and push everything into the same framework, including kernel code/data and initrd memory. This allows us to get rid of many special cases in the bootmem initialization and will also make it easier to implement more robust handling of framebuffer memory later. Signed-off-by: Haavard Skinnemoen commit 5539f59ac40473730806580f212c4eac6e769f01 Author: Haavard Skinnemoen Date: Wed Mar 21 15:39:18 2007 +0100 [AVR32] Move setup_bootmem() from mm/init.c to kernel/setup.c Signed-off-by: Haavard Skinnemoen commit e3e7d8d4ea37b8372ee417452d03171c5dc55125 Author: Haavard Skinnemoen Date: Mon Feb 12 16:28:56 2007 +0100 [AVR32] Make I/O access macros work with external devices Fix the I/O access macros so that they work with externally connected devices accessed in little-endian mode over any bus width: * Use a set of macros to define I/O port- and memory operations borrowed from MIPS. * Allow subarchitecture to specify address- and data-mangling * Implement at32ap-specific port mangling (with build-time configurable bus width. 
Only one bus width at a time supported for now.) * Rewrite iowriteN and friends to use write[bwl] and friends (not the __raw counterparts.) This has been tested using pata_pcmcia to access a CompactFlash card connected to the EBI (16-bit bus width.) Signed-off-by: Haavard Skinnemoen commit 92b728c147adb8c690b520304f4c9ee3eee43c21 Author: Haavard Skinnemoen Date: Tue Mar 13 10:06:37 2007 +0100 [AVR32] Fix NMI handler Fix a problem with the NMI handler entry code related to the NMI handler sharing some code with the exception handlers. This is not a good idea because the RSR and RAR registers are not the same, and the NMI handler runs with interrupts masked the whole time so there's no need to check for pending work. Open-code the low-level NMI handling logic instead so that the pt_regs layout is actually correct when the higher-level handler is called. Signed-off-by: Haavard Skinnemoen commit 623b0355d5b1f9c6d05005b649a2f3a7b9fd7816 Author: Haavard Skinnemoen Date: Tue Mar 13 17:59:11 2007 +0100 [AVR32] Clean up exception handling code * Use generic BUG() handling * Remove some useless debug statements * Use a common function _exception() to send signals or oops when an exception can't be handled. This makes sure init doesn't enter an infinite exception loop as well. Borrowed from powerpc. * Add some basic exception tracing support to the page fault code. * Rework dump_stack(), show_regs() and friends and move everything into process.c * Print information about configuration options and chip type when oopsing Signed-off-by: Haavard Skinnemoen commit 3b328c98093702c584692bffabd440800b383d73 Author: Haavard Skinnemoen Date: Tue Mar 13 15:30:38 2007 +0100 [AVR32] Clean up cpu identification and add features bitmap Clean up the cpu identification code, using definitions from instead of hardcoded constants. Also, add a features bitmap to struct avr32_cpuinfo to allow other code to make decisions based upon what the running cpu is actually capable of. 
Signed-off-by: Haavard Skinnemoen commit 535c806c26ef602d578792083df52b31803b961e Author: Haavard Skinnemoen Date: Tue Mar 13 14:17:14 2007 +0100 [AVR32] Clean up asm/sysreg.h Fix indentation and remove spurious comments in asm-avr32/sysreg.h Signed-off-by: Haavard Skinnemoen commit 188ff65d49dadf7b0e9b6718abc3fe98a5098711 Author: Haavard Skinnemoen Date: Wed Mar 14 13:23:44 2007 +0100 [AVR32] Don't enable clocks with no users Bring the code that sets the initial PM clock masks in line with the comment preceding it by only enabling clocks that have users != 0. Fix SM clock definition and avr32_hpt_init() so that the SM and TC0 clocks keep ticking. Signed-off-by: Haavard Skinnemoen commit 19b7ce8bad718a2850ea19aeb7383f1728596c24 Author: Hans-Christian Egtvedt Date: Mon Feb 26 13:50:43 2007 +0100 [AVR32] Put cpu in sleep 0 when idle. This patch puts the CPU in sleep 0 when it is idle. This will turn off the CPU clock and thus save power. The CPU is woken again when an interrupt occurs. Signed-off-by: Hans-Christian Egtvedt Signed-off-by: Haavard Skinnemoen commit 7760989e5e2900e484e9115e6e690c6ce0b0221c Author: Hans-Christian Egtvedt Date: Mon Mar 12 18:15:16 2007 +0100 [AVR32] Change system timer from count-compare to Timer/Counter 0 Due to a limitation of the count-compare system timer (it is not able to count when the CPU is in sleep), the system timer had to be changed to use a peripheral timer/counter. The old COUNT-COMPARE code is still present in time.c as weak functions. The new timer is added to the architecture directory. This patch sets up TC0 as the system timer. The new timer has been tested on AT32AP7000/ATSTK1000 at 100 Hz, 250 Hz, 300 Hz and 1000 Hz.
For more details about the timer/counter see the datasheet for AT32AP700x available at http://www.atmel.com/dyn/products/product_card.asp?part_id=3903 Signed-off-by: Hans-Christian Egtvedt Signed-off-by: Haavard Skinnemoen commit 228e845fd243bf42033998afab792357444e9e4a Author: Haavard Skinnemoen Date: Wed Mar 7 15:24:34 2007 +0100 [AVR32] Add mach-specific Kconfig Include at32ap-specific Kconfig file from top-level Kconfig file. The at32ap Kconfig is currently empty, but it will grow some machine- specific options soon. Signed-off-by: Haavard Skinnemoen commit 068d9f6eb9369a00eb45be91c07653cfef65f4a0 Author: Hans-Christian Egtvedt Date: Wed Jan 31 18:01:45 2007 +0100 [AVR32] Add nwait and tdf parameters to SMC configuration Complete the SMC configuration code by adding the nwait and tdf parameters. After this change, we support the same parameters as the hardware. Signed-off-by: Haavard Skinnemoen commit 485764016d5accb813e8bdd076802a7e3318bb64 Author: Artem Bityutskiy Date: Tue Feb 13 17:11:10 2007 +0200 UBI: add me to MAINTAINERS Signed-off-by: Artem Bityutskiy commit 0029da3bf430eea498eee8cef5933f9214534b8a Author: Artem Bityutskiy Date: Wed Oct 4 19:15:21 2006 +0300 JFFS2: add UBI support This patch makes JFFS2 able to work with UBI volumes via the emulated MTD devices which are directly mapped to these volumes. Signed-off-by: Artem Bityutskiy commit 801c135ce73d5df1caf3eca35b66a10824ae0707 Author: Artem B. Bityutskiy Date: Tue Jun 27 12:22:22 2006 +0400 UBI: Unsorted Block Images UBI (Latin: "where?") manages multiple logical volumes on a single flash device, specifically supporting NAND flash devices. UBI provides a flexible partitioning concept which still allows for wear-levelling across the whole flash device. In a sense, UBI may be compared to the Logical Volume Manager (LVM). Whereas LVM maps logical sector numbers to physical HDD sector numbers, UBI maps logical eraseblocks to physical eraseblocks.
More information may be found at http://www.linux-mtd.infradead.org/doc/ubi.html Partitioning/Re-partitioning An UBI volume occupies a certain number of erase blocks. This is limited by a configured maximum volume size, which could also be viewed as the partition size. Each individual UBI volume's size can be changed independently of the other UBI volumes, provided that the sum of all volume sizes doesn't exceed a certain limit. UBI supports dynamic volumes and static volumes. Static volumes are read-only and their contents are protected by CRC check sums. Bad eraseblocks handling UBI transparently handles bad eraseblocks. When a physical eraseblock becomes bad, it is substituted by a good physical eraseblock, and the user does not even notice this. Scrubbing On a NAND flash bit flips can occur on any write operation, sometimes also on read. If bit flips persist on the device, at first they can still be corrected by ECC, but once they accumulate, correction will become impossible. Thus it is best to actively scrub the affected eraseblock, by first copying it to a free eraseblock and then erasing the original. The UBI layer performs this type of scrubbing under the covers, transparently to the UBI volume users. Erase Counts UBI maintains an erase count header per eraseblock. This frees higher-level layers (like file systems) from doing this and allows for centralized erase count management instead. The erase counts are used by the wear-levelling algorithm in the UBI layer. The algorithm itself is exchangeable. Booting from NAND For booting directly from NAND flash the hardware must at least be capable of fetching and executing a small portion of the NAND flash. Some NAND flash controllers have this kind of support. They usually limit the window to a few kilobytes in erase block 0. This "initial program loader" (IPL) must then contain sufficient logic to load and execute the next boot phase. 
Due to bad eraseblocks, which may be randomly scattered over the flash device, it is problematic to store the "secondary program loader" (SPL) statically. Also, due to bit-flips it may become corrupted over time. UBI solves this problem gracefully by storing the SPL in a small static UBI volume. UBI volumes vs. static partitions UBI volumes are still very similar to static MTD partitions: * both consist of eraseblocks (logical eraseblocks in case of UBI volumes, and physical eraseblocks in case of static partitions); * both support three basic operations - read, write, erase. But UBI volumes have the following advantages over traditional static MTD partitions: * there are no eraseblock wear-leveling constraints in case of UBI volumes, so the user need not care about this; * there are no bit-flips and bad eraseblocks in case of UBI volumes. So, UBI volumes may be considered as flash devices with relaxed restrictions. Where can it be found? Documentation, kernel code and applications can be found in the MTD gits. What are the applications for? The applications help to create binary flash images for two purposes: pfi files (partial flash images) for in-system update of UBI volumes, and plain binary images, with or without OOB data in case of NAND, for a manufacturing step. Furthermore, some tools are being, and will be, created that allow flash content analysis after a system has crashed. Who did UBI? The original ideas on which UBI is based were developed by Andreas Arnez, Frank Haverkamp and Thomas Gleixner. Josh W. Boyer and some others were involved too. The implementation of the kernel layer was done by Artem B. Bityutskiy. The user-space applications and tools were written by Oliver Lohmann with contributions from Frank Haverkamp, Andreas Arnez, and Artem. Joern Engel contributed a patch which modifies JFFS2 so that it can be run on a UBI volume. Thomas Gleixner did modifications to the NAND layer.
Alexander Schmidt made some testing work as well as core functionality improvements. Signed-off-by: Artem B. Bityutskiy Signed-off-by: Frank Haverkamp commit 9c8f8e752431f3f7ed6ea6ea6e491ce12057f572 Author: Haavard Skinnemoen Date: Thu Feb 1 16:34:10 2007 +0100 [AVR32] Add basic HMATRIX support This adds register and clock definitions for the High-speed bus Matrix (HMATRIX) as well as a function that can be used to configure special EBI functionality like CompactFlash and NAND flash support. Signed-off-by: Haavard Skinnemoen commit 912a41a4ab935ce8c4308428ec13fc7f8b1f18f4 Author: Sergey Vlasov Date: Fri Apr 27 02:17:19 2007 -0700 [IPV4] nl_fib_lookup: Initialise res.r before fib_res_put(&res) When CONFIG_IP_MULTIPLE_TABLES is enabled, the code in nl_fib_lookup() needs to initialize the res.r field before fib_res_put(&res) - unlike fib_lookup(), a direct call to ->tb_lookup does not set this field. Signed-off-by: Sergey Vlasov Signed-off-by: David S. Miller commit ebbd90a730711280142017e482f27ec3fbb4f227 Author: YOSHIFUJI Hideaki Date: Fri Apr 27 02:13:39 2007 -0700 [IPV6]: Fix thinko in ipv6_rthdr_rcv() changes. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 49e9f70f8e7a4df00a5185e7f5c91e3c583847db Author: David S. Miller Date: Fri Apr 27 01:04:23 2007 -0700 [IPV4]: Add multipath cached to feature-removal-schedule.txt Signed-off-by: David S. Miller commit cd9ad58d4061494e7fdd70ded7bcf2418daf356a Author: David S. Miller Date: Thu Apr 26 21:19:23 2007 -0700 [SCSI] SUNESP: Complete driver rewrite to version 2.0 Major features: 1) Tagged queuing support. 2) Will properly negotiate for synchronous transfers even on devices that reject the wide negotiation message, such as CDROMs 3) Significantly lower kernel stack usage in interrupt handler path by elimination of function vector arrays, replaced by a top-level switch statement state machine. 4) Uses generic scsi infrastructure as much as possible to avoid code duplication. 
5) Automatic request of sense data in response to CHECK_CONDITION 6) Portable to other platforms using ESP such as DEC and Sun3 systems. Signed-off-by: David S. Miller commit 16ce82d846f2e6b652a064f91c5019cfe8682be4 Author: David S. Miller Date: Thu Apr 26 21:08:21 2007 -0700 [SPARC64]: Convert PCI over to generic struct iommu/strbuf. Signed-off-by: David S. Miller commit f16bfc1c0958ff340a02779ab139b03fb5ba6e82 Author: Johannes Berg Date: Thu Apr 26 20:51:12 2007 -0700 [WIRELESS] cfg80211: Clarify locking comment. This patch clarifies the comment about locking in wiphy_unregister. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit a4d73ee168eeaed3baea86542ad42e1fd7e192d3 Author: Johannes Berg Date: Thu Apr 26 20:50:35 2007 -0700 [WIRELESS] cfg80211: Fix locking in wiphy_new. This patch fixes the locking in wiphy new. Ingo Oeser noticed that locking in the error case was wrong and also suggested this fix. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit b86e0280bb5585a610783ff5392d9d439dee7ddd Author: Johannes Berg Date: Thu Apr 26 20:48:23 2007 -0700 [WEXT] net_device: Don't include wext bits if not required. This patch makes the wext bits in struct net_device depend on CONFIG_WIRELESS_EXT. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 4d44e0dfe961e02489d40d32334454ebe0e784e8 Author: Johannes Berg Date: Thu Apr 26 20:47:25 2007 -0700 [WEXT]: Misc code cleanups. Just a few things that didn't fit in with the other patches. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit bdf51894c1d7ce3fba8c8fdf485e85173ac60c6c Author: Johannes Berg Date: Thu Apr 26 20:46:55 2007 -0700 [WEXT]: Reduce inline abuse. This patch removes a bunch of inline abuse from wext. 
Most functions that were marked inline are only used once so the compiler will inline them anyway, others are used multiple times but there's no requirement for them to be inline since they aren't in any fast paths. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 7a9df167db0f200d5f8e393376dba8ceeae0fd53 Author: Johannes Berg Date: Thu Apr 26 20:46:23 2007 -0700 [WEXT]: Move EXPORT_SYMBOL statements where they belong. EXPORT_SYMBOL statements are supposed to go together with the symbol they're exporting. This patch moves them accordingly. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit dd8ceabcd10d47f6f28ecfaf2eac7beffca11b3c Author: Johannes Berg Date: Thu Apr 26 20:45:47 2007 -0700 [WEXT]: Cleanup early ioctl call path. This patch makes the code in wireless_process_ioctl somewhat more readable. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 4b1e255384570138c2a823904796d46f628e8350 Author: Johannes Berg Date: Thu Apr 26 20:45:14 2007 -0700 [WEXT]: Remove options. This patch kills the two options in wext that are required to be enabled anyway because they influence the userspace API. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 235c107ba08becb3ae6c3d3449c8b1053a5a9d75 Author: Johannes Berg Date: Thu Apr 26 20:44:35 2007 -0700 [WEXT]: Remove dead debug code. This patch kills a whole bunch of code that can only ever be used by defining some things in wext.c. Also, the things that are printed are mostly useless since the API is fairly well-tested. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 295f4a1fa3ecdf816b18393ef7bcd37c032df2fa Author: Johannes Berg Date: Thu Apr 26 20:43:56 2007 -0700 [WEXT]: Clean up how wext is called. This patch cleans up the call paths from the core code into wext. 
Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 11433ee450eb4a320f46ce5ed51410b52803ffcc Author: Johannes Berg Date: Thu Apr 26 20:42:51 2007 -0700 [WEXT]: Move to net/wireless This patch moves dev/core/wireless.c to net/wireless/wext.c. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 39bf09493042200b967cdf2ee6e3f670b7963903 Author: David S. Miller Date: Thu Apr 26 20:39:14 2007 -0700 [AFS]: Eliminate cmpxchg() usage in vlocation code. cmpxchg() is not available on every processor so can't be used in generic code. Replace with spinlock protection on the ->state changes, wakeups, and wait loops. Add what appears to be a missing wakeup on transition to AFS_VL_VALID state in afs_vlocation_updater(). Signed-off-by: David S. Miller commit 68c708fd5e90f6d178c84bb7e641589eb2842319 Author: David S. Miller Date: Thu Apr 26 20:20:21 2007 -0700 [RXRPC]: Fix pointers passed to bitops. CC [M] net/rxrpc/ar-input.o net/rxrpc/ar-input.c: In function ‘rxrpc_fast_process_data’: net/rxrpc/ar-input.c:171: warning: passing argument 2 of ‘__test_and_set_bit’ from incompatible pointer type net/rxrpc/ar-input.c:180: warning: passing argument 2 of ‘__clear_bit’ from incompatible pointer type net/rxrpc/ar-input.c:218: warning: passing argument 2 of ‘__clear_bit’ from incompatible pointer type Signed-off-by: David S. Miller commit 411faf5810cdd0e4f5071a3805d8adb49d120a07 Author: David S. Miller Date: Thu Apr 26 20:18:17 2007 -0700 [RXRPC]: Remove bogus atomic_* overrides. These are done with CPP defines which several platforms use for their atomic.h implementation, which floods the build with warnings and breaks the build. Signed-off-by: David S. Miller commit ba3e0e1accd8d5bb12eaeb0977429d8dc04f6d1e Author: David S. Miller Date: Thu Apr 26 16:06:22 2007 -0700 [AFS]: Fix u64 printing in debug logging. 
Need 'unsigned long long' casts to quiet warnings on 64-bit platforms when using %ll on a u64. Signed-off-by: David S. Miller commit 260a980317dac80182dd76140cf67c6e81d6d3dd Author: David Howells Date: Thu Apr 26 15:59:35 2007 -0700 [AFS]: Add "directory write" support. Add support for the create, link, symlink, unlink, mkdir, rmdir and rename VFS operations to the in-kernel AFS filesystem. Also: (1) Fix dentry and inode revalidation. d_revalidate should only look at state of the dentry. Revalidation of the contents of an inode pointed to by a dentry is now separate. (2) Fix afs_lookup() to hash negative dentries as well as positive ones. Signed-off-by: David Howells Signed-off-by: David S. Miller commit c35eccb1f614954b10cba3f74b7c301993b2f42e Author: David Howells Date: Thu Apr 26 15:58:49 2007 -0700 [AFS]: Implement the CB.InitCallBackState3 operation. Implement the CB.InitCallBackState3 operation for the fileserver to call. This reduces the amount of network traffic because if this op is aborted, the fileserver will then attempt a CB.InitCallBackState operation. Signed-off-by: David Howells Signed-off-by: David S. Miller commit b908fe6b2d1294d93b0d0badf6bf4f9a2cd7d729 Author: David Howells Date: Thu Apr 26 15:58:17 2007 -0700 [AFS]: Add support for the CB.GetCapabilities operation. Add support for the CB.GetCapabilities operation with which the fileserver can ask the client for the following information: (1) The list of network interfaces it has available as IPv4 address + netmask plus the MTUs. (2) The client's UUID. (3) The extended capabilities of the client, for which the only current one is unified error mapping (abort code interpretation). To support this, the patch adds the following routines to AFS: (1) A function to iterate through all the network interfaces using RTNETLINK to extract IPv4 addresses and MTUs.
(2) A function to iterate through all the network interfaces using RTNETLINK to pull out the MAC address of the lowest index interface to use in UUID construction. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 0795e7c031c4bda46fbdde678adf29de19bef7f4 Author: David Howells Date: Thu Apr 26 15:57:43 2007 -0700 [AFS]: Update the AFS fs documentation. Update the AFS fs documentation. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 00d3b7a4533e367b0dc2812a706db8f9f071c27f Author: David Howells Date: Thu Apr 26 15:57:07 2007 -0700 [AFS]: Add security support. Add security support to the AFS filesystem. Kerberos IV tickets are added as RxRPC keys to the session keyring with the klog program. open() and other VFS operations then find this ticket with request_key() and either use it immediately (eg: mkdir, unlink) or attach it to a file descriptor (open). Signed-off-by: David Howells Signed-off-by: David S. Miller commit 436058a49e0fb91c74454dbee9cfee6fb53b4336 Author: David Howells Date: Thu Apr 26 15:56:24 2007 -0700 [AFS]: Handle multiple mounts of an AFS superblock correctly. Handle multiple mounts of an AFS superblock correctly, checking to see whether the superblock is already initialised after calling sget() rather than just unconditionally stamping all over it. Also delete the "silent" parameter to afs_fill_super() as it's not used and can, in any case, be obtained from sb->s_flags. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 63b6be55e8b51cb718468794d343058e96c7462c Author: David Howells Date: Thu Apr 26 15:55:48 2007 -0700 [AF_RXRPC]: Delete the old RxRPC code. Delete the old RxRPC code as it's now no longer used. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 08e0e7c82eeadec6f4871a386b86bf0f0fbcb4eb Author: David Howells Date: Thu Apr 26 15:55:03 2007 -0700 [AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.
Make the in-kernel AFS filesystem use AF_RXRPC instead of the old RxRPC code. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 651350d10f93bed7003c9a66e24cf25e0f8eed3d Author: David Howells Date: Thu Apr 26 15:50:17 2007 -0700 [AF_RXRPC]: Add an interface to the AF_RXRPC module for the AFS filesystem to use Add an interface to the AF_RXRPC module so that the AFS filesystem module can more easily make use of the services available. AFS still opens a socket but then uses the action functions in lieu of sendmsg() and registers an intercept function to grab messages before they're queued on the socket Rx queue. This permits AFS (or whatever) to: (1) Avoid the overhead of using the recvmsg() call. (2) Use different keys directly on individual client calls on one socket rather than having to open a whole slew of sockets, one for each key it might want to use. (3) Avoid calling request_key() at the point of issue of a call or opening of a socket. This is done instead by AFS at the point of open(), unlink() or other VFS operation and the key handed through. (4) Request the use of something other than GFP_KERNEL to allocate memory. Furthermore: (*) The socket buffer markings used by RxRPC are made available for AFS so that it can interpret the cooked RxRPC messages itself. (*) rxgen (un)marshalling abort codes are made available. The following documentation for the kernel interface is added to Documentation/networking/rxrpc.txt: ========================= AF_RXRPC KERNEL INTERFACE ========================= The AF_RXRPC module also provides an interface for use by in-kernel utilities such as the AFS filesystem. This permits such a utility to: (1) Use different keys directly on individual client calls on one socket rather than having to open a whole slew of sockets, one for each key it might want to use. (2) Avoid having RxRPC call request_key() at the point of issue of a call or opening of a socket.
Instead the utility is responsible for requesting a key at the appropriate point. AFS, for instance, would do this during VFS operations such as open() or unlink(). The key is then handed through when the call is initiated. (3) Request the use of something other than GFP_KERNEL to allocate memory. (4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be intercepted before they get put into the socket Rx queue and the socket buffers manipulated directly. To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket, bind an address as appropriate and listen if it's to be a server socket, but then it passes this to the kernel interface functions. The kernel interface functions are as follows: (*) Begin a new client call. struct rxrpc_call * rxrpc_kernel_begin_call(struct socket *sock, struct sockaddr_rxrpc *srx, struct key *key, unsigned long user_call_ID, gfp_t gfp); This allocates the infrastructure to make a new RxRPC call and assigns call and connection numbers. The call will be made on the UDP port that the socket is bound to. The call will go to the destination address of a connected client socket unless an alternative is supplied (srx is non-NULL). If a key is supplied then this will be used to secure the call instead of the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls secured in this way will still share connections if at all possible. The user_call_ID is equivalent to that supplied to sendmsg() in the control data buffer. It is entirely feasible to use this to point to a kernel data structure. If this function is successful, an opaque reference to the RxRPC call is returned. The caller now holds a reference on this and it must be properly ended. (*) End a client call. void rxrpc_kernel_end_call(struct rxrpc_call *call); This is used to end a previously begun call. The user_call_ID is expunged from AF_RXRPC's knowledge and will not be seen again in association with the specified call.
(*) Send data through a call. int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg, size_t len); This is used to supply either the request part of a client call or the reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the data buffers to be used. msg_iov may not be NULL and must point exclusively to in-kernel virtual addresses. msg.msg_flags may be given MSG_MORE if there will be subsequent data sends for this call. The msg must not specify a destination address, control data or any flags other than MSG_MORE. len is the total amount of data to transmit. (*) Abort a call. void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code); This is used to abort a call if it's still in an abortable state. The abort code specified will be placed in the ABORT message sent. (*) Intercept received RxRPC messages. typedef void (*rxrpc_interceptor_t)(struct sock *sk, unsigned long user_call_ID, struct sk_buff *skb); void rxrpc_kernel_intercept_rx_messages(struct socket *sock, rxrpc_interceptor_t interceptor); This installs an interceptor function on the specified AF_RXRPC socket. All messages that would otherwise wind up in the socket's Rx queue are then diverted to this function. Note that care must be taken to process the messages in the right order to maintain DATA message sequentiality. The interceptor function itself is provided with the address of the socket handling the incoming message, the ID assigned by the kernel utility to the call and the socket buffer containing the message.
The skb->mark field indicates the type of message:

MARK                            MEANING
=============================== =======================================
RXRPC_SKB_MARK_DATA             Data message
RXRPC_SKB_MARK_FINAL_ACK        Final ACK received for an incoming call
RXRPC_SKB_MARK_BUSY             Client call rejected as server busy
RXRPC_SKB_MARK_REMOTE_ABORT     Call aborted by peer
RXRPC_SKB_MARK_NET_ERROR        Network error detected
RXRPC_SKB_MARK_LOCAL_ERROR      Local error encountered
RXRPC_SKB_MARK_NEW_CALL         New incoming call awaiting acceptance

The remote abort message can be probed with rxrpc_kernel_get_abort_code(). The two error messages can be probed with rxrpc_kernel_get_error_number(). A new call can be accepted with rxrpc_kernel_accept_call(). Data messages can have their contents extracted with the usual bunch of socket buffer manipulation functions. A data message can be determined to be the last one in a sequence with rxrpc_kernel_is_data_last(). When a data message has been used up, rxrpc_kernel_data_delivered() should be called on it. Non-data messages should be handed to rxrpc_kernel_free_skb() for disposal. It is possible to get extra refs on all types of message for later freeing, but this may pin the state of a call until the message is finally freed. (*) Accept an incoming call. struct rxrpc_call * rxrpc_kernel_accept_call(struct socket *sock, unsigned long user_call_ID); This is used to accept an incoming call and to assign it a call ID. This function is similar to rxrpc_kernel_begin_call() and calls accepted must be ended in the same way. If this function is successful, an opaque reference to the RxRPC call is returned. The caller now holds a reference on this and it must be properly ended. (*) Reject an incoming call. int rxrpc_kernel_reject_call(struct socket *sock); This is used to reject the first incoming call on the socket's queue with a BUSY message. -ENODATA is returned if there were no incoming calls.
Other errors may be returned if the call had been aborted (-ECONNABORTED) or had timed out (-ETIME). (*) Record the delivery of a data message and free it. void rxrpc_kernel_data_delivered(struct sk_buff *skb); This is used to record a data message as having been delivered and to update the ACK state for the call. The socket buffer will be freed. (*) Free a message. void rxrpc_kernel_free_skb(struct sk_buff *skb); This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC socket. (*) Determine if a data message is the last one on a call. bool rxrpc_kernel_is_data_last(struct sk_buff *skb); This is used to determine if a socket buffer holds the last data message to be received for a call (true will be returned if it does, false if not). The data message will be part of the reply on a client call and the request on an incoming call. In the latter case there will be more messages, but in the former case there will not. (*) Get the abort code from an abort message. u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb); This is used to extract the abort code from a remote abort message. (*) Get the error number from a local or network error message. int rxrpc_kernel_get_error_number(struct sk_buff *skb); This is used to extract the error number from a message indicating either a local error occurred or a network error occurred. Signed-off-by: David Howells Signed-off-by: David S. Miller commit ec26815ad847dbf74a1e27aa5515fb7d5dc6ee6f Author: David Howells Date: Thu Apr 26 15:49:28 2007 -0700 [AFS]: Clean up the AFS sources Clean up the AFS sources. Also remove references to AFS keys. RxRPC keys are used instead. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 17926a79320afa9b95df6b977b40cca6d8713cea Author: David Howells Date: Thu Apr 26 15:48:28 2007 -0700 [AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both Provide AF_RXRPC sockets that can be used to talk to AFS servers, or serve answers to AFS clients. 
KerberosIV security is fully supported. The patches and some example test programs can be found in: http://people.redhat.com/~dhowells/rxrpc/ This will eventually replace the old implementation of kernel-only RxRPC currently resident in net/rxrpc/. Signed-off-by: David Howells Signed-off-by: David S. Miller commit e19dff1fdd99a25819af74cf0710e147fff4fd3a Author: David Howells Date: Thu Apr 26 15:46:56 2007 -0700 [AF_RXRPC]: Make it possible to merely try to cancel timers from a module Export try_to_del_timer_sync() for use by the AF_RXRPC module. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 7318226ea2931a627f3572e5f4804c91ca19ecbc Author: David Howells Date: Thu Apr 26 15:46:23 2007 -0700 [AF_RXRPC]: Key facility changes for AF_RXRPC Export the keyring key type definition and document its availability. Add alternative types into the key's type_data union to make it more useful. Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for example), so make it clear that it can be used in other ways. Signed-off-by: David Howells Signed-off-by: David S. Miller commit 071b638689464c6b39407025eedd810d5b5e6f5d Author: Oleg Nesterov Date: Thu Apr 26 15:45:32 2007 -0700 [WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync() del_timer_sync() buys nothing for cancel_delayed_work(), but it is less efficient since it locks the timer unconditionally, and may wait for the completion of the delayed_work_timer_fn(). cancel_delayed_work() == 0 means: before this patch: work->func may still be running or queued after this patch: work->func may still be running or queued, or delayed_work_timer_fn->__queue_work() in progress. The latter doesn't differ from the caller's POV, delayed_work_timer_fn() is called with _PENDING bit set. 
cancel_delayed_work() == 1 with this patch adds a new possibility: delayed_work->work was cancelled, but delayed_work_timer_fn is still running (this is only possible for the re-arming works on single-threaded workqueue). In this case the timer was re-started by work->func(), nobody else can do this. This in turn means that delayed_work_timer_fn has already passed __queue_work() (and won't touch delayed_work) because nobody else can queue delayed_work->work. Signed-off-by: Oleg Nesterov Signed-Off-By: David Howells Signed-off-by: David S. Miller commit 83418978827324918a8cd25ce5227312de1d4468 Author: Mark Fasheh Date: Mon Apr 23 18:53:12 2007 -0700 ocfs2: Cache extent records The extent map code was ripped out earlier because of an inability to deal with holes. This patch adds back a simpler caching scheme requiring far less code. Our old extent map caching was designed back when meta data block caching in Ocfs2 didn't work very well, resulting in many disk reads. These days our metadata caching is much better, resulting in no unnecessary disk reads. As a result, extent caching doesn't have to be as fancy, nor does it have to cache as many extents. Keeping the last 3 extents seen should be sufficient to give us a small performance boost on some streaming workloads. Signed-off-by: Mark Fasheh commit 7cdfc3a1c3971c9125c317cb8c2525745851798e Author: Mark Fasheh Date: Mon Apr 16 17:28:51 2007 -0700 ocfs2: Remember rw lock level during direct io Cluster locking might have been redone because a direct write won't complete, so this needs to be reflected in the iocb. Signed-off-by: Mark Fasheh commit 8110b073a9135acf0a71bccfc20c0d1023f179c6 Author: Mark Fasheh Date: Thu Mar 22 16:53:23 2007 -0700 ocfs2: Fix up i_blocks calculation to know about holes Older file systems which didn't support holes did a dumb calculation of i_blocks based on i_size. This is no longer accurate, so fix things up to take actual allocation into account.
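Illustrative note on the i_blocks change: the difference between the two calculations can be sketched in plain C. This is only a sketch under stated assumptions, not the ocfs2 code: the helper names blocks_from_size() and blocks_from_clusters() are hypothetical, i_blocks is assumed to be counted in 512-byte units, and the old calculation is assumed to simply round i_size up to that unit.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the ocfs2 functions. */

/* Size-based estimate: round i_size up to 512-byte units.
 * This over-counts badly for a sparse file, since holes are included. */
uint64_t blocks_from_size(uint64_t i_size)
{
    return (i_size + 511) >> 9;
}

/* Allocation-based count: only clusters actually allocated contribute.
 * cluster_bits is log2 of the cluster size in bytes. */
uint64_t blocks_from_clusters(uint32_t i_clusters, unsigned int cluster_bits)
{
    return ((uint64_t)i_clusters << cluster_bits) >> 9;
}
```

For a 1 GiB file with a single 4 KiB cluster allocated, the size-based estimate reports 2097152 blocks while the allocation-based count reports 8.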
Signed-off-by: Mark Fasheh commit 4f902c37727bbedbc0508a1477874c58ddcc9af8 Author: Mark Fasheh Date: Fri Mar 9 16:26:50 2007 -0800 ocfs2: Fix extent lookup to return true size of holes Initially, we had wired things to return a size '1' of holes. Cook up a small amount of code to find the next extent and calculate the number of clusters between the virtual offset and the next allocated extent. Signed-off-by: Mark Fasheh commit 49cb8d2d496ce06869ccca2ab368ed6b0b5b979d Author: Mark Fasheh Date: Fri Mar 9 16:21:46 2007 -0800 ocfs2: Read from an unwritten extent returns zeros Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by: Mark Fasheh commit e48edee2d8eab812f31f0ff62c6ba635ca2e1e21 Author: Mark Fasheh Date: Wed Mar 7 16:46:57 2007 -0800 ocfs2: make room for unwritten extents flag Due to the size of our group bitmaps, we'll never have a leaf node extent record with more than 16 bits worth of clusters. Split e_clusters up so that leaf nodes can get a flags field where we can mark unwritten extents. Interior nodes whose length references all the child nodes beneath it can't split their e_clusters field, so we use a union to preserve sizing there. Signed-off-by: Mark Fasheh commit 6af67d8205cf65fbaaa743edc7ebb46e486e34ff Author: Mark Fasheh Date: Tue Mar 6 17:24:46 2007 -0800 ocfs2: Use own splice write actor We need to fill holes during a splice write. Provide our own splice write actor which can call ocfs2_file_buffered_write() with a splice-specific callback. Signed-off-by: Mark Fasheh commit fa41045fcbf78269991d5aebb1820fc51534f05d Author: Mark Fasheh Date: Thu Mar 1 11:22:19 2007 -0800 ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate() Do this instead of filemap_fdatawrite() - this way we sync only the range between i_size and the cluster boundary. 
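Illustrative note on the range being synced: the span "between i_size and the cluster boundary" can be computed as below. This is a minimal sketch, assuming power-of-two cluster sizes; struct byte_range and tail_range() are hypothetical names introduced here and are not part of ocfs2 or the VFS.

```c
#include <stdint.h>

/* Hypothetical half-open byte range [start, end). */
struct byte_range {
    uint64_t start;
    uint64_t end;
};

/* Return the range from i_size up to the end of the cluster containing
 * the last byte. cluster_bits is log2 of the cluster size in bytes.
 * If i_size is already cluster-aligned the range is empty. */
struct byte_range tail_range(uint64_t i_size, unsigned int cluster_bits)
{
    uint64_t cluster_size = 1ULL << cluster_bits;
    struct byte_range r;

    r.start = i_size;
    r.end = (i_size + cluster_size - 1) & ~(cluster_size - 1);
    return r;
}
```

With 4 KiB clusters and i_size of 5000, the range is [5000, 8192); syncing only that span avoids writing back the whole mapping.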
Signed-off-by: Mark Fasheh commit 5b04aa3a64f854244bc40a6f528176ed50b5c4f6 Author: Mark Fasheh Date: Thu Mar 1 11:01:55 2007 -0800 [PATCH] Turn do_sync_file_range() into do_sync_mapping_range() do_sync_file_range() accepts a file * from which it takes an address_space to sync. Abstract out the bulk of the function into do_sync_mapping_range() which takes the address_space directly. This way callers who want to sync an address_space directly can take advantage of the functionality provided. do_sync_file_range() is preserved as a small wrapper around do_sync_mapping_range(). Ocfs2 in particular would like to use this to initiate a sync of a specific inode range during truncate, where a file * may not be available. Signed-off-by: Mark Fasheh Cc: Christoph Hellwig Signed-off-by: Andrew Morton commit 60b11392f1a09433740bda3048202213daa27736 Author: Mark Fasheh Date: Fri Feb 16 11:46:50 2007 -0800 ocfs2: zero tail of sparse files on truncate Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and the end of its cluster. Otherwise a subsequent extend could expose bad data. This introduced a new helper, which can be used in ocfs2_write(). Signed-off-by: Mark Fasheh commit 25baf2da1473d9dcde1a4c7b0ab26e7d67d9bf62 Author: Mark Fasheh Date: Wed Feb 14 15:30:30 2007 -0800 ocfs2: Teach ocfs2_get_block() about holes ocfs2_get_block() didn't understand sparse files, fix that. Also remove some code that isn't really useful anymore. We can fix up ocfs2_direct_IO_get_blocks() at the same time. Signed-off-by: Mark Fasheh commit 5069120b7227fd323152a3755a0aa6bdeb361310 Author: Mark Fasheh Date: Fri Feb 9 20:52:53 2007 -0800 ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write() These are no longer used, and can't handle file systems with sparse file allocation.
Signed-off-by: Mark Fasheh commit 9517bac6cc7a7aa4fee63cb38a32cb6014e264c7 Author: Mark Fasheh Date: Fri Feb 9 20:24:12 2007 -0800 ocfs2: teach ocfs2_file_aio_write() about sparse files Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size. Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks. Signed-off-by: Mark Fasheh commit 89488984ac23b0580f959b9ee549f2fcb1c2f194 Author: Mark Fasheh Date: Wed Jan 17 13:10:55 2007 -0800 ocfs2: Turn off shared writeable mmap for local files systems with holes. This will be turned back on once we can do allocation in ->page_mkwrite(). Signed-off-by: Mark Fasheh commit abf8b1569415bb4a8915a4884943ecd39c510957 Author: Mark Fasheh Date: Wed Jan 17 13:07:24 2007 -0800 ocfs2: abstract out allocation locking Right now, file allocation for ocfs2 is done within ocfs2_extend_file(), which is either called from ->setattr() (for an i_size change), or at the top of ocfs2_file_aio_write(). Inodes on file systems with sparse file support will want to do their allocation during the actual write call. In either case the cluster locking decisions are the same. We abstract out that code into a new function, ocfs2_lock_allocators() which will be used by a later patch to enable writing to sparse files. This also provides a nice cleanup of ocfs2_extend_allocation(). Signed-off-by: Mark Fasheh commit 3a0782d09c07aa3ec767ba6089cd15cfbfbfc508 Author: Mark Fasheh Date: Wed Jan 17 12:53:31 2007 -0800 ocfs2: teach extend/truncate about sparse files For ocfs2_truncate_file(), we eliminate the "simple" truncate case which no longer exists since i_size is not tied to i_clusters. In ocfs2_extend_file(), we skip the allocation / page zeroing code for file systems which understand sparse files. 
The core truncate code is changed to do a bottom up tree traversal. This gets abstracted out into its own function. To make things more readable, most of the special case handling for in-inode extents from ocfs2_do_truncate() is also removed. Though write support for sparse files comes in a later patch, we at least update ocfs2_prepare_inode_for_write() to skip allocation for sparse files. Signed-off-by: Mark Fasheh commit 363041a5f74b953ab6b705ac9c88e5eda218a24b Author: Mark Fasheh Date: Wed Jan 17 12:31:35 2007 -0800 ocfs2: temporarily remove extent map caching The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: Mark Fasheh commit dcd0538ff4e854fa9d7f4630b359ca8fdb5cb5a8 Author: Mark Fasheh Date: Tue Jan 16 11:32:23 2007 -0800 ocfs2: sparse b-tree support Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so that it can later be re-used to implement large extended attributes. This patch only adds the rotation code and does minimal updates to callers of the extent api. Signed-off-by: Mark Fasheh commit 6f16bf655c5795586dd2ac96a7c70e0b9a378746 Author: Mark Fasheh Date: Tue Mar 20 17:17:54 2007 -0700 ocfs2: small cleanup of ocfs2_request_delete() There are two checks in there (one for inode newness, one for other mounted nodes) which are unnecessary, so remove them. The DLM will allow the trylock in either case without any messaging overhead. Removing these makes ocfs2_request_delete() a one liner function, so just move the trylock out one level into ocfs2_query_inode_wipe().
Signed-off-by: Mark Fasheh commit 68e2b740c4b5394680cfefccddbdb486c5866a4c Author: Tiger Yang Date: Tue Mar 20 16:42:10 2007 -0700 ocfs2: remove unused code Remove node messaging code that becomes unused with the delete inode vote removal. [Removed even more cruft which I spotted during review --Mark] Signed-off-by: Tiger Yang Signed-off-by: Mark Fasheh commit 500086300e6dc5308a7328990bd50d17e075162b Author: Tiger Yang Date: Tue Mar 20 16:01:38 2007 -0700 ocfs2: Remove delete inode vote Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by: Tiger Yang Signed-off-by: Mark Fasheh commit 566ec03448052c096dc3982fbe573522dc0ba479 Author: Jamal Hadi Salim Date: Thu Apr 26 14:12:15 2007 -0700 [XFRM]: Missing bits to SAD info. This brings the SAD info in sync with net-2.6.22/net-2.6 Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller commit a9f5f70739363ccca2e771c274c4f015c5fb7a88 Author: Mark Fasheh Date: Thu Apr 26 11:43:43 2007 -0700 ocfs2: filter more error prints We don't want to print anything at all in ocfs2_lookup() when getting an error from ocfs2_iget() - it could be something as innocuous as a signal being detected in the dlm. ocfs2_permission() should filter on -ENOENT which ocfs2_meta_lock() can return if the inode was deleted on another node. Signed-off-by: Mark Fasheh commit bebe6f120b036349f7212205eeaf8248d4820c4b Author: Sunil Mushran Date: Tue Apr 17 13:53:38 2007 -0700 ocfs2: Replace panic() with emergency_restart() when fencing We have noticed panic() hanging leading us to a situation in which the node, while otherwise dead, is still disk heartbeating. 
This leads to a hung cluster as the other nodes are waiting for this node to stop disk heartbeating. This situation is only resolved by power resetting the box. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 5d262cc7dd3d47784f8233ad4ec2cc5a08059b71 Author: Sunil Mushran Date: Tue Apr 17 13:49:19 2007 -0700 ocfs2: Silence compiler warnings Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit be9e986b824b41c9d5cc5eca34ee3424c35fd162 Author: Mark Fasheh Date: Wed Apr 18 15:22:08 2007 -0700 ocfs2: Local mounts should skip inode updates We don't want the extent map and uptodate cache destruction in ocfs2_meta_lock_update() on a local mount, so skip that. This fixes several bugs with uptodate being cleared on buffers and extent maps being corrupted. Signed-off-by: Mark Fasheh commit 0d01af6e5dd6bc7abbcb6331021f8fee18005540 Author: Sunil Mushran Date: Tue Apr 17 13:32:20 2007 -0700 ocfs2_dlm: Call cond_resched_lock() once per hash bucket scan In dlm_migrate_all_locks(), we currently call cond_resched_lock() after processing each lockres in a hash bucket. Move it outside the loop so as to call it only after the entire hash bucket has been processed. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 756a1501ddbbe73098aa031939460930f6edc9cd Author: Srinivas Eeda Date: Tue Apr 17 13:26:33 2007 -0700 ocfs2_dlm: fix race in dlm_remaster_locks There is a possibility that dlm_remaster_locks could overwrite node->state with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock spinlock. Signed-off-by: Srinivas Eeda Signed-off-by: Mark Fasheh commit ee5ac9ddf2ea13be2418ac7d0ce5a930e78af013 Author: Stephen Rothwell Date: Thu Apr 26 00:03:53 2007 -0700 [SPARC]: device_node name constification fallout A couple of routines need their arguments to be const.
Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 3e4d26508af6d03034a97583c895f33bef671d06 Author: David S. Miller Date: Wed Apr 25 15:58:22 2007 -0700 [SPARC64]: Convert SBUS over to generic iommu/strbuf structs. Signed-off-by: David S. Miller commit 66875088098f314af1a4d9e0cc47e617d643bffd Author: David S. Miller Date: Wed Apr 25 00:12:09 2007 -0700 [SPARC64]: Add generic iommu and strbuf structs to iommu.h Signed-off-by: David S. Miller commit 9b3627f389c07c5be9c86ac4d472a0d4fd47feac Author: David S. Miller Date: Tue Apr 24 23:51:18 2007 -0700 [SPARC64]: Consolidate {sbus,pci}_iommu_arena. Move to asm-sparc64/iommu.h and rename to plain "iommu_arena". Signed-off-by: David S. Miller commit 711b360d64418e88ed45f812e0ebd202073d888d Author: Stephen Rothwell Date: Thu Apr 12 14:38:34 2007 -0700 [SPARC]: Make device_node name and type const Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 3dfe10ee7caae9802d84a06fe7724274dea24020 Author: Stephen Rothwell Date: Thu Mar 29 11:22:57 2007 -0700 [SPARC64]: constify some paramaters of OF routines This starts bringing the PowerPC and Sparc64 implementations back closer together. Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 374d4cac6283469f101282ca83ee008368bd8350 Author: David S. Miller Date: Thu Mar 29 01:57:57 2007 -0700 [TIGON3]: of_get_property() returns const. Signed-off-by: David S. Miller commit a165b4205e0097c7544ec3c59522a3b20ec14eb1 Author: David S. Miller Date: Thu Mar 29 01:50:16 2007 -0700 [SPARC64]: Fix PCI rework to adhere to of_get_property() const return. Signed-off-by: David S. Miller commit f1cfdb55f16596752e8a61a8570a90ee26af183a Author: David S. Miller Date: Thu Mar 15 22:52:18 2007 -0700 [SPARC64]: Document and fix calculation of pages_avail. It should be set to the total number of pages that the system will really have available after things like initmem, the bootmem map, and initrd are freed up. Signed-off-by: David S.
Miller commit 0f3e25049e0a54916d0991c1eaa5f8df926c7f92 Author: David S. Miller Date: Thu Mar 15 21:44:03 2007 -0700 [SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c It needs to be ready before we invoke pci_determine_mem_io_space(). Signed-off-by: David S. Miller commit 3996465392fd1632b671707d16bbc96a9481cfe2 Author: David S. Miller Date: Thu Mar 15 19:36:53 2007 -0700 [SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn(). Signed-off-by: David S. Miller commit b93f2620231d4641bdbaaa952d3e8890687124bb Author: David S. Miller Date: Thu Mar 15 18:29:13 2007 -0700 [SPARC64]: Add proper header file extern for cmdline_memory_size. Signed-off-by: David S. Miller commit 9753f0d6502acd65761ff15244d26d0e88f0820a Author: David S. Miller Date: Thu Mar 15 18:26:00 2007 -0700 [SPARC64]: Kill sparc_ultra_dump_{i,d}tlb() While useful in odd circumstances to debug something, they are normally totally unused and anyone can fetch this code out of the history if they really need it. And in any event, the person who needs this kind of code is usually me :-) Signed-off-by: David S. Miller commit 85f1e1f66011e67e68065f2db4cde499decb9c84 Author: David S. Miller Date: Thu Mar 15 17:51:26 2007 -0700 [SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c Signed-off-by: David S. Miller commit 5be4a963675d3270fab7f55e8c4a2e56afd408f6 Author: David S. Miller Date: Thu Mar 15 16:00:29 2007 -0700 [SPARC64]: Give move verbose show_mem() output just like i386. We now report everything i386 does except for highmem which doesn't apply. Signed-off-by: David S. Miller commit 28256ca2e04c72eee1e83524d7f78ce5646030e2 Author: David S. Miller Date: Thu Mar 15 15:56:07 2007 -0700 [SPARC64]: Mark show_mem() printk's with KERN_INFO. Signed-off-by: David S. Miller commit a94aa2530643f02a4b243f81b5f6354b9b958d7e Author: David S. Miller Date: Thu Mar 15 15:50:11 2007 -0700 [SPARC64]: Kill kvaddr_to_phys() and friends. 
Just inline it into flush_icache_range() which is the only user. Signed-off-by: David S. Miller commit 4be5c34dc47b5a9e6f91c8f5937a93c464870b8e Author: David S. Miller Date: Thu Mar 15 15:44:05 2007 -0700 [SPARC64]: Privatize sun4u_get_pte() and fix name. __get_phys is only called from init.c as is prom_virt_to_phys(), __get_iospace() is not called at all, and sun4u_get_pte() is largely misnamed. Privatize the implementation and helper functions of sun4u_get_phys() to mm/init.c, and rename to kvaddr_to_paddr(). The only user of this thing is flush_icache_range(), and thus things can be considerably further simplified. For example, we should only see module or PAGE_OFFSET kernel addresses here, so we don't need the OBP firmware range handling at all. Signed-off-by: David S. Miller commit a0963bdfb91ca97c2b0b6d4ca81ff557fac66901 Author: David S. Miller Date: Thu Mar 15 15:09:06 2007 -0700 [SPARC64]: Kill _start[]/_end[] declarations in mm/init.c We already get those from asm/sections.h Signed-off-by: David S. Miller commit 4e286d5be63c93b17f8a82d6f3618faa9c1b025c Author: David S. Miller Date: Thu Mar 15 00:21:45 2007 -0700 [SPARC64]: MAX_PHYSADDR_BITS et al. really need to be 42 bits not 41. Signed-off-by: David S. Miller commit 0015d3d68c84eb33e6b380802ad61b23f7eb6523 Author: David S. Miller Date: Thu Mar 15 00:06:34 2007 -0700 [SPARC64]: Simplify read_obp_memory(). Kick out empty entries as soon as we spot them, and use memmove() instead of a silly loop to make the operation more clear. Signed-off-by: David S. Miller commit d78d0891d3dd976a2fb707c6c691d9cd5ed60727 Author: David S. Miller Date: Wed Mar 14 22:47:01 2007 -0700 [SPARC64]: Use SPARSEMEM_STATIC Decrease the SECTION_SIZE_BITS --> MAX_PHYSADDR_BITS range a little bit. The cost of going to SPARSEMEM_STATIC becomes 8K of BSS space, and in return we save a pointer dereference on every page struct lookup. Even better we hit the main kernel image for the base address which is in a hugepage locked TLB entry.
Signed-off-by: David S. Miller commit 43bed127376ff2ef9c268cf6688a43d0fbed2ff4 Author: David S. Miller Date: Wed Mar 14 18:33:49 2007 -0700 [SPARC64]: Use DECLARE_BITMAP in struct pci_iommu. Signed-off-by: David S. Miller commit 28f57e774d91ce01e03ff65caa2313bc8786b66f Author: David S. Miller Date: Mon Mar 12 19:40:26 2007 -0700 [SPARC64]: Force dummy host controller onto bus zero. This helps deal with the invisible bridge that sits between the host controller and the top-most visible PCI devices on hypervisor systems. For example, on T1000 the bus-range property says 2 --> 4 and so there is a PCI express bridge at bus 2, devfn 0, etc. So if we don't force the dummy host controller to bus zero, we'll try to create two devices with the same domain/bus/devfn triplet. Also, add some more log diagnostics to make debugging stuff like this easier. Signed-off-by: David S. Miller commit 97b3cf050b467dda571943ceadff5452bed04549 Author: David S. Miller Date: Sun Mar 11 16:42:53 2007 -0700 [SPARC64]: Add dummy host controller to root of all PCI domains. We fake up a dummy one in all cases because that is the simplest thing to do and it happens to be necessary for hypervisor systems. Signed-off-by: David S. Miller commit c6e87566ea080bbbe926c0e429fed48e6f680d93 Author: David S. Miller Date: Fri Mar 9 16:58:43 2007 -0800 [SPARC64]: Const'ify pci_iommu_ops. Based upon a similar patch for x86_64 written by Stephen Hemminger. Signed-off-by: David S. Miller commit 0bba2dd823fd995ed805ae5cbd5a1c1381257a12 Author: David S. Miller Date: Thu Mar 8 23:06:39 2007 -0800 [SPARC64]: Kill pbm->pci_first_slot. Set but never used. Signed-off-by: David S. Miller commit 3875c5c02d7112aa85f815d65d8add2e39ae9e34 Author: David S. Miller Date: Thu Mar 8 22:52:11 2007 -0800 [SPARC64]: Kill pci_controller->pbms_same_domain We don't do the "Simba APB is a PBM" bogosity for Sabre controllers any longer, so this pbms_same_domain thing is no longer necessary. Signed-off-by: David S.
Miller commit 8d3aee937596d2ca6676c2c27789751445bf0bc9 Author: David S. Miller Date: Thu Mar 8 22:46:02 2007 -0800 [SPARC64]: Kill pci_controller->base_address_update(). Implemented but never actually used. Signed-off-by: David S. Miller commit 0bae5f81b6f8130f5197e59b0e2ad6820c766b2b Author: David S. Miller Date: Thu Mar 8 22:42:19 2007 -0800 [SPARC64]: Kill pci_controller->resource_adjust() All the implementations can be identical and generic, so no need for controller specific methods. Signed-off-by: David S. Miller commit 3487a1f9e719d36c9b2d4d492994b2dd815a58b7 Author: David S. Miller Date: Thu Mar 8 22:28:17 2007 -0800 [SPARC64]: Kill PBM ranges software state. It is only used in one spot and we can just fetch the OF property right there. Signed-off-by: David S. Miller commit 229177c7f38d6a2b1285b42da4b19d76346b4bac Author: David S. Miller Date: Thu Mar 8 22:11:00 2007 -0800 [SPARC64]: Kill PBM intmap software state. Set but never used. Signed-off-by: David S. Miller commit 9fd8b64761d3fe7e4ef567161be57e4234af5c1c Author: David S. Miller Date: Thu Mar 8 21:55:49 2007 -0800 [SPARC64]: Consolidate PCI mem/io resource determination. It can be done for every PCI configuration using OF properties. Signed-off-by: David S. Miller commit 01f94c4a6ced476ce69b895426fc29bfc48c69bd Author: David S. Miller Date: Sun Mar 4 12:53:19 2007 -0800 [SPARC64]: Fix sabre pci controllers with new probing scheme. The SIMBA APB bridge is strange, it is a PCI bridge but it lacks some standard OF properties, in particular it lacks a 'ranges' property. What you have to do is read the IO and MEM range registers in the APB bridge to determine the ranges handled by each bridge. So fill in the bus resources by doing that. Since we now handle this quirk in the generic PCI and OF device probing layers, we can flat out eliminate all of that code from the sabre pci controller driver. In fact we can thus eliminate completely another quirk of the sabre driver. 
It tried to make the two APB bridges look like PBMs but that makes zero sense now (and it's questionable whether it ever made sense). So now just use pbm_A and probe the whole PCI hierarchy using that as the root. This simplification allows many future cleanups to occur. Also, I've found yet another quirk that needs to be worked around while testing this. You can't use the 'class-code' OF firmware property, especially for IDE controllers. We have to read the value out of PCI config space or else we'll see the value the device was showing before it was programmed into native mode. I'm starting to think it might be wise to just read all of the values out of PCI config space instead of using the OF properties. :-/ Signed-off-by: David S. Miller commit a378fd0ee8ea6af5dafd0ab3d634f22b926b5ac4 Author: David S. Miller Date: Thu Mar 1 11:46:13 2007 -0800 [SPARC64]: Fix obppath pci device sysfs creation. Need to traverse recursively down child busses else we only get the file created under devices at the top-level. Signed-off-by: David S. Miller commit bc606f3c917aa453fca62b76c8e9998b4171f4fa Author: David S. Miller Date: Thu Mar 1 11:20:37 2007 -0800 [SPARC64]: Minor cleanups to schizo pci controller driver. Signed-off-by: David S. Miller commit 1e8a8cc52daa95e702303ca3ce67955a4c051d7d Author: David S. Miller Date: Wed Feb 28 23:38:38 2007 -0800 [SPARC64]: Internalize pci_memspace_mask. The only user was bus_dvma_to_mem() which is no longer used by any driver, so kill that, and the export of pci_memspace_mask. The only user now is the PCI mmap support code. Signed-off-by: David S. Miller commit a2fb23af1c31ad6e0c281e56d385f803229d57fa Author: David S. Miller Date: Wed Feb 28 23:35:04 2007 -0800 [SPARC64]: Probe PCI bus using OF device tree. Almost entirely taken from the 64-bit PowerPC PCI code. This allowed us to eliminate a ton of cruft from the sparc64 PCI layer. Signed-off-by: David S. Miller commit deb66c4521e119442aa266553e8cbfc86eb71232 Author: David S.
Miller Date: Wed Feb 28 18:01:38 2007 -0800 [SPARC64] isa: Convert to use pci_device_to_OF_node(). Also, do not try to compute resources by hand, instead use the pre-computed ones in the of_device. Signed-off-by: David S. Miller commit 1327e9b62fc88e64ffbbd42d61fccd34e521bb86 Author: David S. Miller Date: Wed Feb 28 17:55:46 2007 -0800 [SPARC64] ebus: Convert to use pci_device_to_OF_node(). Also, we don't need to store or use the PBM so kill that from the linux_ebus. Signed-off-by: David S. Miller commit 9b1caafe09ccec8e0103e9375b711e3a0c838260 Author: David S. Miller Date: Wed Feb 28 17:05:06 2007 -0800 [IGAFB]: Use pci_device_to_OF_node() on sparc. Also __sparc__ --> CONFIG_SPARC Signed-off-by: David S. Miller commit a02079cdb74dde27391d019abca4a37988504b4e Author: David S. Miller Date: Wed Feb 28 17:02:45 2007 -0800 [ATYFB]: Use pci_device_to_OF_node() in sparc. Signed-off-by: David S. Miller commit fa449bd602c8871da48e6dbadfa0faaf4d33d32e Author: David S. Miller Date: Wed Apr 25 16:01:51 2007 -0700 [OPENPROM]: Use pci_device_to_OF_node(). Signed-off-by: David S. Miller commit d297c31fd101473983c17734a7e8a3752da1880f Author: David S. Miller Date: Thu Mar 29 01:41:28 2007 -0700 [TULIP]: Use pci_device_to_OF_node() on sparc. Signed-off-by: David S. Miller commit 49345103fef36617abc9a649dfc34f7e921c6878 Author: David S. Miller Date: Thu Mar 29 01:39:44 2007 -0700 [TULIP]: Use CONFIG_SPARC consistently in ifdef tests. Signed-off-by: David S. Miller commit 49b6e95ff6d05722bcf7a52b00454566ce0c44eb Author: David S. Miller Date: Thu Mar 29 01:38:42 2007 -0700 [TG3]: Use pci_device_to_OF_node() on sparc. And use CONFIG_SPARC instead of CONFIG_SPARC64 as the test. Signed-off-by: David S. Miller commit 6f85a8597d1d0d8ceeec5a82881c6ddf5cfb45e5 Author: David S. Miller Date: Wed Feb 28 16:40:57 2007 -0800 [SUNHME]: Use pci_device_to_OF_node(). Signed-off-by: David S. Miller commit 457e1a8afbcf5deffa501f2e9829526c18ed55b5 Author: David S. 
Miller Date: Thu Mar 29 01:36:44 2007 -0700 [SUNGEM]: Consolidate powerpc and sparc MAC probing code. Signed-off-by: David S. Miller commit dadb830dac401c4b1420ee2fd6c7559871b43319 Author: David S. Miller Date: Wed Feb 28 15:42:50 2007 -0800 [SUNGEM]: __sparc__ --> CONFIG_SPARC Signed-off-by: David S. Miller commit 9f47df264fa53e562cafa0de4a405d0846a81fbd Author: David S. Miller Date: Thu Mar 29 01:33:46 2007 -0700 [RADEON]: Probe clocks and monitor using OF properties on sparc. Just like powerpc does. Signed-off-by: David S. Miller commit a8b8814bdfe3bb2bdfa23722de947bad8283037c Author: David S. Miller Date: Thu Mar 29 01:28:51 2007 -0700 [SPARC]: Use strcasecmp for OFW property name comparisons. This allows us to simplify sharing code with powerpc which has properties that have various forms of capitalization when on the sparc64 side the property is all lower-case. Signed-off-by: David S. Miller commit ded220bd8f0823771fc0a9bdf7f5bcbe543197b6 Author: David S. Miller Date: Thu Mar 29 01:18:42 2007 -0700 [STRING]: Move strcasecmp/strncasecmp to lib/string.c We have several platforms using local copies of identical code. Signed-off-by: David S. Miller commit 357418e7cac16fed4ca558c6037d189d2109c9c2 Author: Stephen Rothwell Date: Thu Mar 29 00:54:04 2007 -0700 [SPARC]: constify some parameters of OF routines This starts bringing the PowerPC and Sparc implementations back closer together. Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 64b94701c0714f814e640ff351d5f784fdc0381e Author: Stephen Rothwell Date: Thu Mar 29 00:53:28 2007 -0700 [SPARC/64]: constify of_get_property return Finally, we actually change the functions themselves. Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 3198514d2d10fb3ce5e49ba0c611764ad8a214d0 Author: Stephen Rothwell Date: Thu Mar 29 00:50:57 2007 -0700 [SPARC/64] constify of_get_property return: sound Signed-off-by: Stephen Rothwell Signed-off-by: David S.
Miller commit 66f3cb7ccfe6d735bd1fa435aebc9b985ac74e07 Author: Stephen Rothwell Date: Thu Mar 29 00:50:29 2007 -0700 [SPARC64] constify of_get_property return: include Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit ccf0dec6fcadb4e1c877b9bafb031a6bdb7112b9 Author: Stephen Rothwell Date: Thu Mar 29 00:49:54 2007 -0700 [SPARC/64] constify of_get_property return: drivers The only unfortunate bit here is that the name field of struct map_info is not const, so for now we put a cast on the assignment of it. Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 6a23acf3905287eb952a6f1dbbc8fb3e4eeae2f6 Author: Stephen Rothwell Date: Mon Apr 23 15:53:27 2007 -0700 [SPARC64]: constify of_get_property return: arch/sparc64 Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 8271f04242af8ddf8390f289cd6ef78fb3e3c6d9 Author: Stephen Rothwell Date: Thu Mar 29 00:47:23 2007 -0700 [SPARC]: constify of_get_property return: arch/sparc Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller commit 644923d4a5f117d437aefd47688d1141cc8361ed Author: Tony Breeds Date: Wed Mar 28 19:10:12 2007 -0700 [SPARC64]: Small cleanups time.c - Removes days_in_mo[], as it's almost identical to month_days[] - Use the leapyear() macro - Line length wrapping. Signed-off-by: Tony Breeds Signed-off-by: David S. Miller commit d62c6f093a1ef8fa5f8951e8da93c8ddd3ce193a Author: David S. Miller Date: Tue Mar 27 01:20:14 2007 -0700 [SPARC64]: Fix sparc64_next_event() error return. It should return an error code not a boolean. Based upon an hpet timer fix by Thomas Gleixner. Signed-off-by: David S. Miller commit 112f48716d9f292c92a033cff9e3ce7405ed4280 Author: David S. Miller Date: Mon Mar 5 15:28:37 2007 -0800 [SPARC64]: Add clocksource/clockevents support. I'd like to thank John Stul and others for helping me along the way. A lot of cleanups fell out of this. For example, the get_compare() tick_op was totally unused, so was deleted. 
And the most often used tick_op members were grouped together for cache-friendliness. The sparc64 TSC is given to the kernel as a one-shot timer. tick_ops->init_timer() simply turns off the privileged bit in the tick register (when possible), and disables the interrupt by setting bit 63 in the compare register. The ->disable_irq() op also sets this bit. tick_ops->add_compare() is changed to: 1) Add the given delta to "tick" not to "compare" 2) Return a boolean which, if true, means that the tick value read after writing the compare value was found to have incremented past the initial tick value. This mirrors logic used in the HPET driver's ->next_event() method. Each tick_ops implementation also now provides a name string. And we feed this into the clocksource and clockevents layers. Signed-off-by: David S. Miller commit 038cb01ea69cb24ecf30e3ec882e429c84badbeb Author: David S. Miller Date: Thu Feb 22 06:24:45 2007 -0800 [SPARC64]: Add tick_nohz_{stop,restart}_sched_tick() calls to cpu_idle(). Signed-off-by: David S. Miller commit 777a447529ad138f5fceb9c9ad28bab19848f277 Author: David S. Miller Date: Thu Feb 22 06:24:10 2007 -0800 [SPARC64]: Unify timer interrupt handler. Things were scattered all over the place, split between SMP and non-SMP. Unify it all so that dyntick support is easier to add. Signed-off-by: David S. Miller commit a58c9f3c1e929c3c323c26dbdafef46373a719d4 Author: David S. Miller Date: Thu Feb 22 04:16:21 2007 -0800 [SPARC64]: Synchronize RTC clock via timer just like x86. Signed-off-by: David S. Miller commit bfbf3c0968498f5232c02965cf41695edae1bc4d Author: Matthias Kaehlcke Date: Thu Apr 26 01:41:49 2007 -0700 [ATM]: Use mutex instead of binary semaphore in FORE Systems 200E-series driver (akpm: remove CVS control string too) Signed-off-by: Matthias Kaehlcke Signed-off-by: Andrew Morton Signed-off-by: David S.
Miller commit 74da9d88bf5ffd31aed61a0b19519684ad744ded Author: Andrew Morton Date: Thu Apr 26 01:41:01 2007 -0700 [BLUETOOTH] rfcomm_worker(): fix wakeup race Set TASK_INTERRUPTIBLE prior to testing the flag to avoid missed wakeups. Signed-off-by: Andrew Morton Acked-by: Marcel Holtmann Signed-off-by: David S. Miller commit 9198d2220d29b87ac3a05a3b791c50bb8a014d63 Author: Alexandra N. Kossovsky Date: Thu Apr 26 01:40:13 2007 -0700 [NET]: bonding documentation fix for multiple bonding interfaces Fix bonding driver documentation for the case of multiple bonding interfaces. Signed-off-by: "Alexandra N. Kossovsky" Acked-by: Jay Vosburgh Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 4ef8d0aeafda8388dd51f2671b7059192b1e5a5f Author: Milind Arun Choudhary Date: Thu Apr 26 01:37:44 2007 -0700 [NET]: SPIN_LOCK_UNLOCKED cleanup in drivers/atm, net SPIN_LOCK_UNLOCKED cleanup,use __SPIN_LOCK_UNLOCKED instead Signed-off-by: Milind Arun Choudhary Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 1c8ea5aee0b16409295d96a5e8984bd902f06a77 Author: Andrew Morton Date: Thu Apr 26 01:36:49 2007 -0700 [IRDA] irda_device_dongle_init: fix kzalloc(GFP_KERNEL) in spinlock Fix http://bugzilla.kernel.org/show_bug.cgi?id=8343 Signed-off-by: Andrew Morton Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit 14690fc649f4c59712f497135f7323eb8ceceaaf Author: Martin Peschke Date: Thu Apr 26 01:03:43 2007 -0700 [SUNRPC]: cleanup: use seq_release_private() where appropriate We can save some lines of code by using seq_release_private(). Signed-off-by: Martin Peschke Acked-by: Neil Brown Signed-off-by: Andrew Morton Signed-off-by: David S. 
Miller commit f8a6d97043f9adc25889876b681998b77f543bfa Author: Alexey Dobriyan Date: Thu Apr 26 01:02:51 2007 -0700 [AF_IUCV]: Fix compilation on s390-up CC [M] net/iucv/iucv.o net/iucv/iucv.c: In function 'iucv_init': net/iucv/iucv.c:1556: error: 'iucv_cpu_notifier' undeclared (first use in this function) Signed-off-by: Alexey Dobriyan Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 57cd5f754e04240ee587c51b7be8d3b7793542ae Author: Milind Arun Choudhary Date: Thu Apr 26 01:01:53 2007 -0700 [NET]: ROUND_UP macro cleanup in drivers/net/ppp_generic.c ROUND_UP macro cleanup use DIV_ROUND_UP Signed-off-by: Milind Arun Choudhary Acked-by: Paul Mackerras Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 36226a8ded46b89a94f9de5976f554bb5e02d84c Author: Brian Braunstein Date: Thu Apr 26 01:00:55 2007 -0700 [NET] tun/tap: fixed hw address handling Fixed tun/tap driver's handling of hw addresses. The hw address is stored in both the net_device.dev_addr and tun.dev_addr fields. These fields were not kept synchronized, and in fact weren't even initialized to the same value. Now during both init and when performing SIOCSIFHWADDR on the tun device these values are both updated. However, if SIOCSIFHWADDR is performed on the net device directly (for instance, setting the hw address using ifconfig), the tun device does not get updated. Perhaps the tun.dev_addr field should be removed completely at some point, as it is redundant and net_device.dev_addr can be used anywhere it is used. Signed-off-by: Brian Braunstein Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 48491e6bdb8fa73751cc95f740175ec799db5d55 Author: Robert P. J. Day Date: Thu Apr 26 00:59:27 2007 -0700 [NET]: Delete unused header file linux/if_wanpipe_common.h Delete the unreferenced header file include/linux/if_wanpipe_common.h, as well as the reference to it in the Doc file. Signed-off-by: Robert P. J. Day Signed-off-by: Andrew Morton Signed-off-by: David S. 
Miller commit c1a068f6b0c38665c079e8d4ca241e24020eff36 Author: Robert P. J. Day Date: Thu Apr 26 00:58:39 2007 -0700 [NET]: Delete unused header file linux/sdla_fr.h. Delete the unreferenced header file include/linux/sdla_fr.h. Signed-off-by: Robert P. J. Day Signed-off-by: Andrew Morton commit 42bad1da506cafa7041a02ab84033a724afe88ac Author: Adrian Bunk Date: Thu Apr 26 00:57:41 2007 -0700 [NETLINK]: Possible cleanups. - make the following needlessly global variables static: - core/rtnetlink.c: struct rtnl_msg_handlers[] - netfilter/nf_conntrack_proto.c: struct nf_ct_protos[] - make the following needlessly global functions static: - core/rtnetlink.c: rtnl_dump_all() - netlink/af_netlink.c: netlink_queue_skip() Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 55404bca6c45595fee1a546f1a0cc616aeef0b00 Author: Andrew Morton Date: Thu Apr 26 00:55:53 2007 -0700 [NET]: Fix yam.c drivers/net/hamradio/yam.c: In function `yam_tx_byte': drivers/net/hamradio/yam.c:643: warning: passing arg 1 of `skb_copy_from_linear_data_offset' from incompatible pointer type Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit eefa3906283a2b60a6d02a2cda593a7d7d7946c5 Author: Jean Delvare Date: Thu Apr 26 00:44:22 2007 -0700 [NET]: Clean up sk_buff walkers. I noticed recently that, in skb_checksum(), "offset" and "start" are essentially the same thing and have the same value throughout the function, despite being computed differently. Using a single variable allows some cleanups and makes the skb_checksum() function smaller, more readable, and presumably marginally faster. We appear to have many other "sk_buff walker" functions built on the exact same model, so the cleanup applies to them, too. 
Here is a list of the functions I found to be affected: net/appletalk/ddp.c:atalk_sum_skb() net/core/datagram.c:skb_copy_datagram_iovec() net/core/datagram.c:skb_copy_and_csum_datagram() net/core/skbuff.c:skb_copy_bits() net/core/skbuff.c:skb_store_bits() net/core/skbuff.c:skb_checksum() net/core/skbuff.c:skb_copy_and_csum_bit() net/core/user_dma.c:dma_skb_copy_datagram_iovec() net/xfrm/xfrm_algo.c:skb_icv_walk() net/xfrm/xfrm_algo.c:skb_to_sgvec() OTOH, I admit I'm a bit surprised, the cleanup is rather obvious so I'm really wondering if I am missing something. Can anyone please comment on this? Signed-off-by: Jean Delvare Signed-off-by: David S. Miller commit 28d8909bc790d936ce33f4402adf7577533bbd4b Author: Jamal Hadi Salim Date: Thu Apr 26 00:10:29 2007 -0700 [XFRM]: Export SAD info. On a system with a lot of SAs, counting SAD entries chews useful CPU time since you need to dump the whole SAD to user space; i.e. something like: ip xfrm state ls | grep -i src | wc -l which I have seen take literally minutes with 40K SAs when the system is swapping. With this patch, some of the SAD info (that was already being tracked) is exposed to user space, i.e. you do: ip xfrm state count and you get the count; you can also pass -s on the command line to get the hash info. Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller commit f50393fe869ba457cd75569c74c0f9bd2e7f7a0f Author: Mark Huth Date: Tue Mar 6 08:57:26 2007 -0800 e1000: FIX: Stop raw interrupts disabled nag from RT Current e1000_xmit_frame spews raw interrupt disabled nag messages when used with RT kernel patches. This patch uses spin_trylock_irqsave, which allows RT patches to properly manage the irq semantics.
Signed-off-by: Mark Huth Signed-off-by: Auke Kok Signed-off-by: Jeff Garzik commit 31d76442f719af834718cbf5bf866370acc36093 Author: Bruce Allan Date: Tue Mar 6 08:57:24 2007 -0800 e1000: FIX: firmware handover bits Upon code inspection it was spotted that the firmware handover bit get/set mismatched, which may have resulted in management issues on PCI-E adapters. Setting them correctly may fix some management issues such as arp routing etc. Signed-off-by: Auke Kok Signed-off-by: Bruce Allan Signed-off-by: Jeff Garzik commit e0aac5a289b1dacbc94bd9ae8c449bcdf9ab508c Author: Auke Kok Date: Tue Mar 6 08:57:21 2007 -0800 e1000: FIX: be ready for incoming irq at pci_request_irq DEBUG_SHIRQ code exposed that e1000 was not ready for incoming interrupts after having called pci_request_irq. This obviously requires us to finish our software setup which assigns the irq handler before we request the irq. Signed-off-by: Auke Kok Signed-off-by: Jeff Garzik commit e900a7d90ae1486ac95c10e0b7337fc2c2eda529 Author: Stephen Smalley Date: Thu Apr 19 14:16:19 2007 -0400 selinux: preserve boolean values across policy reloads At present, the userland policy loading code has to go through contortions to preserve boolean values across policy reloads, and cannot do so atomically. As this is what we always want to do for reloads, let the kernel preserve them instead. Signed-off-by: Stephen Smalley Acked-by: Karl MacMillan Signed-off-by: James Morris commit bce34bc0eef03c68b5c49a3cc5bc77c84760cfe2 Author: James Carter Date: Wed Apr 4 16:18:50 2007 -0400 selinux: change numbering of boolean directory inodes in selinuxfs Change the numbering of the booleans directory inodes in selinuxfs to provide more room for new inodes without a conflict in inode numbers and to be consistent with how inode numbering is done in the initial_contexts directory. 
Signed-off-by: James Carter Acked-by: Eric Paris Acked-by: Stephen Smalley Signed-off-by: James Morris commit 68b00df9bb5f38e87c102b3179a18eba9c9937a8 Author: James Carter Date: Wed Apr 4 16:18:43 2007 -0400 selinux: remove unused enumeration constant from selinuxfs Remove the unused enumeration constant, SEL_AVC, from the sel_inos enumeration in selinuxfs. Signed-off-by: James Carter Acked-by: Eric Paris Acked-by: Stephen Smalley Signed-off-by: James Morris commit 6174eafce3a38114adc6058e2872434c53feae87 Author: James Carter Date: Wed Apr 4 16:18:39 2007 -0400 selinux: explicitly number all selinuxfs inodes Explicitly number all selinuxfs inodes to prevent a conflict between inodes numbered using last_ino when created with new_inode() and those labeled explicitly. Signed-off-by: James Carter Acked-by: Eric Paris Acked-by: Stephen Smalley Signed-off-by: James Morris commit f0ee2e467ffa68c3122128b704c1540ee294b748 Author: James Carter Date: Wed Apr 4 10:11:29 2007 -0400 selinux: export initial SID contexts via selinuxfs Make the initial SID contexts accessible to userspace via selinuxfs. An initial use of this support will be to make the unlabeled context available to libselinux for use for invalidated userspace SIDs. Signed-off-by: James Carter Acked-by: Stephen Smalley Signed-off-by: James Morris commit a764ae4b0781fac75f9657bc737c37ae59888389 Author: Stephen Smalley Date: Mon Mar 26 13:36:26 2007 -0400 selinux: remove userland security class and permission definitions Remove userland security class and permission definitions from the kernel as the kernel only needs to use and validate its own class and permission definitions and userland definitions may change. 
Signed-off-by: Stephen Smalley Signed-off-by: James Morris commit 4f6a993f96a256e83b9be7612f958c7bc4ca9f00 Author: Paul Moore Date: Thu Mar 1 14:35:22 2007 -0500 SELinux: move security_skb_extlbl_sid() out of the security server As suggested, move the security_skb_extlbl_sid() function out of the security server and into the SELinux hooks file. Signed-off-by: Paul Moore Acked-by: Stephen Smalley Signed-off-by: James Morris commit 588a31577f86a5cd8b0bcde6026e4e6dcac8c383 Author: Stephen Smalley Date: Fri Feb 23 09:20:09 2007 -0500 MAINTAINERS: update selinux entry Add Eric Paris as an SELinux maintainer. Signed-off-by: James Morris commit c60475bf35fc5fa10198df89187ab148527e72f7 Author: Paul Moore Date: Wed Feb 28 15:14:23 2007 -0500 SELinux: rename selinux_netlabel.h to netlabel.h In the beginning I named the file selinux_netlabel.h to avoid potential namespace collisions. However, over time I have realized that there are several other similar cases of multiple header files with the same name so I'm changing the name to something which better fits with existing naming conventions. Signed-off-by: Paul Moore Signed-off-by: James Morris commit 5778eabd9cdbf16ea3e40248c452b4fd25554d11 Author: Paul Moore Date: Wed Feb 28 15:14:22 2007 -0500 SELinux: extract the NetLabel SELinux support from the security server Up until this patch the functions which have provided NetLabel support to SELinux have been integrated into the SELinux security server, which for various reasons is not really ideal. This patch makes an effort to extract as much of the NetLabel support from the security server as possible and move it into its own file within the SELinux directory structure. Signed-off-by: Paul Moore Signed-off-by: James Morris commit 128c6b6cbffc8203e13ea5712a8aa65d2ed82e4e Author: Paul Moore Date: Wed Feb 28 15:14:21 2007 -0500 NetLabel: convert a BUG_ON in the CIPSO code to a runtime check This patch changes a BUG_ON in the CIPSO code to a runtime check.
It should also increase the readability of the code as it replaces an unexplained constant with a well defined macro. Signed-off-by: Paul Moore Signed-off-by: James Morris commit f998e8cb52396c6a197d14f6afb07144324aea6d Author: Paul Moore Date: Wed Feb 28 15:14:20 2007 -0500 NetLabel: cleanup and document CIPSO constants This patch collects all of the CIPSO constants and puts them in one place; it also documents each value explaining how the value is derived. Signed-off-by: Paul Moore Signed-off-by: James Morris commit 98486fa2f4894e2b01e325c659635596bdec1614 Author: Stephen Hemminger Date: Wed Apr 25 22:08:46 2007 -0700 [BRIDGE]: Missing rtnl. Writing to /sys/class/net/brX/bridge/stp_state causes a warning because RTNL is not held when calling into br_stp_if.c. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c2886d6259b8faac4c05ffd9c3c401ac84478de0 Author: Stephen Hemminger Date: Wed Apr 25 22:07:58 2007 -0700 [BRIDGE]: if no STP then forward all BPDUs If a bridge is not running STP, then it has no way to detect a cycle in the network. But if it is not running STP and some other machine or device is running STP, then if the STP BPDUs get forwarded, that other device can detect the cycle. This is how the old 2.4 and early 2.6 code worked. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 2111f8b9e58fd04b87b8b07d66485f255a57b0bb Author: Stephen Hemminger Date: Wed Apr 25 22:05:55 2007 -0700 [BRIDGE]: drop PAUSE frames Pause frames should never make it out of the network device into the stack. But if a device was misconfigured, it might happen. So drop pause frames in bridge. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 83aa0938ff59e8ef6d0b99260063ebe84fc84a16 Author: Stephen Hemminger Date: Wed Apr 25 22:03:10 2007 -0700 [BRIDGE]: don't change packet type The change to forward STP BPDUs (for usermode STP) through the normal path changed the packet type in the process.
Since link local stuff is multicast, it should stay pkt_type = PACKET_MULTICAST. The code was probably copy/pasted incorrectly from the bridge pseudo-device receive path. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit e1ec7842df5db897516d73c76bd2a568b4abc33b Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:52 2007 +0900 [IPV6] NDISC: Unify main process of sending ND messages. Because ndisc_send_na(), ndisc_send_ns() and ndisc_send_rs() are almost identical, let's unify their common part. With gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) on i386, Before: text data bss dec hex filename 14689 364 24 15077 3ae5 net/ipv6/ndisc.o After: text data bss dec hex filename 12317 364 24 12705 31a1 net/ipv6/ndisc.o Signed-off-by: YOSHIFUJI Hideaki commit c53b3590bb294a42121b640e8309379752482b38 Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:50 2007 +0900 [IPV6] XFRM: Use ip6addr_any where applicable. Signed-off-by: YOSHIFUJI Hideaki commit df8981dc1928f3a231d91f27c2b3dc373fb4d410 Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:49 2007 +0900 [IPV6]: Export in6addr_any for future use. Signed-off-by: YOSHIFUJI Hideaki commit 5056a1ef9e2597cff7b15904fbc74193f316fc40 Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:48 2007 +0900 [IPV4] IP_GRE: Unify code path to get hash array index. Signed-off-by: YOSHIFUJI Hideaki commit 87d1a164df0b5e297cda698724ea7984d8392b06 Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:47 2007 +0900 [IPV4] IPIP: Unify code path to get hash array index. Signed-off-by: YOSHIFUJI Hideaki commit 420fe234ad7adaa5a5445e5fab83b1485ed9e0f3 Author: YOSHIFUJI Hideaki Date: Tue Apr 24 20:44:47 2007 +0900 [IPV6] SIT: Unify code path to get hash array index. Signed-off-by: YOSHIFUJI Hideaki commit 30041e4af426bc9ab7a73440ce4a7c78881b6001 Author: David S. Miller Date: Tue Apr 24 22:15:40 2007 -0700 [IPV6]: Fix Makefile thinko. obj-$(CONFIG_PROC_FS) --> ipv6-$(CONFIG_PROC_FS) Signed-off-by: David S.
Miller commit 7f7d9a6b96c5708c5184cbed61bbc15b163a0f08 Author: Herbert Xu Date: Tue Apr 24 21:54:09 2007 -0700 [IPV6]: Consolidate common SNMP code This patch moves the non-proc SNMP code into addrconf.c and reuses IPv4 SNMP code where applicable. As a result we can skip proc.o if /proc is disabled. Note that I've made a number of functions static since they're only used by addrconf.c for now. If they ever get used elsewhere we can always remove the static. Signed-off-by: Herbert Xu Acked-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 5e0f04351d11e07a23b5ab4914282cbb78027e50 Author: Herbert Xu Date: Tue Apr 24 21:53:35 2007 -0700 [IPV4]: Consolidate common SNMP code This patch moves the SNMP code shared between IPv4/IPv6 from proc.c into net/ipv4/af_inet.c. This makes sense because these functions aren't specific to /proc. As a result we can again skip proc.o if /proc is disabled. Signed-off-by: Herbert Xu Acked-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit bb7ec6dfb5aa32b5b4d7d6388b4098b33cd01e8c Author: YOSHIFUJI Hideaki Date: Tue Apr 24 16:22:42 2007 -0700 [IPV4]: Fix build without procfs. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 84299b3bc4eaedc0734fcc9052b01291e44445fc Author: YOSHIFUJI Hideaki Date: Tue Apr 24 16:21:38 2007 -0700 [TCP]: Fix linkage errors on i386. To avoid raw division, use ktime_to_timeval() to get usec. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 1f9eda7e2b67898fb8e79b3aa3880211b51235e6 Author: Allan Stephens Date: Tue Apr 24 14:51:55 2007 -0700 [TIPC]: Enhancements to msg_set_bits() routine This patch makes two enhancements to msg_set_bits(): 1) It now ignores any bits of the new field value that are not covered by the mask being used. (Previously, if the new value exceeded the size of the mask the extra bits could corrupt other fields in the message header word being updated.) 
2) The code has been optimized to minimize the number of run-time endianness conversion operations by leveraging the fact that the mask (and, in some cases, the value as well) is constant and the necessary conversion can be performed by the compiler. Signed-off-by: Allan Stephens Signed-off-by: Jon Paul Maloy Signed-off-by: David S. Miller commit 43fb45cb79e9441a79ece206cf741774500dd627 Author: Johannes Berg Date: Tue Apr 24 14:07:27 2007 -0700 [WIRELESS] cfg80211: Update comment for locking. This patch adds a comment that was part of my rtnl locking patch for cfg80211 but which I forgot for the merge. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit f9d106a6d53b57b78eae5544f9582c643343a764 Author: Herbert Xu Date: Mon Apr 23 22:36:13 2007 -0700 [NET]: Warn about GSO/checksum abuse Now that Patrick has added the code to deal with GSO in netfilter, we no longer need the crutch that computes partial checksums just before transmission. This patch turns this into a warning again. If this goes OK, we can then turn it into a BUG_ON and remove the gso_send_check cruft. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit 7752237e9f07b316f81aebdc43f0d7c9a4ba0acf Author: Stephen Hemminger Date: Mon Apr 23 22:28:23 2007 -0700 [TCP] TCP YEAH: Use vegas dont copy it. Rather than using a copy of vegas code, the YEAH code should just have it exported so there is common code. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 164891aadf1721fca4dce473bb0e0998181537c6 Author: Stephen Hemminger Date: Mon Apr 23 22:26:16 2007 -0700 [TCP]: Congestion control API update. Do some simple changes to make congestion control API faster/cleaner. * use ktime_t rather than timeval * merge rtt sampling into existing ack callback this means one indirect call versus two per ack. * use flags bits to store options/settings Signed-off-by: Stephen Hemminger Signed-off-by: David S. 
Miller commit 65d1b4a7e73fe0e1f5275ad7d2d3547981480886 Author: Stephen Hemminger Date: Mon Apr 23 22:24:32 2007 -0700 [TCP]: TCP Illinois update. This version more closely matches the paper, and fixes several math errors. The biggest difference is that it updates alpha/beta once per RTT. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 42431592e74a968d919a46baf0515a2ee6978dac Author: John W. Linville Date: Mon Apr 23 13:28:49 2007 -0700 [WIRELESS] drivers/net/wireless/Kconfig: correct minor typo Correct minor typo in drivers/net/wireless/Kconfig identified by Stefano Brivio. Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 9e101eab153073d8a1fc7ea22b20af65de8ab44b Author: Johannes Berg Date: Mon Apr 23 12:20:55 2007 -0700 [WIRELESS]: Remove wext over netlink. As scheduled, this patch removes the pointless wext over netlink code. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 704232c2718c9d4b3375ec15a14fc0397970c449 Author: Johannes Berg Date: Mon Apr 23 12:20:05 2007 -0700 [WIRELESS] cfg80211: New wireless config infrastructure. This patch creates the core cfg80211 code along with some sysfs bits. This is a stripped down version to allow mac80211 to function, but doesn't include any configuration yet except for creating and removing virtual interfaces. This patch includes the nl80211 header file but it only contains the interface types which the cfg80211 interface for creating virtual interfaces relies on. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 2a5e1c0eb9efe26eed1dd072fe08de5797a7efd5 Author: Johannes Berg Date: Mon Apr 23 12:19:12 2007 -0700 [WIRELESS]: Refactor wireless Kconfig. This patch refactors the wireless Kconfig all over and already introduces net/wireless/Kconfig with just the WEXT bit for now, the cfg80211 patch will add to that as well. Signed-off-by: Johannes Berg Signed-off-by: John W.
Linville Signed-off-by: David S. Miller commit 724c6b35ecff0fb68bbb315a34b2f9cb694865d3 Author: Johannes Berg Date: Mon Apr 23 12:18:20 2007 -0700 [WIRELESS]: Update MAINTAINERS for wireless mailing list. This patch adds the linux-wireless mailing list to all appropriate entries in the MAINTAINERS file. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville Signed-off-by: David S. Miller commit 372cc74c8b41d808af0a3fa8b11795cba79e7299 Author: Andrew Morton Date: Sun Apr 22 23:22:24 2007 -0700 [NET]: Prevent much sadness in qdisc_lock_tree(). Signed-off-by: Andrew Morton Signed-off-by: David S. Miller commit 97fc8d0bc58cd09e62dc06ea5a64b58841738934 Author: YOSHIFUJI Hideaki Date: Sat Apr 21 19:52:04 2007 -0700 [IPV6] SNMP: Use put_unaligned() instead of memcpy(). Hint from David Miller. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 952a10be3272c4b5b7839b09cb0483dc72137101 Author: YOSHIFUJI Hideaki Date: Sat Apr 21 20:13:44 2007 +0900 [IPV6] SNMP: Fix several warnings without procfs. Signed-off-by: YOSHIFUJI Hideaki commit 2334e973559e119fa4161047035f03ad97a8676a Author: YOSHIFUJI Hideaki Date: Sat Apr 21 20:12:43 2007 +0900 [IPV6] SNMP: Avoid unaligned accesses. Because stats pointer may not be aligned for u64, use memcpy to fill u64 values. Issue reported by David Miller. Signed-off-by: YOSHIFUJI Hideaki commit 9e412ba7632f71259a53085665d4983b78257b7c Author: Ilpo Järvinen Date: Fri Apr 20 22:18:02 2007 -0700 [TCP]: Sed magic converts func(sk, tp, ...) -> func(sk, ...) This is (mostly) automated change using magic: sed -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N' -e 's|struct sock \*sk,[\n\t ]*struct tcp_sock \*tp\([^{]*\n{\n\)| struct sock \*sk\1\tstruct tcp_sock *tp = tcp_sk(sk);\n|g' -e 's|struct sock \*sk, struct tcp_sock \*tp| struct sock \*sk|g' -e 's|sk, tp\([^-]\)|sk\1|g' Fixed four unused variable (tp) warnings that were introduced.
In addition, manually added newlines after local variables and tweaked function arguments positioning. $ gcc --version gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1) ... $ codiff -fV built-in.o.old built-in.o.new net/ipv4/route.c: rt_cache_flush | +14 1 function changed, 14 bytes added net/ipv4/tcp.c: tcp_setsockopt | -5 tcp_sendpage | -25 tcp_sendmsg | -16 3 functions changed, 46 bytes removed net/ipv4/tcp_input.c: tcp_try_undo_recovery | +3 tcp_try_undo_dsack | +2 tcp_mark_head_lost | -12 tcp_ack | -15 tcp_event_data_recv | -32 tcp_rcv_state_process | -10 tcp_rcv_established | +1 7 functions changed, 6 bytes added, 69 bytes removed, diff: -63 net/ipv4/tcp_output.c: update_send_head | -9 tcp_transmit_skb | +19 tcp_cwnd_validate | +1 tcp_write_wakeup | -17 __tcp_push_pending_frames | -25 tcp_push_one | -8 tcp_send_fin | -4 7 functions changed, 20 bytes added, 63 bytes removed, diff: -43 built-in.o.new: 18 functions changed, 40 bytes added, 178 bytes removed, diff: -138 Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 38b4da383705394788aa09208917ba200792de4b Author: Borislav Petkov Date: Fri Apr 20 22:14:10 2007 -0700 [NET]: Fix comments for register_netdev(). Correct the function name in the comments supplied with register_netdev() Signed-off-by: Borislav Petkov Signed-off-by: David S. Miller commit b450777a572d68975c8748b0d48d517dd3468ea6 Author: G. Liakhovetski Date: Fri Apr 20 22:12:48 2007 -0700 [IrDA]: Misc spelling corrections. Spelling corrections, from "to" to "too". Signed-off-by: G. Liakhovetski Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit 599b1fa91439cff8605a71f1a2b5bb42c177b667 Author: Samuel Ortiz Date: Fri Apr 20 22:12:07 2007 -0700 [IrDA]: Adding carriage returns to mcs7780 debug statements Signed-off-by: Samuel Ortiz Signed-off-by: David S. 
Miller commit c3ea9fa2741320f9cade15efe10559b549af4ebf Author: Samuel Ortiz Date: Fri Apr 20 22:10:13 2007 -0700 [IrDA] af_irda: IRDA_ASSERT cleanups In af_irda.c, the multiple IRDA_ASSERT() are either hiding bugs, useless, or returning the wrong value. Let's clean that up. Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit d7f48d1a9398a3bd7bb6f4774640b24a0294cda3 Author: Samuel Ortiz Date: Fri Apr 20 22:09:33 2007 -0700 [IrDA] af_irda: irda_accept cleanup This patch removes a cut'n'paste copy of wait_event_interruptible from irda_accept. Signed-off-by: Samuel Ortiz Acked-by: Olaf Kirch Signed-off-by: David S. Miller commit 6e66aa15d8873ae7418d5afc6476daec466ff93b Author: Olaf Kirch Date: Fri Apr 20 22:08:15 2007 -0700 [IrDA] af_irda: Silence kernel message in irda_recvmsg_stream This patch silences an IRDA_ASSERT in irda_recvmsg_stream, as described in http://bugzilla.kernel.org/show_bug.cgi?id=7512 irda_disconnect_indication would set sk->sk_err to ECONNRESET, and a subsequent call to recvmsg would print an irritating kernel message and return -1. When a connected socket is closed by the peer, recvmsg should return 0 rather than an error. This patch fixes this. Signed-off-by: Olaf Kirch Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit 305f2aa18214555e611ad05e586dd385e64ab665 Author: Olaf Kirch Date: Fri Apr 20 22:05:27 2007 -0700 [IrDA] af_irda: irda_recvmsg_stream cleanup This patch cleans up some code in irda_recvmsg_stream, replacing some homebrew code with prepare_to_wait/finish_wait, and by making the code honor sock_rcvtimeo. Signed-off-by: Olaf Kirch Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit 9958089a43ae8a9af07402461c0b2b7548c7341e Author: Andi Kleen Date: Fri Apr 20 17:12:43 2007 -0700 [NET]: Move sk_setup_caps() out of line. It is far too large to be an inline and not in any hot paths. Signed-off-by: Andi Kleen Signed-off-by: David S. 
Miller commit 4ac02bab77438b484a5cf855a002fb6a1d592894 Author: Andi Kleen Date: Fri Apr 20 17:11:46 2007 -0700 [TCP]: Uninline tcp_done(). The function is quite big and has several call sites and nothing to collapse by compiler optimization on inlining. Besides it's nicer to read in a .c file. Signed-off-by: Andi Kleen Signed-off-by: David S. Miller commit 3ff50b7997fe06cd5d276b229967bb52d6b3b6c1 Author: Stephen Hemminger Date: Fri Apr 20 17:09:22 2007 -0700 [NET]: cleanup extra semicolons Spring cleaning time... There seem to be a lot of places in the network code that have extra bogus semicolons after conditionals. The most common is a bogus semicolon after: switch() { } Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c462238d6a6d8ee855bda10f9fff442971540ed2 Author: Stephen Hemminger Date: Fri Apr 20 17:07:51 2007 -0700 [TCP]: TCP Illinois congestion control (rev3) This is an implementation of TCP Illinois invented by Shao Liu at University of Illinois. It is another variant of Reno which adapts the alpha and beta parameters based on RTT. The basic idea is to increase window less rapidly as delay approaches the maximum. See the papers and talks to get a more complete description. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 9be9a6b983314dd57e2c5ba548dee8b53d338ac3 Author: Stephen Hemminger Date: Fri Apr 20 17:02:45 2007 -0700 [NET]: Get rid of netdev_nit It isn't any faster to test a boolean global variable than do a simple check for empty list. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 42dc9cd54b7290f862874a2544e50395e5719985 Author: Michal Ostrowski Date: Fri Apr 20 16:59:24 2007 -0700 [PPPOE]: Fix device tear-down notification. pppoe_flush_dev() kicks all sockets bound to a device that is going down. In doing so, locks must be taken in the right order consistently (sock lock, followed by the pppoe_hash_lock). However, the scan process is based on us holding the sock lock.
So, when something is found in the scan we must release the lock we're holding and grab the sock lock. This patch fixes race conditions between this code and pppoe_release(), both of which perform similar functions but would naturally prefer to grab locks in opposing orders. Both code paths are now going after these locks in a consistent manner. pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside inside the hash. Thus, NULL'ing out the pppoe_dev field should be done under the protection of this lock. Signed-off-by: Michal Ostrowski Signed-off-by: David S. Miller commit 202a03acf9994076055df40ae093a5c5474ad0bd Author: Florian Zumbiehl Date: Fri Apr 20 16:58:14 2007 -0700 [PPPOE]: memory leak when socket is release()d before PPPIOCGCHAN has been called on it below you find a patch that fixes a memory leak when a PPPoE socket is release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl ever has been called on it. This is somewhat of a security problem, too, since PPPoE sockets can be created by any user, so any user can easily allocate all the machine's RAM to non-swappable address space and thus DoS the system. Is there any specific reason for PPPoE sockets being available to any unprivileged process, BTW? After all, you need a packet socket for the discovery stage anyway, so it's unlikely that any unprivileged process will ever need to create a PPPoE socket, no? Allocating all session IDs for a known AC is a kind of DoS, too, after all - with Juniper ERXes, this is really easy, actually, since they don't ever assign session ids above 8000 ... Signed-off-by: Florian Zumbiehl Acked-by: Michal Ostrowski Signed-off-by: David S. Miller commit 74b885cf86def9bc836772e3c1788c00b72a35c9 Author: Florian Zumbiehl Date: Fri Apr 20 16:57:27 2007 -0700 [PPPOE]: race between interface going down and connect() below you find a patch that (hopefully) fixes a race between an interface going down and a connect() to a peer on that interface. 
Before, connect() would determine that an interface is up, then the interface could go down and all entries referring to that interface in the item_hash_table would be marked as ZOMBIEs and their references to the device would be freed, and after that, connect() would put a new entry into the hash table referring to the device that meanwhile is down already - which also would cause unregister_netdevice() to wait until the socket has been release()d. This patch does not suffice if we are not allowed to accept connect()s referring to a device that we already acked a NETDEV_GOING_DOWN for (that is: all references are only guaranteed to be freed after NETDEV_DOWN has been acknowledged, not necessarily after the NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid looking through the hash table upon NETDEV_GOING_DOWN completely and only do that once we get the NETDEV_DOWN ... mostrows: pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with this "late connect" issue. Ideally one would hope to notify users at the "NETDEV_GOING_DOWN" phase (just to pretend to be nice). However, it is the NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is hanging around at that time. Signed-off-by: Florian Zumbiehl Acked-by: Michal Ostrowski Signed-off-by: David S. Miller commit bfafb26e11849fe99e03cc1902a91f7f65354e47 Author: Florian Zumbiehl Date: Fri Apr 20 16:56:31 2007 -0700 [PPPoE]: miscellaneous smaller cleanups below is a patch that just removes dead code/initializers without any effect (first access is an assignment) that I stumbled across while reading the source. Signed-off-by: Florian Zumbiehl Acked-by: Michal Ostrowski Signed-off-by: David S. Miller commit 0c6fcc8a8cfcc737d05b6be8b2c3e931ef99cfc2 Author: Stephen Hemminger Date: Fri Apr 20 16:40:01 2007 -0700 [NET] skbuff: skb_store_bits const is backwards Getting warnings because skb_store_bits has skb as constant, but the function overwrites it.
Looks like const was on the wrong side. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 3e6cf558b0098a15d8c360c4eaad3e4d719a555a Author: Stephen Hemminger Date: Fri Apr 20 16:39:17 2007 -0700 [BRIDGE]: Fix warning in net-2.6.22 The following is leftover from earlier change in net-2.6.22. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 75606dc69adcfff433bca0ff747538d8495da0ab Author: Ralf Baechle Date: Fri Apr 20 16:06:45 2007 -0700 [AX25/NETROM/ROSE]: Convert to use modern wait queue API Signed-off-by: Ralf Baechle Signed-off-by: David S. Miller commit 80feaacb8a6400a9540a961b6743c69a5896b937 Author: Peter P. Waskiewicz Jr Date: Fri Apr 20 16:05:39 2007 -0700 [AF_PACKET]: Add option to return orig_dev to userspace. Add a packet socket option to allow the orig_dev index to be returned to userspace when passing traffic through a decapsulated device, such as the bonding driver. This is very useful for layer 2 traffic being able to report which physical device actually received the traffic, instead of having the encapsulating device hide that information. The new option is called PACKET_ORIGDEV. Signed-off-by: Peter P. Waskiewicz Jr. Signed-off-by: David S. Miller commit 1370b5a59b941ac3873b5e8614d496e9f481d670 Author: YOSHIFUJI Hideaki Date: Fri Apr 20 15:57:45 2007 -0700 [IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 334901700f9f58993ebd7f6136d3f9062460d34d Author: YOSHIFUJI Hideaki Date: Fri Apr 20 15:57:15 2007 -0700 [IPV4] SNMP: Move some statistic bits to net/ipv4/proc.c. This also fixes memory leak in error path. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 49ed67a9eee3c756263feed4474e4fcf5c8eaed2 Author: YOSHIFUJI Hideaki Date: Fri Apr 20 15:56:48 2007 -0700 [IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. 
Miller commit bf99f1bde3b3009af74874f3465f6861431fbb66 Author: YOSHIFUJI Hideaki Date: Fri Apr 20 15:56:20 2007 -0700 [IPV6] SNMP: Netlink interface. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 628a5c561890a9a9a74dea017873530584aab06e Author: John Heffner Date: Fri Apr 20 15:53:27 2007 -0700 [INET]: Add IP(V6)_PMTUDISC_PROBE Add IP(V6)_PMTUDISC_PROBE value for IP(V6)_MTU_DISCOVER. This option forces us not to fragment, but does not make use of the kernel path MTU discovery. That is, it allows for user-mode MTU probing (or, packetization-layer path MTU discovery). This is particularly useful for diagnostic utilities, like traceroute/tracepath. Signed-off-by: John Heffner Signed-off-by: David S. Miller commit b881ef7603230550aa0150b22af94089f07ab00d Author: John Heffner Date: Fri Apr 20 15:52:39 2007 -0700 [IPV6]: MTU discovery check in ip6_fragment() Adds a check in ip6_fragment() mirroring ip_fragment() for packets that we can't fragment, and sends an ICMP Packet Too Big message in response. Signed-off-by: John Heffner Signed-off-by: David S. Miller commit fd44de7cc1d430caef91ad9aecec9ff000fe86f8 Author: Patrick McHardy Date: Mon Apr 16 17:07:08 2007 -0700 [NET_SCHED]: ingress: switch back to using ingress_lock Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks both the ingress and egress qdiscs on the device. All changes to data that might be used on both ingress and egress need to be protected by using qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally the qdisc stats_lock needs to be initialized to ingress_lock for ingress qdiscs. Signed-off-by: Patrick McHardy Signed-off-by: David S.
Miller commit 0463d4ae25771aaf3379bb6b2392f6edf23c2828 Author: Patrick McHardy Date: Mon Apr 16 17:02:10 2007 -0700 [NET_SCHED]: Eliminate qdisc_tree_lock Since we're now holding the rtnl during the entire dump operation, we can remove qdisc_tree_lock, whose only purpose is to protect dump callbacks from concurrent changes to the qdisc tree. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit ffa4d7216e848fbfdcb8e6f0bb66abeaa1888964 Author: Patrick McHardy Date: Wed Apr 25 14:01:17 2007 -0700 [NETLINK]: don't reinitialize callback mutex Don't reinitialize the callback mutex the netlink_kernel_create caller handed in, it is supposed to already be initialized and could already be held by someone. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 6313c1e0992feaee56bc09b85042b3186041fa3c Author: Patrick McHardy Date: Mon Apr 16 17:00:53 2007 -0700 [RTNETLINK]: Remove unnecessary locking in dump callbacks Since we're now holding the rtnl during the entire dump operation, we can remove additional locking for rtnl protected data. This patch does that for all simple cases (dev_base_lock for dev_base walking, RCU protection for FIB rule dumping). Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 1c2d670f3660e9103fdcdca702f6dbf8ea7d6afb Author: Patrick McHardy Date: Mon Apr 16 16:59:10 2007 -0700 [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks Hold rtnl_mutex during the entire netlink dump operation. This allows to simplify locking in the dump callbacks, since they can now rely on that no concurrent changes happen. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit af65bdfce98d7965fbe93a48b8128444a2eea024 Author: Patrick McHardy Date: Fri Apr 20 14:14:21 2007 -0700 [NETLINK]: Switch cb_lock spinlock to mutex and allow to override it Switch cb_lock to mutex and allow netlink kernel users to override it with a subsystem specific mutex for consistent locking in dump callbacks. 
All netlink_dump_start users have been audited not to rely on any side-effects of the previously used spinlock. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit b076deb8498e26c9aa2f44046fe5e9936ae2fb5a Author: Patrick McHardy Date: Thu Apr 12 22:17:05 2007 -0700 [NETFILTER]: ipt_ULOG: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit d3a2c3ca8e7d908824701db978b936d115aea506 Author: Patrick McHardy Date: Thu Apr 12 22:16:38 2007 -0700 [NETFILTER]: nfnetlink_log: remove fallback to group 0 Don't fall back to group 0 if no instance can be found for the given group. This potentially confuses the listener and is not what the user configured. Also remove the ring buffer spamming that happens when rules are set up before the logging daemon is started. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 3b5018d6766186474366f26cc87fba81407b9089 Author: Patrick McHardy Date: Thu Apr 12 22:16:18 2007 -0700 [NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading All LOG targets always use their internal logging function nowadays, so remove the incorrect error message and handle real errors (!= -EEXIST) by failing to load. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit fe6092ea0019cbba5263a915c9ce9f2bf383209e Author: Patrick McHardy Date: Thu Apr 12 22:15:50 2007 -0700 [NETFILTER]: nf_nat: use HW checksumming when possible When mangling packets forwarded to a HW checksumming capable device, offload recalculation of the checksum instead of doing it in software. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit c15bf6e699f4c366f2d1e19ac5d7add21c6b5a19 Author: Bart De Schuymer Date: Thu Apr 12 22:15:06 2007 -0700 [NETFILTER]: ebt_arp: add gratuitous arp filtering The attached patch adds gratuitous arp filtering, more precisely: it allows checking that the IPv4 source address matches the IPv4 destination address inside the ARP header.
It also adds a check for the hardware address type when matching MAC addresses (nothing critical, just for better consistency). Signed-off-by: Bart De Schuymer Acked-by: Carl-Daniel Hailfinger Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 516299d2f5b6f9703b9b388faf91898dc636a678 Author: Michael Milner Date: Thu Apr 12 22:14:23 2007 -0700 [NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic The attached patch by Michael Milner adds support for using iptables and ip6tables on bridged traffic encapsulated in pppoe frames, similar to what's already supported for vlan. Signed-off-by: Michael Milner Signed-off-by: Bart De Schuymer Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 91d73c15cb165195bc8c3d6a35e30df454b1485b Author: Gerrit Renker Date: Fri Apr 20 13:57:21 2007 -0700 [DCCP]: Complete documentation of dccp_sock This fills in missing documentation for dccp_sock fields. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit f73f7097c986aab159491dcded7fc918e76e9ec3 Author: Gerrit Renker Date: Fri Apr 20 13:56:47 2007 -0700 [DCCP]: Debug statements for Elapsed Time option This prints the value of the parsed Elapsed Time when received via a Timestamp Echo option [RFC 4342, 13.3]. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit b2449fdc30ccac550344df5e60d38bb8427b109c Author: Gerrit Renker Date: Fri Apr 20 13:02:55 2007 -0700 [DCCP]: Fix bug in the calculation of very low sending rates This fixes an error in the calculation of t_ipi when X converges towards very low sending rates (between 1 and 64 bytes per second).
Although this case may not sound likely, it can be reproduced by connecting, hitting enter (1 byte sent) and waiting for some time, during which the nofeedback timer halves the sending rate until finally it reaches the region 1..64 bytes/sec. Computing X is handled correctly (tested separately); but by dividing X _before_ entering the calculation of t_ipi, X becomes zero as a result. This in turn triggers a BUG condition caught in scaled_div(). Fixed by replacing with equivalent statement and explicit typecast for good measure. Calculation verified and effect of patch tested - reduced never below 1 byte per 64 seconds afterwards, i.e. not allowing divide-by-zero. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit cb8c181f288a1157bc717cc7a02412d4d1dc19bc Author: David S. Miller Date: Tue Apr 10 22:10:39 2007 -0700 [S390]: Fix build on 31-bit. Allow s390 to properly override the generic __div64_32() implementation by: 1) Using obj-y for div64.o in s390's makefile instead of lib-y 2) Adding the weak attribute to the generic implementation. Signed-off-by: David S. Miller commit efd1e8d569b3d35a3a636683c2a9ebffec9c1fcf Author: Patrick McHardy Date: Tue Apr 10 18:30:09 2007 -0700 [SK_BUFF]: Fix missing offset adjustment in skb_copy_expand skb_copy_expand changes the headroom, so it needs to adjust the header offsets by the difference between the old and the new value. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 33036807b32d5ed1f4fdfe2d5e6bab2bb260b9f7 Author: Eric Dumazet Date: Tue Apr 10 13:25:40 2007 -0700 [NET]: loopback driver can use loopback_dev integrated net_device_stats Rusty added a new 'stats' field to struct net_device. loopback driver can use it instead of declaring another struct net_device_stats This saves some memory. Signed-off-by: Eric Dumazet Signed-off-by: David S. 
Miller commit 87a596e0b8bc344bd6bfebe83b56d11fb79ee23a Author: Akinobu Mita Date: Sat Apr 7 18:57:07 2007 +0900 bridge: check kmem_cache_create() error This patch checks kmem_cache_create() error and aborts loading module on failure. Signed-off-by: Akinobu Mita Signed-off-by: Stephen Hemminger commit ffe1d49cc300f3dff990093aa952a2fbb371c1b6 Author: Stephen Hemminger Date: Mon Apr 9 11:49:58 2007 -0700 bridge: allow changing hardware address to any valid address For the case of bridging pseudo devices that get created/destroyed (Xen), we need to allow setting the address to any valid value. Signed-off-by: Stephen Hemminger commit b86c45035c439cfa6ef5b2e4bf080b24bd8765f1 Author: Stephen Hemminger Date: Thu Mar 22 14:08:46 2007 -0700 bridge: change when netlink events go to STP Need to tell STP daemon about more events, like any time a device is added even when it is down. Signed-off-by: Stephen Hemminger commit 9cde070874b822d4677f4f01fe146991785813b1 Author: Stephen Hemminger Date: Wed Mar 21 14:22:44 2007 -0700 bridge: add support for user mode STP This patchset, based on work by Aji_Srinivas@emc.com, allows spanning tree to be controlled from userspace. Like hotplug, it uses call_usermodehelper when spanning tree is enabled so there is no visible API change. If the call to start usermode STP fails, it falls back to existing kernel STP. Signed-off-by: Stephen Hemminger commit 9cf637473c8535b5abe27fee79254c2d552e042a Author: Stephen Hemminger Date: Mon Apr 9 12:57:54 2007 -0700 bridge: add sysfs hook to flush forwarding table The RSTP daemon needs to be able to flush all dynamic forwarding entries in the case of topology change. This is a temporary interface. It will change to a netlink interface before RSTP daemon is officially released.
Signed-off-by: Stephen Hemminger commit 3f890923182aeebc572f3818dd51c9014827e0ec Author: Stephen Hemminger Date: Wed Mar 21 13:42:33 2007 -0700 bridge: simpler hash with salt Instead of hashing the whole Ethernet address, it should be faster to just use the last 4 bytes. Add a random salt value to the hash to make it more difficult to construct worst case DoS hash chains. Signed-off-by: Stephen Hemminger commit 467aea0ddfd1f0f1158c57cbef0e8941dd63374c Author: Stephen Hemminger Date: Wed Mar 21 13:42:06 2007 -0700 bridge: don't route packets while learning While in the STP learning state, don't route packets; wait until forwarding delay has expired. The purpose of the forwarding delay is to detect loops in the network, and if a brouter started up and started forwarding, it could cause a flood. Signed-off-by: Stephen Hemminger commit 6229e362dd49b9e8387126bd4483ab0574d23e9c Author: Stephen Hemminger Date: Wed Mar 21 13:38:47 2007 -0700 bridge: eliminate call by reference Change the bridging hook to be simple function with return value rather than modifying the skb argument. This could generate better code and is cleaner. Signed-off-by: Stephen Hemminger commit 604763722c655c7e3f31ecf6f7b4dafcd26a7a15 Author: Herbert Xu Date: Mon Apr 9 11:59:39 2007 -0700 [NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY When a transmitted packet is looped back directly, CHECKSUM_PARTIAL maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should treat it as such in the stack. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit 628592ccdbbb5bb751217cf02e2e7abb500d7ffe Author: Herbert Xu Date: Mon Apr 23 17:06:40 2007 -0700 [NETDRV]: Perform missing csum_offset conversions When csum_offset was introduced we did a conversion from csum to csum_offset where applicable. A couple of drivers were missed in this process. It was harmless to begin with since the two fields coincided. 
Now that we've made them different with the addition of csum_start, the missed drivers must be converted or they can't send packets out at all that require checksum offload. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit 663ead3bb8d5b561e70fc3bb3861c9220b5a77eb Author: Herbert Xu Date: Mon Apr 9 11:59:07 2007 -0700 [NET]: Use csum_start offset instead of skb_transport_header The skb transport pointer is currently used to specify the start of the checksum region for transmit checksum offload. Unfortunately, the same pointer is also used during receive side processing. This creates a problem when we want to retransmit a received packet with partial checksums since the skb transport pointer would be overwritten. This patch solves this problem by creating a new 16-bit csum_start offset value to replace the skb transport header for the purpose of checksums. This offset is calculated from skb->head so that it does not have to change when skb->data changes. No extra space is required since csum_offset itself fits within a 16-bit word so we can use the other 16 bits for csum_start. For backwards compatibility, just before we push a packet with partial checksums off into the device driver, we set the skb transport header to what it would have been under the old scheme. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit ac758e3c55c529714354fc268892ca4d23ca1e99 Author: Patrick McHardy Date: Mon Apr 9 11:47:58 2007 -0700 [XFRM]: beet: fix worst case header_len calculation esp_init_state doesn't account for the beet pseudo header in the header_len calculation, which may result in undersized skbs hitting xfrm4_beet_output, causing unnecessary reallocations in ip_finish_output2. The skbs should still always have enough room to avoid causing skb_under_panic in skb_push since we have at least 16 bytes available from LL_RESERVED_SPACE in xfrm_state_check_space. Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit c5c2523893747f88a83376abad310c8ad13f7197 Author: Patrick McHardy Date: Mon Apr 9 11:47:18 2007 -0700 [XFRM]: Optimize MTU calculation Replace the probing based MTU estimation, which usually takes 2-3 iterations to find a fitting value and may underestimate the MTU, by an exact calculation. Also fix underestimation of the XFRM trailer_len, which causes unnecessary reallocations. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 557922584d9c5b6b990bcfb2fec3134f0e73a05d Author: Patrick McHardy Date: Mon Apr 9 11:46:17 2007 -0700 [XFRM]: esp: fix skb_tail_pointer conversion bug Fix incorrect switch of "trailer" skb by "skb" during skb_tail_pointer conversion: - *(u8*)(trailer->tail - 1) = top_iph->protocol; + *(skb_tail_pointer(skb) - 1) = top_iph->protocol; - *(u8 *)(trailer->tail - 1) = *skb_network_header(skb); + *(skb_tail_pointer(skb) - 1) = *skb_network_header(skb); Signed-off-by: Patrick McHardy Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 56eb88828b78f6f3b11a2996350092a40745301f Author: Patrick McHardy Date: Mon Apr 9 11:45:04 2007 -0700 [SK_BUFF]: Fix missing offset adjustment in pskb_expand_head Since we're increasing the headroom, the header offsets need to be increased by the same amount as well. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 29f6af7712c40045e7886d0fa356d97a6f9aba49 Author: YOSHIFUJI Hideaki Date: Fri Apr 6 11:45:39 2007 -0700 [IPV6] FIB6RULE: Find source address during looking up route. When looking up route for destination with rules with source address restrictions, we may need to find a source address for the traffic if not given. Based on patch from Noriaki TAKAMIYA . Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit ea2f10a3c81724701fe6a754789eafd50b33909f Author: Patrick McHardy Date: Thu Apr 5 16:04:04 2007 -0700 [XFRM]: beet: minor cleanups Remove unnecessary initialization/variable. 
Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 038890fed8d1fa95bbbdeb517f5710eb75fa9e2e Author: Thomas Graf Date: Thu Apr 5 14:35:52 2007 -0700 [RTNL]: Improve error codes for unsupported operations The most common trigger of these errors is that the config option hasn't been enabled which would make the functionality available. Therefore returning EOPNOTSUPP gives a better idea on what is going wrong. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 716ea3a7aae3a2bfc44cb97b5419c1c9868c7bc9 Author: David Howells Date: Mon Apr 2 20:19:53 2007 -0700 [NET]: Move generic skbuff stuff from XFRM code to generic code Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can use it too. The kdoc comments I've attached to the functions need to be checked by whoever wrote them as I had to make some guesses about the workings of these functions. Signed-off-By: David Howells Signed-off-by: David S. Miller commit 926554c4b74e53d5da4cefdc3bbd7e92427fb1a9 Author: Arnaldo Carvalho de Melo Date: Sat Mar 31 12:05:49 2007 -0300 [CREDITS]: Update Arnaldo entry Signed-off-by: Arnaldo Carvalho de Melo commit 1a4e2d093fd5f3eaf8cffc04a1b803f8b0ddef6d Author: Arnaldo Carvalho de Melo Date: Sat Mar 31 11:55:45 2007 -0300 [SK_BUFF]: Some more conversions to skb_copy_from_linear_data Signed-off-by: Arnaldo Carvalho de Melo commit 27d7ff46a3498d3debc6ba68fb8014c702b81170 Author: Arnaldo Carvalho de Melo Date: Sat Mar 31 11:55:19 2007 -0300 [SK_BUFF]: Introduce skb_copy_to_linear_data{_offset} To clearly state the intent of copying to linear sk_buffs, _offset being an overly long variant but interesting for the sake of saving some bytes. Signed-off-by: Arnaldo Carvalho de Melo commit 3dbad80ac7632f243b824d469301abb97ec634a1 Author: David S. Miller Date: Thu Mar 29 19:16:03 2007 -0700 [NET]: Fix warnings in 3c523.c and ni52.c We have to put back the cast to "char *" because these pointers are volatile. Reported by Andrew Morton.
Signed-off-by: David S. Miller commit c45d286e72dd72c0229dc9e2849743ba427fee84 Author: Rusty Russell Date: Wed Mar 28 14:29:08 2007 -0700 [NET]: Inline net_device_stats Network drivers which keep stats allocate their own stats structure then write a get_stats() function to return them. It would be nice if this were done by default. 1) Add a new "stats" field to "struct net_device". 2) Add a new feature field to say "this driver uses the internal one" 3) Have a default "get_stats" which returns NULL if that feature is not set. 4) Change callers to check result of get_stats call for NULL, not if ->get_stats is set. This should not break backwards compatibility with older drivers, yet allow modern drivers to shed some boilerplate code. Lightly tested: works for a modified lguest network driver. Signed-off-by: Rusty Russell Signed-off-by: David S. Miller commit f85958151900f9d30fa5ff941b0ce71eaa45a7de Author: Eric Dumazet Date: Wed Mar 28 14:22:33 2007 -0700 [NET]: random functions can use nsec resolution instead of usec In order to get more randomness for secure_tcpv6_sequence_number(), secure_tcp_sequence_number(), secure_dccp_sequence_number() functions, we can use the high resolution time services, providing nanosec resolution. I've also done two kmalloc()/kzalloc() conversions. Signed-off-by: Eric Dumazet Acked-by: James Morris Signed-off-by: David S. Miller commit 4b19ca44cbafabfe0b7b98e2e24b21a96198f509 Author: Thomas Graf Date: Wed Mar 28 14:18:52 2007 -0700 [NET] fib_rules: delay route cache flush by ip_rt_min_delay Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit d626f62b11e00c16e81e4308ab93d3f13551812a Author: Arnaldo Carvalho de Melo Date: Tue Mar 27 18:55:52 2007 -0300 [SK_BUFF]: Introduce skb_copy_from_linear_data{_offset} To clearly state the intent of copying from linear sk_buffs, _offset being an overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo commit 2a123b86e2b242a4a6db990d2851d45e192f88e5 Author: Arnaldo Carvalho de Melo Date: Tue Mar 27 18:38:07 2007 -0300 [BLUETOOTH]: Introduce skb->data accessor methods for hci_{acl,event,sco}_hdr For consistency with other skb data accessors, reducing the number of direct accesses to skb->data. Signed-off-by: Arnaldo Carvalho de Melo commit 03d4f879b9ddf7d5c1f788792247e62450342eed Author: Eric Dumazet Date: Tue Mar 27 14:18:34 2007 -0700 [IPV4]: align inet_protos[] on SMP As IPPROTO_TCP is 6, it makes sense to make sure inet_protos[] array is properly cache line aligned to avoid false sharing on SMP. c0680540 b peer_total c0680544 b inet_peer_unused_head c0680560 B inet_protos In this i386 example, we can see that inet_protos[IPPROTO_TCP] shares a potentially hot (and modified) cache line. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit 4103f8cd5c1f260d674a7b426ed221812de54d47 Author: Eric Dumazet Date: Tue Mar 27 13:58:31 2007 -0700 [TCP]: tcp_memory_pressure and tcp_socket are __read_mostly candidates tcp_memory_pressure and tcp_socket currently share a cache line with tcp_memory_allocated, tcp_sockets_allocated. (Very hot cache line) It makes sense to declare these variables as __read_mostly, to avoid false sharing on SMP. ffffffff8081d9c0 B tcp_orphan_count ffffffff8081d9c4 B tcp_memory_allocated ffffffff8081d9c8 B tcp_sockets_allocated ffffffff8081d9cc B tcp_memory_pressure ffffffff8081d9d0 b tcp_md5sig_users ffffffff8081d9d8 b tcp_md5sig_pool ffffffff8081d9e0 b warntime.31570 ffffffff8081d9e8 b tcp_socket Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit 73417f617a93cf30342c3ea41abc38927bd467aa Author: Thomas Graf Date: Tue Mar 27 13:56:52 2007 -0700 [NET] fib_rules: Flush route cache after rule modifications The results of FIB rules lookups are cached in the routing cache except for IPv6 as no such cache exists.
So far, it was the responsibility of the user to flush the cache after modifying any rules. This led to many false bug reports due to misunderstanding of this concept. This patch automatically flushes the route cache after inserting or deleting a rule. Thanks to Muli Ben-Yehuda for catching a bug in the previous patch. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit be776281aee54626a474ba06f91926b98bdd180d Author: Eric Dumazet Date: Tue Mar 27 13:53:04 2007 -0700 [NET]: inet_ehash_secret should be __read_mostly and set only once There is a very tiny probability that build_ehash_secret() is called at the same time by different CPUs. Also, using __read_mostly is a must for inet_ehash_secret. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit 35fc92a9deee0da6e35fdc3150bb134e58f2fd63 Author: Herbert Xu Date: Mon Mar 26 23:22:20 2007 -0700 [NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE Right now Xen has a horrible hack that lets it forward packets with partial checksums. One of the reasons that CHECKSUM_PARTIAL and CHECKSUM_COMPLETE were added is so that we can get rid of this hack (where it creates two extra bits in the skbuff to essentially mirror ip_summed without being destroyed by the forwarding code). I had forgotten that I've already gone through all the device drivers last time around to make sure that they're looking at ip_summed == CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit. In any case, I've now done that again so it should definitely be safe. Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE values on forward so I'm setting that to CHECKSUM_NONE. This should be safe to remove for bridging but I'd like to check that code path first. So here is the patch that lets us get rid of the hack by preserving ip_summed (mostly) on forwarded packets. Signed-off-by: Herbert Xu Signed-off-by: David S.
Miller commit 2d771cd86d4c3af26f34a7bcdc1b87696824cad9 Author: Janusz Krzysztofik Date: Mon Mar 26 18:03:44 2007 -0700 [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed This is a small patch by Janusz Krzysztofik to ip_route_output_slow() that allows VIP-less LVS linux director to generate packets originating from VIP if sysctl_ip_nonlocal_bind is set. In a nutshell, the intention is for an LVS linux director to be able to send ICMP unreachable responses to end-users when real-servers are removed. http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html Signed-off-by: Simon Horman Signed-off-by: David S. Miller commit fa0b2d1d2196dd46527a8d028797e2bca5930a92 Author: Thomas Graf Date: Mon Mar 26 17:38:53 2007 -0700 [NET] fib_rules: Add no-operation action The use of nop rules simplifies the usage of goto rules and adds more flexibility as they allow targets to remain while the actual content of the branches can change easily. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 2b44368307cd06c5614d7b53801f516c0654020b Author: Thomas Graf Date: Mon Mar 26 17:37:59 2007 -0700 [NET] fib_rules: Mark rules detached from the device Rules which match against device names in their selector can remain while the device itself disappears, in fact the device doesn't have to be present when the rule is added in the first place. The device name is resolved by trying when the rule is added and later by listening to NETDEV_REGISTER/UNREGISTER notifications. This patch adds the flag FIB_RULE_DEV_DETACHED which is set towards userspace when a rule contains a device match which is unresolved at the moment. This eases spotting the reason why certain rules seem not to function properly. Signed-off-by: Thomas Graf Signed-off-by: David S.
Miller commit 0947c9fe56d9cf7ad0bc3a03ccd30446cde698e4 Author: Thomas Graf Date: Mon Mar 26 17:14:15 2007 -0700 [NET] fib_rules: goto rule action This patch adds a new rule action FR_ACT_GOTO which allows skipping a set of rules by jumping to another rule. The rule to jump to is specified via the FRA_GOTO attribute which carries a rule preference. Referring to a rule which doesn't exist is explicitly allowed. Such goto rules are marked with the flag FIB_RULE_UNRESOLVED and will act like a rule with a non-matching selector. The rule will become functional as soon as its target is present. The goto action enables performance optimizations by reducing the average number of rules that have to be passed per lookup. Example: 0: from all lookup local 40: not from all to 192.168.23.128 goto 32766 41: from all fwmark 0xa blackhole 42: from all fwmark 0xff blackhole 32766: from all lookup main Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 2f7826c02447480c7c1b5500b34fc783f1ed8145 Author: David S. Miller Date: Mon Mar 26 02:00:58 2007 -0700 [WAN] cosa.c: Build fix. Caused by skb_reset_mac_header() changes, missing semicolon. Signed-off-by: David S. Miller commit 85795d64eddd4546375f5ee37bedd88cb5bc4ece Author: Stephen Hemminger Date: Sat Mar 24 21:35:33 2007 -0700 [TCP] tcp_probe: improvements for net-2.6.22 Change tcp_probe to use ktime (needed to add one export). Add option to only get events when cwnd changes - from Doug Leith Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit e1c3e7ab6de9711d2e0e9daf369c6638582eb7e7 Author: Stephen Hemminger Date: Sat Mar 24 21:34:38 2007 -0700 [TCP]: cubic update for net-2.6.22 The following update received from Injong updates TCP cubic to the latest version. I am running more complete tests and will have results after 4/1. According to Injong: the new version improves on its scalability, fairness and stability. So in all properties, we confirmed it shows better performance.
NCSU results (for 2.6.18 and 2.6.20) available: http://netsrv.csc.ncsu.edu/wiki/index.php/TCP_Testing This version is described in a new Internet draft for CUBIC. http://www.ietf.org/internet-drafts/draft-rhee-tcp-cubic-00.txt Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 9af3912ec9e30509b76cb376abb65a4d8af27df3 Author: John Heffner Date: Sun Mar 25 23:32:29 2007 -0700 [NET] Move DF check to ip_forward Do fragmentation check in ip_forward, similar to ipv6 forwarding. Signed-off-by: John Heffner Signed-off-by: David S. Miller commit b3da2cf37c5c6e47698957a25ab43a7223dbb90f Author: David S. Miller Date: Fri Mar 23 11:40:27 2007 -0700 [INET]: Use jhash + random secret for ehash. The days are gone when this was not an issue, there are folks out there with huge bot networks that can be used to attack the established hash tables on remote systems. So just like the routing cache and connection tracking hash, use Jenkins hash with random secret input. Signed-off-by: David S. Miller commit d30045a0bcf144753869175dd9d840f7ceaf4aba Author: Johannes Berg Date: Fri Mar 23 11:37:48 2007 -0700 [NETLINK]: introduce NLA_BINARY type This patch introduces a new NLA_BINARY attribute policy type with the verification of simply checking the maximum length of the payload. It also fixes a small typo in the example. Signed-off-by: Johannes Berg Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 703315712cfccfe0b45ef4aa6994527d8ee95e33 Author: Vlad Yasevich Date: Fri Mar 23 11:34:36 2007 -0700 [SCTP]: Implement SCTP_MAX_BURST socket option. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit a5a35e76753d27e782028843a5186f176b50dd16 Author: Vlad Yasevich Date: Fri Mar 23 11:34:08 2007 -0700 [SCTP]: Implement sac_info field in SCTP_ASSOC_CHANGE notification. 
As stated in the sctp socket api draft: sac_info: variable If the sac_state is SCTP_COMM_LOST and an ABORT chunk was received for this association, sac_info[] contains the complete ABORT chunk as defined in the SCTP specification RFC2960 [RFC2960] section 3.3.7. We now save received ABORT chunks into the sac_info field and pass that to the user. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit bdf3092af601ccad765974652ab103162fbe14f4 Author: Vlad Yasevich Date: Fri Mar 23 11:33:12 2007 -0700 [SCTP]: Honor flags when setting peer address parameters Parameters only take effect when a corresponding flag bit is set and a value is specified. This means we need to check the flags in addition to checking for non-zero value. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit 1ae4114dce35dd1d32ed847f60b599dbbdfd5829 Author: Vlad Yasevich Date: Fri Mar 23 11:32:26 2007 -0700 [SCTP]: Implement SCTP_ADDR_CONFIRMED state for ADDR_CHANGE event Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit d49d91d79a8dc5e85108a5ae1c8eef23dec135c1 Author: Vlad Yasevich Date: Fri Mar 23 11:32:00 2007 -0700 [SCTP]: Implement SCTP_PARTIAL_DELIVERY_POINT option. This option induces partial delivery to run as soon as the specified amount of data has been accumulated on the association. However, we give preference to fully reassembled messages over PD messages. In any case, window and buffer are freed up. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit b6e1331f3ce25a56edb956054eaf8011654686cb Author: Vlad Yasevich Date: Fri Apr 20 12:23:15 2007 -0700 [SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option This option was introduced in draft-ietf-tsvwg-sctpsocket-13. It prevents head-of-line blocking in the case of one-to-many endpoint. Applications enabling this option really must enable SCTP_SNDRCV event so that they would know where the data belongs. Based on an earlier patch by Ivan Skytte Jørgensen.
Additionally, this functionality now permits multiple associations on the same endpoint to enter Partial Delivery. Applications should be extra careful, when using this functionality, to track EOR indicators. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller commit c95e939508e64863a1c5c73a9e1a908784e06820 Author: Patrick McHardy Date: Fri Mar 23 11:30:04 2007 -0700 [NET_SCHED]: qdisc: remove unnecessary memory barriers We're holding dev->queue_lock in qdisc_watchdog_schedule and qdisc_watchdog_cancel, no need for the barriers. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit a48b5a61448899040dfbd2e0cd55b06a2bd2466c Author: Patrick McHardy Date: Fri Mar 23 11:29:43 2007 -0700 [NET_SCHED]: Uninline tcf_destroy Uninline tcf_destroy and add a helper function to destroy an entire filter chain. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 3bebcda28077375470dd60545b71bba2f83335fd Author: Patrick McHardy Date: Fri Mar 23 11:29:25 2007 -0700 [NET_SCHED]: turn PSCHED_GET_TIME into inline function Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 03cc45c0a5b9b7f74768feb43b9a2525d203bbdb Author: Patrick McHardy Date: Fri Mar 23 11:29:11 2007 -0700 [NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function Also rename to psched_tdiff_bounded. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 8edc0c31d6b7849b0fb50db86824830769241939 Author: Patrick McHardy Date: Fri Mar 23 11:28:55 2007 -0700 [NET_SCHED]: kill PSCHED_TDIFF Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit a084980dcbf56c896e4b6c19aff2b082d5db7006 Author: Patrick McHardy Date: Fri Mar 23 11:28:30 2007 -0700 [NET_SCHED]: kill PSCHED_SET_PASTPERFECT/PSCHED_IS_PASTPERFECT Use direct assignment and comparison instead. Signed-off-by: Patrick McHardy Signed-off-by: David S.
Miller commit 104e0878984bb467e3f54d61105d8903babb4ec1 Author: Patrick McHardy Date: Fri Mar 23 11:28:07 2007 -0700 [NET_SCHED]: kill PSCHED_TLESS Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 7c59e25f3186f26e85b13a318dbc4482d1d363e9 Author: Patrick McHardy Date: Fri Mar 23 11:27:45 2007 -0700 [NET_SCHED]: kill PSCHED_TADD/PSCHED_TADD2 Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 26e252df1e6e5b68eb790e4a4baf745aa3870038 Author: Patrick McHardy Date: Fri Mar 23 11:27:29 2007 -0700 [NET_SCHED]: kill PSCHED_AUDIT_TDIFF Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 76d643cd3bd2b4a1e27e3eafee8e37be9c681792 Author: Patrick McHardy Date: Fri Mar 23 11:27:04 2007 -0700 [NET_SCHED]: sch_netem: fix off-by-one in send time comparison netem checks PSCHED_TLESS(cb->time_to_send, now) to find out whether it is allowed to send a packet, which is equivalent to cb->time_to_send < now. Use !PSCHED_TLESS(now, cb->time_to_send) instead to properly handle cb->time_to_send == now. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit c7bf5f9dc2f78ae8ebbfffc5f17becd0d9e6ba9e Author: Thomas Graf Date: Fri Mar 23 11:17:57 2007 -0700 [NETFILTER] nfnetlink: netlink_run_queue() already checks for NLM_F_REQUEST Patrick has made use of netlink_run_queue() in nfnetlink while my patches have been waiting for net-2.6.22 to open. So this check for NLM_F_REQUEST can go as well. Signed-off-by: Thomas Graf Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit de6e05c49f8b4ed63224c5d38891f531ecc4eabb Author: Yasuyuki Kozakai Date: Fri Mar 23 11:17:27 2007 -0700 [NETFILTER]: nf_conntrack: kill destroy() in struct nf_conntrack for diet The destructor per conntrack is unnecessary, then this replaces it with system wide destructor. Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit 5f79e0f916a3bdeccc910fdf466bca582a9b2cca Author: Yasuyuki Kozakai Date: Fri Mar 23 11:17:07 2007 -0700 [NETFILTER]: nf_conntrack: don't use nfct in skb if conntrack is disabled Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit e6f689db51a789807edede411b32eb7c9e457948 Author: Patrick McHardy Date: Fri Mar 23 11:16:30 2007 -0700 [NETFILTER]: Use setup_timer Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 9afdb00c80b0b9c20435ce690b5287fa2434ef44 Author: Patrick McHardy Date: Fri Mar 23 11:12:50 2007 -0700 [NETFILTER]: nfnetlink_log: remove conditional locking This is gross, have the wrapper function take the lock. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 370e6a878962cad614eb8c7c5a22240e5cd316bb Author: Michal Miroslaw Date: Fri Mar 23 11:12:21 2007 -0700 [NETFILTER]: nfnetlink_log: micro-optimization: inst->skb != NULL in __nfulnl_send() No other function calls __nfulnl_send() with inst->skb == NULL than nfulnl_timer(). Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit f76cdcee5ba4a3fb41de93d5f1c17fb6ab4d0820 Author: Michal Miroslaw Date: Fri Mar 23 11:12:03 2007 -0700 [NETFILTER]: nfnetlink_log: iterator functions need iter_state * only get_*() don't need access to seq_file - iter_state is enough for them. Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 9a36e8c2b337c424ed77f5dea0a67dc8039d351b Author: Michal Miroslaw Date: Fri Mar 23 11:11:48 2007 -0700 [NETFILTER]: nfnetlink_log: micro-optimization: don't modify destroyed instance Simple micro-optimization: Don't change any options if the instance is being destroyed. Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit f414c16c04b1c998e90370791f9a728e292146ea Author: Michal Miroslaw Date: Fri Mar 23 11:11:31 2007 -0700 [NETFILTER]: nfnetlink_log: micro-optimization for inst==NULL in nfulnl_recv_config() Simple micro-optimization: don't call instance_put() on known NULL pointers. Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 55b5a91e1723280570957990a0c5ab8c3ec4090a Author: Michal Miroslaw Date: Fri Mar 23 11:11:05 2007 -0700 [NETFILTER]: nfnetlink_log: kill duplicate code Kill some duplicate code in nfulnl_log_packet(). Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 09972d6f968d67dd82cbd403d5aa42c241a8d0cb Author: Michal Miroslaw Date: Fri Mar 23 11:10:47 2007 -0700 [NETFILTER]: nfnetlink_log: don't count max(a,b) twice We don't need local nlbufsiz (skb size) as nfulnl_alloc_skb() takes the maximum anyway. Signed-off-by: Michal Miroslaw Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 1b53d9042c04b8eb875d02e65792e9884efc3784 Author: Patrick McHardy Date: Fri Mar 23 11:10:13 2007 -0700 [NETFILTER]: Remove changelogs and CVS IDs Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit bb2f8cc0ecf025d6d3947e0389434650023f432e Author: Stephen Hemminger Date: Fri Mar 23 00:12:09 2007 -0700 [NETEM]: spelling errors Get rid of some of my creative spelling. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c702e8047fe74648f7852a9c1de781b0d5a98402 Author: Thomas Graf Date: Thu Mar 22 23:30:55 2007 -0700 [NETLINK]: Directly return -EINTR from netlink_dump_start() Now that all users of netlink_dump_start() use netlink_run_queue() to process the receive queue, it is possible to return -EINTR from netlink_dump_start() directly, therefore simplifying the callers. Signed-off-by: Thomas Graf Signed-off-by: David S.
Miller commit ead592ba246dfcc643b3f0f0c8c03f7bc898a59f Author: Thomas Graf Date: Thu Mar 22 23:30:35 2007 -0700 [IPv4] diag: Use netlink_run_queue() to process the receive queue Makes use of netlink_run_queue() to process the receive queue and converts inet_diag_rcv_msg() to use the type safe netlink interface. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 1d00a4eb42bdade33a6ec0961cada93577a66ae6 Author: Thomas Graf Date: Thu Mar 22 23:30:12 2007 -0700 [NETLINK]: Remove error pointer from netlink message handler The error pointer argument in netlink message handlers is used to signal the special case where processing has to be interrupted because a dump was started but no error happened. Instead it is simpler and more clear to return -EINTR and have netlink_run_queue() deal with getting the queue right. nfnetlink passed on this error pointer to its subsystem handlers but only uses it to signal the start of a netlink dump. Therefore it can be removed there as well. This patch also cleans up the error handling in the affected message handlers to be consistent since it had to be touched anyway. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 45e7ae7f716086994e4e747226881f901c67b031 Author: Thomas Graf Date: Thu Mar 22 23:29:10 2007 -0700 [NETLINK]: Ignore control messages directly in netlink_run_queue() Changes netlink_rcv_skb() to skip netlink control messages and not pass them on to the message handler. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit d35b685640aeb39eb4f5e98c75e8e001e406f9a3 Author: Thomas Graf Date: Thu Mar 22 23:28:46 2007 -0700 [NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue() netlink_rcv_skb() is changed to skip messages which don't have the NLM_F_REQUEST bit to avoid every netlink family having to perform this check on their own. Signed-off-by: Thomas Graf Signed-off-by: David S.
Miller commit 33a0543cd9e090d2c6759e0ed85c3049c6efcc06 Author: Thomas Graf Date: Thu Mar 22 23:27:39 2007 -0700 [NETLINK]: Remove unused groups variable Leftover from dynamic multicast groups allocation work. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 267281058c4cfd6a9a173aa957bffa58239f9656 Author: Thomas Graf Date: Thu Mar 22 23:27:19 2007 -0700 [TCP] westwood: Use type safe netlink interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit e9195d677d6f06730edd5c2a3fe3283564e39c51 Author: Thomas Graf Date: Thu Mar 22 23:27:01 2007 -0700 [TCP] vegas: Use type safe netlink interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 51057f2fecff1c520b083c5ac9229e7aebce9e01 Author: Thomas Graf Date: Thu Mar 22 21:41:06 2007 -0700 [RTNL]: Properly return rtnl message handler Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 1936502d00ae6c2aa3931c42f6cf54afaba094f2 Author: Stephen Hemminger Date: Thu Mar 22 12:18:35 2007 -0700 [NET_SCHED] qdisc: avoid transmit softirq on watchdog wakeup If possible, avoid having to do a transmit softirq when a qdisc watchdog decides to re-enable. The watchdog routine runs off a timer, so it is already in the same effective context as the softirq. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 11274e5a43266d531140530adebead6903380caf Author: Stephen Hemminger Date: Thu Mar 22 12:17:42 2007 -0700 [NETEM]: avoid excessive requeues The netem code would call getnstimeofday() and dequeue/requeue after every packet, even if it was waiting. Avoid this overhead by using the throttled flag. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 075aa573b74a732aeff487ab77d3fbd627c10856 Author: Stephen Hemminger Date: Thu Mar 22 12:17:05 2007 -0700 [NETEM]: Optimize tfifo In most cases, the next packet will be sent after the last one. So optimize that case. Signed-off-by: Stephen Hemminger Signed-off-by: David S.
Miller commit b407621c35ed5f9a0734e57472e9539117963768 Author: Stephen Hemminger Date: Thu Mar 22 12:16:21 2007 -0700 [NETEM]: use better types for time values The random number generator always generates 32 bit values. The time values are limited by psched_tdiff_t Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit a362e0a7890c735a3ef63aab12d71ecfc6e6f4a5 Author: Stephen Hemminger Date: Thu Mar 22 12:15:45 2007 -0700 [NETEM]: report reorder percent correctly. If you set up netem to just delay packets, "tc qdisc ls" will report the reordering as 100%. Well it's a lie, reorder isn't used unless gap is set, so just set value to 0 so the output of utility is correct. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 7e58886b45bc4a309aeaa8178ef89ff767daaf7f Author: Stephen Hemminger Date: Thu Mar 22 12:10:58 2007 -0700 [TCP]: cubic optimization Use willy's work in optimizing cube root by having table for small values. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 22b9a0a3a49ab1a856e0853b3f3dd2abd156bd7c Author: Stephen Hemminger Date: Thu Mar 22 12:10:18 2007 -0700 [LIB]: div64_64 optimization Minor optimization of div64_64. do_div() already does optimization for the case of 32 by 32 divide, so no need to do it here. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c454673da7c1d6533f40ec2f788023df9af56ebf Author: Thomas Graf Date: Sun Mar 25 23:24:24 2007 -0700 [NET] rules: Unified rules dumping Implements a unified, protocol independent rules dumping function which is capable of dumping either a specific protocol family or all of them. This speeds up dumping as fewer lookups are required. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 687ad8cc640fd1f1619cc44a9ab274dabd48c758 Author: Thomas Graf Date: Thu Mar 22 11:59:42 2007 -0700 [RTNL]: Use rtnl registration interface for dump-all aliases Signed-off-by: Thomas Graf Signed-off-by: David S.
Miller commit 32fe21c0c0a3091552fea8f2f7e4905f547a3433 Author: Thomas Graf Date: Thu Mar 22 11:59:03 2007 -0700 [BRIDGE]: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit c127ea2c45d1b13a672fde254679721bb282e90a Author: Thomas Graf Date: Thu Mar 22 11:58:32 2007 -0700 [IPv6]: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit fa34ddd739cecf3999ec0b7562618e8321829d41 Author: Thomas Graf Date: Thu Mar 22 11:57:46 2007 -0700 [DECNet]: Use rtnl registration interface Signed-off-by: Thomas Graf Acked-by: Steven Whitehouse Signed-off-by: David S. Miller commit 708914cc5e1657eb1a1f9eefc6333dfd2df8c73a Author: Thomas Graf Date: Thu Mar 22 11:56:59 2007 -0700 [PKT_SCHED] act: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 82623c0d73bd111cad26e501e509966b2455b0e0 Author: Thomas Graf Date: Thu Mar 22 11:56:22 2007 -0700 [PKT_SCHED] cls: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit be577ddc2b4aca0849f701222f5bc13cf1b79c9a Author: Thomas Graf Date: Thu Mar 22 11:55:50 2007 -0700 [PKT_SCHED] qdisc: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 63f3444fb9a54c024d55f1205f8b94e7d2786595 Author: Thomas Graf Date: Thu Mar 22 11:55:17 2007 -0700 [IPv4]: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 9d9e6a5819230b5a5cc036f213135cb123ab1e50 Author: Thomas Graf Date: Sun Mar 25 23:20:05 2007 -0700 [NET] rules: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit c8822a4e00442e65d42d50db8e529d75c2025630 Author: Thomas Graf Date: Thu Mar 22 11:50:06 2007 -0700 [NEIGH]: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. 
Miller commit 340d17fc9d577c93678850e46963e9b19b92db7e Author: Thomas Graf Date: Thu Mar 22 11:49:22 2007 -0700 [NET] link: Use rtnl registration interface Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit e284986385b6420a5f30f2dcd743512bbe1a3202 Author: Thomas Graf Date: Thu Mar 22 11:48:11 2007 -0700 [RTNL]: Message handler registration interface This patch adds a new interface to register rtnetlink message handlers replacing the exported rtnl_links[] array which required many message handlers to be exported unnecessarily. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller commit 30833ffead66e1f0052150a51db0b45151189ac1 Author: Gerrit Renker Date: Tue Mar 20 15:31:56 2007 -0300 [CCID3]: Use initial RTT sample from SYN exchange The patch follows the following recommendation made in an erratum to RFC 4342: "Senders MAY additionally make use of other available RTT measurements, including those from the initial Request-Response packet exchange." It implements larger initial windows with regard to this initial RTT measurement, using the mechanism suggested in draft-ietf-dccp-rfc3448bis, section 4.2. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 89560b53b92a07c529e13a462aa7fd87a844f1f5 Author: Gerrit Renker Date: Tue Mar 20 15:27:17 2007 -0300 [DCCP]: Sample RTT from SYN exchange Function: commit 7dfee1a9c07f80a82aa5fbad340146f2b5c794b4 Author: Gerrit Renker Date: Tue Mar 20 15:24:37 2007 -0300 [CCID3]: Use function for RTT sampling This replaces the existing occurrences of RTT sampling with the use of the new function dccp_sample_rtt. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S.
Miller commit 4712a792ee661921374c163eb6a4d06e33fd305f Author: Gerrit Renker Date: Tue Mar 20 15:23:18 2007 -0300 [DCCP]: Provide function for RTT sampling A recurring problem, in particular in the CCID code, is that RTT samples from packets with timestamp echo and elapsed time options need to be taken. This service is provided via a new function dccp_sample_rtt in this patch. Furthermore, to protect against `insane' RTT samples, the sampled value is bounded between 100 microseconds and 4 seconds - for which u32 is sufficient. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0c150efb280986db7958cf2a559b91d826241e59 Author: Gerrit Renker Date: Tue Mar 20 15:19:07 2007 -0300 [CCID3]: Handle Idle and Application-Limited periods This updates the code with regard to handling idle and application-limited periods as specified in [RFC 4342, 5.1]. Background: commit a21f9f96cd035b0d9aec32d80ea0152672fbed42 Author: Gerrit Renker Date: Tue Mar 20 15:12:10 2007 -0300 [CCID3]: Wrap computation of RFC3390-initial rate into separate function The CCID 3 and TFRC specs (RFC 4342, RFC 3448, draft-3448bis) make frequent reference to the computation of the RFC-3390 initial sending rate: 1. Initial sending rate when RTT is known (RFC 4342, p. 6) 2. Response to Idle/Application-Limited periods (RFC 4342, 5.1) This warrants putting the code into its own function, for later code reuse. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit 1761f7d7fea32c2290710f5c0afa0c3d93220593 Author: Gerrit Renker Date: Tue Mar 20 15:04:30 2007 -0300 [CCID3]: Remove build warnings for 64bit This clears the following sparc64 build warnings: 1) warning: format "%ld" expects type "long int", but argument 3 has type "suseconds_t" 2) warning: format "%llu" expects type "long long unsigned int", but argument 3 has type "__u64" Fixed by using typecast to unsigned. This is argued to be safe, since the quantities, after de-scaling (factor 2^6) fit all in u32. Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit fddc2feb94c1f734dc27d44d166e97ab2e005ec1 Author: Gerrit Renker Date: Tue Mar 20 15:02:10 2007 -0300 [CCID3]: More to see in dccp_probe This adds a few more fields of interest to /proc/net/dccpprobe, the following output ensues: 1 2 3 4 5 6 7 8 9 10 11 sec.usec src:sport dst:dport size s rtt p X_calc X_recv X t_ipi Also made the formatting consistent. Scripts that go with this can be downloaded from http://139.133.210.30/users/gerrit/dccp/dccp_probe/ Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit f2645101350c6db66f0a1e72648909cc411f2b38 Author: Gerrit Renker Date: Tue Mar 20 15:01:14 2007 -0300 [CCID3]: Add documentation for socket options This updates the documentation on CCID3-specific options. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 6626e3628fe42837f733d103e194c6b4473d8669 Author: Gerrit Renker Date: Tue Mar 20 15:00:28 2007 -0300 [DCCP]: More debug information for dccp_wait_for_ccid This adds more detail in the wait_for_ccid packet scheduling loop. In particular, it informs about (i) when delay is used and (ii) why a packet is discarded. 
Signed-off-by: Gerrit Renker Signed-off-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit ac12b0c49571fe4c3a2f4957ed494da316d558be Author: Gerrit Renker Date: Tue Mar 20 14:59:23 2007 -0300 [DCCP]: Always use debug-toggle parameters Currently debugging output (when configured) is automatically enabled when DCCP modules are compiled into the kernel rather than built as loadable modules. This is not necessary, since the module parameters in this case become kernel commandline parameters, e.g. DCCP or CCID3 debug output can be enabled for a static build by appending the following at the boot prompt: dccp.dccp_debug=1 dccp_ccid3.ccid3_debug=1 This patch therefore does away with the more complicated way of always enabling debug output for static builds Signed-off-by: Gerrit Renker Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 1266adee12d25385a25e1c57b1e3ff05a90bb4d7 Author: Gerrit Renker Date: Tue Mar 20 14:56:11 2007 -0300 [CCID3]: Remove race condition and update t_ipi when `s' changes This: 1. removes a race condition in the access to the scheduled send time t_nom which results from allowing asynchronous r/w access to t_nom without locks; 2. updates the inter-packet interval t_ipi = s/X when `s' changes, following a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 8699be7d240e37c91a84bdf32e79941d72bc7bd5 Author: Ian McDonald Date: Tue Mar 20 14:49:20 2007 -0300 [CCID3]: More verbose debugging This adds a few debugging statements to ccid3.c Signed-off-by: Ian McDonald Signed-off-by: Gerrit Renker Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 551dc5f7a11cfb66685bfd36cbbdb209c5a11d14 Author: Ian McDonald Date: Tue Mar 20 14:46:52 2007 -0300 [CCID3]: Fix use of invalid loss intervals This fixes a bug which uses an invalid comparison. 
The bug resulted in the use of invalid loss intervals. Signed-off-by: Ian McDonald Acked-by: Gerrit Renker Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 371fe7779cad6557a58df9a1b5543652e067400f Author: Gerrit Renker Date: Tue Mar 20 14:28:44 2007 -0300 [CCID3]: Use MSS for larger initial windows This improves the slow-start phase by using the MSS (as suggested in RFC 4342, sec. 5) instead of the packet size s. Also figured out that __u32 is ample resource enough. After applying, I got the following in the logs: ccid3_hc_tx_packet_recv: client(f7421700), s=6, MSS=1424, w_init=4380, R_sample=176us, X=24886363 Had the previous variant been used, w_init would have been as low as 24. Committer note: removed unneeded cast to unsigned long long that was causing a compiler warning on 64bit architectures. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 9bf17475eb658a920125bd8f05edf9c57c2dd950 Author: Gerrit Renker Date: Tue Mar 20 13:11:24 2007 -0300 [CCID3]: Re-order CCID 3 source file No code change at all. This splits ccid3.c into a RX and a TX section, so that the file has an organisation similar to the other ones (e.g. packet_history.{h,c}). Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 353b13e10a3f1a18c6b33858fb3337bcd2692eb5 Author: Gerrit Renker Date: Tue Mar 20 13:10:15 2007 -0300 [CCID3]: Remove redundant `len' test Since CCID3 avoids sending 0-byte data packets (cf. ccid3_hc_tx_send_packet), testing for zero-payload length, as performed by ccid3_hc_tx_update_s, is redundant - hence removed by this patch. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit 8d13bf9a0bd4984756e234ce54299b92acefab99 Author: Gerrit Renker Date: Tue Mar 20 13:08:19 2007 -0300 [DCCP]: Remove ambiguity in the way before48 is used This removes two ambiguities in employing the new definition of before48, following the analysis on http://www.mail-archive.com/dccp@vger.kernel.org/msg01295.html (1) Updating GSR when P.seqno >= S.SWL With the old definition we did not update when P.seqno and S.SWL are 2^47 apart. To ensure the same behaviour as with the old definition, this is replaced with the equivalent condition dccp_delta_seqno(S.SWL, P.seqno) >= 0 (2) Sending SYNC when P.seqno >= S.OSR Here it is debatable whether the new definition causes an ambiguity: the case is similar to (1); and to have consistency with the case (1), we use the equivalent condition dccp_delta_seqno(S.OSR, P.seqno) >= 0 Detailed Justification commit b16be51b5e5d75cec71b18ebc75f15a4734c62ad Author: Gerrit Renker Date: Tue Mar 20 13:03:47 2007 -0300 [DCCP]: Fix for follows48 The follows48 relation identifies whether 48-bit sequence number x is the direct successor of y. Currently, it does not handle cases of the following type correctly: follows48(0x(prefix)10000LL, 0x(prefix)0FFFFLL) where prefix is an arbitrary hex sequence of up to 7 digits. This is fixed by reusing the new dccp_delta_seqno function. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit d52de17b8cf36d43a9d6977e7861a9f415541c6b Author: Gerrit Renker Date: Tue Mar 20 13:00:26 2007 -0300 [DCCP]: Make `before' relation unambiguous Problem: commit 0aec51c86986f61de26dd04913667af544a8b8eb Author: Gerrit Renker Date: Tue Mar 20 12:45:59 2007 -0300 [DCCP]: Make dccp_delta_seqno return signed numbers Problem: commit 6b811d43f6cc9eccdfc011a99f8571df2abc46d1 Author: Gerrit Renker Date: Tue Mar 20 12:26:51 2007 -0300 [DCCP]: 48-bit sequence number arithmetic This patch * organizes the sequence arithmetic functions into one corner of dccp.h * performs a small modification of dccp_set_seqno to make it more widely reusable (now it is safe to use any number, since it performs modulo-2^48 assignment) * adds functions and generic macros for 48-bit sequence arithmetic: --48 bit complement --modulo-48 addition and modulo-48 subtraction --dccp_inc_seqno now a special case of add48 Constants renamed following a suggestion by Arnaldo. Signed-off-by: Gerrit Renker Acked-by: Ian McDonald Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 8b5be26831b973d8013e8b4c9860d9694310cdc6 Author: Arnaldo Carvalho de Melo Date: Tue Mar 20 12:08:20 2007 -0300 [FORCEDETH]: Use skb_tailroom where appropriate Reducing the number of skb->data direct accesses. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit d004b8d4903180c111e114726982c194adf2a04f Author: Arnaldo Carvalho de Melo Date: Tue Mar 20 12:00:44 2007 -0300 [LMC]: lmc_main wants to use skb_tailroom At that point it is equivalent to what was being used, skb->end - skb->data, and the need is clearly the one skb_tailroom satisfies. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit f2adc9866742e7904f0268824edc53c948741415 Author: Arnaldo Carvalho de Melo Date: Tue Mar 20 11:52:34 2007 -0300 [ATM] idt77252: Fix double kfree_skb on failure in push_rx_skb Signed-off-by: Arnaldo Carvalho de Melo commit 6b88dd966b42e374dc783c397efc15f5c1458265 Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 22:29:03 2007 -0300 [SK_BUFF] ipv6: Use skb_network_offset in some more places So that we reduce the number of direct accesses to skb->data. Signed-off-by: Arnaldo Carvalho de Melo commit dc5fc579b90ed0a9a4e55b0218cdbaf0a8cf2e67 Author: Arnaldo Carvalho de Melo Date: Sun Mar 25 23:06:12 2007 -0700 [NETLINK]: Use nlmsg_trim() where appropriate Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit a36ca733375860b389c15ffdf6a5f92df64a33b6 Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 22:28:08 2007 -0300 [NETLINK]: Remove NLMSG_{NEW_ANSWER,CANCEL,END} Not used anywhere and defined inside __KERNEL__, Thomas acked this on irc. Signed-off-by: Arnaldo Carvalho de Melo commit 897933bcdf31c372e029dd4e2ecd573ebe6cfd9c Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 22:27:36 2007 -0300 [SK_BUFF]: Remove skb_add_mtu() leftovers Signed-off-by: Arnaldo Carvalho de Melo commit b529ccf2799c14346d1518e9bdf1f88f03643e99 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 19:08:35 2007 -0700 [NETLINK]: Introduce nlmsg_hdr() helper For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the number of direct accesses to skb->data and for consistency with all the other cast skb member helpers. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 965ffea43d4ebe8cd7b9fee78d651268dd7d23c5 Author: Robert Olsson Date: Mon Mar 19 16:29:58 2007 -0700 [IPV4]: fib_trie root node settings The threshold for root node can be more aggressive set to get better tree compression. 
The new setting makes the root grow from 16 to 19 bits and gives a substantial improvement in average depth (Aver depth) with the current table of 214393 prefixes. But really the dynamic resize needs more investigation, both in terms of convergence and performance, and maybe it should be possible to change... Maybe just for the brave to start with, or we may have to back this out. commit 05eee48c5af8213a71bd908ce17f577b2b776f79 Author: Robert Olsson Date: Mon Mar 19 16:27:37 2007 -0700 [IPV4]: fib_trie resize break The patch below adds a break condition for the resize operations. If we don't achieve the desired fill factor a warning is printed. The trie should still be operational but new thresholds should be considered. Signed-off-by: Robert Olsson Signed-off-by: David S. Miller commit ca0605a7c8a42379c695308944b3ae82a85479f1 Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 10:48:59 2007 -0300 [SK_BUFF]: Adjust the zeroing up to tail in __alloc_skb too I did it just in alloc_skb_from_cache, forgot __alloc_skb, fixed now. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 4305b541357ddbd205aa145dc378926b7cb12283 Author: Arnaldo Carvalho de Melo Date: Thu Apr 19 20:43:29 2007 -0700 [SK_BUFF]: Convert skb->end to sk_buff_data_t Now to convert the last one, skb->data, that will allow many simplifications and removal of some of the offset helpers. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 27a884dc3cb63b93c2b3b643f5b31eed5f8a4d26 Author: Arnaldo Carvalho de Melo Date: Thu Apr 19 20:29:13 2007 -0700 [SK_BUFF]: Convert skb->tail to sk_buff_data_t So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit be8bd86321fa7f06359d866ef61fb4d2f3e9dce9 Author: David S. Miller Date: Thu Apr 19 20:34:51 2007 -0700 [VLAN] vlan_dev: Use skb_reset_network_header(). Signed-off-by: David S. Miller commit afdf27c95629634ea40703197b6788e454d31609 Author: Peter Kovar Date: Fri Mar 16 20:39:25 2007 -0700 [IrDA]: SMC SuperIO Chip LPC47N227 not identified properly SMC SuperIO Chip LPC47N227 used for IrDA is not detected because its device identification byte can be 0x7A instead of 0x5A. Patch from Peter Kovar Cc: Jean Delvare Signed-off-by: Andrew Morton Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit c7630a4b932af254d61947a3a7e3831de92c7fb5 Author: Samuel Ortiz Date: Fri Mar 16 20:38:23 2007 -0700 [IrDA]: irda lockdep annotation Rmmoding irda triggers a lockdep false positive. Reported-by: Dave Jones Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller commit 5c81cd75fa63eaf2df0b8904508e53e953f316cf Author: Samuel Ortiz Date: Fri Mar 16 20:35:25 2007 -0700 [IrDA]: removing stir4200 useless include stir4200 doesn't need to include irlap.h Signed-off-by: Samuel Ortiz Signed-off-by: David S. 
Miller commit 2e07fa9cd3bac1e28cfe3131ed86b053afb02fc9 Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 21:22:35 2007 -0700 [SK_BUFF]: Use offsets for skb->{mac,network,transport}_header on 64bit architectures With this we save 8 bytes per network packet, leaving a 4 bytes hole to be used in further shrinking work, likely with the offsetization of other pointers, such as ->{data,tail,end}, at the cost of adds, which are minimized by the usual practice of setting skb->{mac,nh,h}.raw to a local variable that is then accessed multiple times in each function. It is also no more expensive than before with regard to most of the handling of such headers, like setting one of these headers to another (transport to network, etc), or subtracting, adding to/from it, comparing them, etc. Now we have this layout for sk_buff on a x86_64 machine:

[acme@mica net-2.6.22]$ pahole vmlinux sk_buff
struct sk_buff {
        struct sk_buff * next;                    /*   0   8 */
        struct sk_buff * prev;                    /*   8   8 */
        struct rb_node rb;                        /*  16  24 */
        struct sock * sk;                         /*  40   8 */
        ktime_t tstamp;                           /*  48   8 */
        struct net_device * dev;                  /*  56   8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct net_device * input_dev;            /*  64   8 */
        sk_buff_data_t transport_header;          /*  72   4 */
        sk_buff_data_t network_header;            /*  76   4 */
        sk_buff_data_t mac_header;                /*  80   4 */

        /* XXX 4 bytes hole, try to pack */

        struct dst_entry * dst;                   /*  88   8 */
        struct sec_path * sp;                     /*  96   8 */
        char cb[48];                              /* 104  48 */
        /* cacheline 2 boundary (128 bytes) was 24 bytes ago*/
        unsigned int len;                         /* 152   4 */
        unsigned int data_len;                    /* 156   4 */
        unsigned int mac_len;                     /* 160   4 */
        union {
                __wsum csum;                      /*       4 */
                __u32 csum_offset;                /*       4 */
        };                                        /* 164   4 */
        __u32 priority;                           /* 168   4 */
        __u8 local_df:1;                          /* 172   1 */
        __u8 cloned:1;                            /* 172   1 */
        __u8 ip_summed:2;                         /* 172   1 */
        __u8 nohdr:1;                             /* 172   1 */
        __u8 nfctinfo:3;                          /* 172   1 */
        __u8 pkt_type:3;                          /* 173   1 */
        __u8 fclone:2;                            /* 173   1 */
        __u8 ipvs_property:1;                     /* 173   1 */

        /* XXX 2 bits hole, try to pack */

        __be16 protocol;                          /* 174   2 */
        void (*destructor)(struct sk_buff *);     /* 176   8 */
        struct nf_conntrack * nfct;               /* 184   8 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        struct sk_buff * nfct_reasm;              /* 192   8 */
        struct nf_bridge_info *nf_bridge;         /* 200   8 */
        __u16 tc_index;                           /* 208   2 */
        __u16 tc_verd;                            /* 210   2 */
        dma_cookie_t dma_cookie;                  /* 212   4 */
        __u32 secmark;                            /* 216   4 */
        __u32 mark;                               /* 220   4 */
        unsigned int truesize;                    /* 224   4 */
        atomic_t users;                           /* 228   4 */
        unsigned char * head;                     /* 232   8 */
        unsigned char * data;                     /* 240   8 */
        unsigned char * tail;                     /* 248   8 */
        /* --- cacheline 4 boundary (256 bytes) --- */
        unsigned char * end;                      /* 256   8 */
}; /* size: 264, cachelines: 5 */
   /* sum members: 260, holes: 1, sum holes: 4 */
   /* bit holes: 1, sum bit holes: 2 bits */
   /* last cacheline: 8 bytes */

On 32 bits nothing changes, and pointers continue to be used with the compiler turning all this abstraction layer into dust. But there are some sk_buff validation tricks that are now possible, humm... :-) Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit b0e380b1d8a8e0aca215df97702f99815f05c094 Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 21:21:55 2007 -0700 [SK_BUFF]: unions of just one member don't get anything done, kill them Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and skb->mac to skb->mac_header, to match the names of the associated helpers (skb[_[re]set]_{transport,network,mac}_header). Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit cfe1fc7759fdacb0c650b575daed1692bf3eaece Author: Arnaldo Carvalho de Melo Date: Fri Mar 16 17:26:39 2007 -0300 [SK_BUFF]: Introduce skb_network_header_len For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len, that is precalculated tho, don't think we need to bloat skb with one more member, so just use this new helper, reducing the number of non-skbuff.h references to the layer headers even more. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S.
Miller commit bff9b61ce330df04c6830d823c30c04203543f01 Author: Arnaldo Carvalho de Melo Date: Fri Mar 16 17:19:57 2007 -0300 [SK_BUFF]: Use the helpers to get the layer header pointer Some more cases... Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 514bca322cb9220308d22691ac1e74038bfabac3 Author: Patrick McHardy Date: Fri Mar 16 12:34:52 2007 -0700 [NET_SCHED]: Fix warning net/sched/sch_api.c: In function 'psched_show': net/sched/sch_api.c:1219: warning: format '%08x' expects type 'unsigned int', but argument 6 has type 's64' Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit bb239acf5679ee1936f6b1b034ad260c4fec89c8 Author: Patrick McHardy Date: Fri Mar 16 12:31:28 2007 -0700 [NET_SCHED]: sch_cbq: fix watchdog scheduled too late q->now is increased during dequeue and doesn't contain the current time afterwards, resulting in a too large timeout value for the qdisc watchdog. Use "now" instead, which still contains the current time. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 4361cb17f0df5491fe6e2c3ae1defc98e9a64a79 Author: Patrick McHardy Date: Fri Mar 16 01:23:28 2007 -0700 [NET_SCHED]: Export real timer resolution in /proc/net/psched The timer resolution exported in /proc/net/psched is used by userspace to calculate HTB's burst values. Currently it is set to HZ, since we're now using hrtimers, use KTIME_MONOTONIC_RES, which makes HTB use smaller burst values. This patch also affects libnl, which incorrectly uses this value for the SFQ perturbation parameter, which is always in seconds, and some routing cache values, which are in USER_HZ, so both cases are broken anyway. Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit 00c04af9df3d26e5a8093da850e982a7b6aeada7 Author: Patrick McHardy Date: Fri Mar 16 01:23:02 2007 -0700 [NET_SCHED]: kill jiffie conversion macros Now that all packet schedulers have been converted to hrtimers most users of PSCHED_JIFFIE2US and PSCHED_US2JIFFIE are gone. The remaining users use it to convert external time units to packet scheduler clock ticks, so use PSCHED_TICKS_PER_SEC instead. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit fb983d4578e238b7f483b4f8f39f3a0f35d34d16 Author: Patrick McHardy Date: Fri Mar 16 01:22:39 2007 -0700 [NET_SCHED]: sch_htb: use hrtimer based watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 1a13cb63d679da328cfa339c89b8b2d0eba3b81e Author: Patrick McHardy Date: Fri Mar 16 01:22:20 2007 -0700 [NET_SCHED]: sch_cbq: use hrtimer for delay_timer Switch delay_timer to hrtimer. The class penalty parameter is changed to use psched ticks as units. Since iproute never supported using this and the only existing user (libnl) incorrectly assumes psched ticks as units anyway, this shouldn't break anything. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit e9054a339eb275c756efeeaee42af484ac72a3f4 Author: Patrick McHardy Date: Fri Mar 16 01:21:40 2007 -0700 [NET_SCHED]: sch_cbq: fix cbq_undelay_prio for non-active priorites cbq_undelay_prio is supposed to return a time delta, but returns the current time for non-active priorities, causing cbq_undelay to mark the priority as active and schedule a timer for twice the current time. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 88a993540a65c38865f83961520494b4ad5d0363 Author: Patrick McHardy Date: Fri Mar 16 01:21:11 2007 -0700 [NET_SCHED]: sch_cbq: use hrtimer based watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit 59cb5c6734021acc68590c7c2e0e92ad9a4952c6 Author: Patrick McHardy Date: Fri Mar 16 01:20:31 2007 -0700 [NET_SCHED]: sch_netem: use hrtimer based watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit f7f593e383145931cb2a65df62c31ce1bcc0cffc Author: Patrick McHardy Date: Fri Mar 16 01:20:07 2007 -0700 [NET_SCHED]: sch_tbf: use hrtimer based watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit ed2b229a97fd537857ad8441ab8b5996b15eadfd Author: Patrick McHardy Date: Fri Mar 16 01:19:33 2007 -0700 [NET_SCHED]: sch_hfsc: use hrtimer based watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 4179477f637caa730626bd597fdf28c5bad73565 Author: Patrick McHardy Date: Fri Mar 16 01:19:15 2007 -0700 [NET_SCHED]: Add hrtimer based qdisc watchdog Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 641b9e0e8b7f96425da6ce98f3361e3af0baee29 Author: Patrick McHardy Date: Fri Mar 16 01:18:42 2007 -0700 [NET_SCHED]: Use ktime as clocksource Get rid of the manual clock source selection mess and use ktime. Also use a scalar representation, which allows to clean up pkt_sched.h a bit more and results in less ktime_to_ns() calls in most cases. The PSCHED_US2JIFFIE/PSCHED_JIFFIE2US macros are implemented quite inefficient by this patch, following patches will convert all qdiscs to hrtimers and get rid of them entirely. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit ddc7b8e32b22fe8b45d306b7d99472d4b560add6 Author: Arnaldo Carvalho de Melo Date: Thu Mar 15 21:42:27 2007 -0300 [SK_BUFF]: Some more layer header conversions Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0a6114d94b6d6f82e81cb8e0d8b0d4cf50739fec Author: Arnaldo Carvalho de Melo Date: Thu Mar 15 21:08:55 2007 -0300 [KBUILD]: Unifdef headers changed by the skb layer header refactorings Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit d10ba34b001944a8d1c8adb5646140ef089c432b Author: Arnaldo Carvalho de Melo Date: Wed Mar 14 21:05:37 2007 -0300 [SK_BUFF]: More skb_put related skb_reset_transport_header This time we have to set it to skb->tail, which is no longer equal to skb->data, so we either add a new helper or just add the skb->tail - skb->data offset; for now do the latter. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 55f79cc0c02f9ce8f85e965e9679796f62b790f5 Author: Arnaldo Carvalho de Melo Date: Wed Mar 14 21:05:03 2007 -0300 [IPV6]: Reset the network header in ip6_nd_hdr ip6_nd_hdr is always called immediately after an alloc_skb + skb_reserve sequence, i.e. when skb->tail is equal to skb->data, making it correct to use skb_reset_network_header(). Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit eeeb03745bf9ea352df2373b9cb5fa14e60a2de0 Author: Arnaldo Carvalho de Melo Date: Wed Mar 14 21:04:34 2007 -0300 [SK_BUFF]: More skb_put related conversions to skb_reset_transport_header This is similar to the skb_reset_network_header() case, i.e. at the point we reset the transport header pointer/offset skb->tail is equal to skb->data. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit ac6d141dc7d1d0eeec850d1b451dca83ce649684 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:45:39 2007 -0700 [NETFILTER]: nfnetlink: parse attributes with nfattr_parse in nfnetlink_check_attribute Use nfattr_parse to parse attributes. This patch also modifies the default behaviour, since unknown attributes will now be ignored instead of returning EINVAL. This ensures backward compatibility: new libraries with new attributes can work with old kernels. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S.
Miller commit c8e2078cfe414a99cf6f2f2f1d78c7e75392e9d4 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:45:19 2007 -0700 [NETFILTER]: ctnetlink: add support for internal tcp connection tracking flags handling This patch lets userspace programs set the IP_CT_TCP_BE_LIBERAL flag to force the pickup of established connections. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 5c8ce7c92106434d2bdc9d5dfa5f62bf4546b296 Author: Willy Tarreau Date: Wed Mar 14 16:44:53 2007 -0700 [NETFILTER]: TCP conntrack: factorize out the PUSH flag The PUSH flag is accepted with every other valid combination. Let's get it out of the tcp_valid_flags table and reduce the number of combinations we have to handle. This does not significantly reduce the table size however (8 bytes). Signed-off-by: Willy Tarreau Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 8f5bd99071212cd16b3449d16639971a44540d51 Author: Willy Tarreau Date: Wed Mar 14 16:44:31 2007 -0700 [NETFILTER]: TCP conntrack: accept RST|PSH as valid This combination has been encountered on an IBM AS/400 in response to packets sent to a closed session. There is no particular reason to mark it invalid. Signed-off-by: Willy Tarreau Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit e7ac05f3407a3fb5a1b2ff5d5554899eaa0a10a3 Author: Yasuyuki Kozakai Date: Wed Mar 14 16:44:01 2007 -0700 [NETFILTER]: nf_conntrack: add nf_copy() to safely copy members in skb This unifies the code that copies netfilter-related data. Before copying, nf_copy() puts (releases) the original members of the destination skb. Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit edda553c324bdc5bb5c2d553b524cab37058a855 Author: Yasuyuki Kozakai Date: Wed Mar 14 16:43:37 2007 -0700 [NETFILTER]: nf_conntrack: add __nf_copy() to copy members in skb This unifies the code that copies netfilter-related data.
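As a rough illustration of how the two helpers named in these commits differ, the sketch below models reference-counted netfilter state in plain userspace C. The `struct sketch_ref`, `struct sketch_pkt` and `sketch_*` function names are hypothetical stand-ins, not the kernel's types or API; only the ordering (drop whatever the destination already holds, then copy and take a new reference) follows the commit text.

```c
#include <stddef.h>

/* Hypothetical refcounted object standing in for conntrack state. */
struct sketch_ref {
	int count;
};

/* Hypothetical packet with a single netfilter-related member. */
struct sketch_pkt {
	struct sketch_ref *nfct;
};

static void sketch_ref_get(struct sketch_ref *r)
{
	if (r)
		r->count++;
}

static void sketch_ref_put(struct sketch_ref *r)
{
	if (r)
		r->count--;
}

/* Models __nf_copy(): copies the member and takes a reference.
 * It assumes dst holds nothing yet; an existing reference in dst
 * would be overwritten without being released. */
static void sketch_nf_copy_unchecked(struct sketch_pkt *dst,
				     const struct sketch_pkt *src)
{
	dst->nfct = src->nfct;
	sketch_ref_get(src->nfct);
}

/* Models nf_copy(): releases what dst held first, then copies. */
static void sketch_nf_copy(struct sketch_pkt *dst,
			   const struct sketch_pkt *src)
{
	sketch_ref_put(dst->nfct);
	sketch_nf_copy_unchecked(dst, src);
}
```

In this model the unchecked variant is only appropriate for a freshly initialised destination, while the checked one is safe for a destination that may already carry state.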
Note that __nf_copy() assumes destination skb doesn't have any netfilter related members. Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 9b88790972498d235a2a4d2b66640c3c5b70bb7c Author: Sami Farin Date: Wed Mar 14 16:43:00 2007 -0700 [NETFILTER]: nf_conntrack: use jhash2 in __hash_conntrack Now it uses jhash, but using jhash2 would be around 3-4 times faster (on P4). Signed-off-by: Sami Farin Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 8e87e014ec881ce353e1f43340157f519b5d9f30 Author: Patrick McHardy Date: Wed Mar 14 16:42:29 2007 -0700 [JHASH]: Use const in jhash2 Use const to avoid forcing users to cast const data. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit f4bc177f0ff0bf41b178452877762a9f0184d1a1 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:42:11 2007 -0700 [NETFILTER]: nfnetlink: move EXPORT_SYMBOL declarations next to the exported symbol Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 8a2e89533a9b06bc960445dd6034eeab76117424 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:41:47 2007 -0700 [NETFILTER]: nfnetlink: remove unused includes in nfnetlink.c Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit ac0f1d9894650d900af99bdaed83e110d9dce025 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:41:28 2007 -0700 [NETFILTER]: nfnetlink: remove unrequired check in nfnetlink_get_subsys subsys_table is initialized to NULL, therefore just returns NULL in case that it is not set. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit d9e6d029498ab9e943c70f24c027aeda5602196d Author: Pablo Neira Ayuso Date: Wed Mar 14 16:41:03 2007 -0700 [NETFILTER]: nfnetlink: remove duplicate checks in nfnetlink_check_attributes Remove nfnetlink_check_attributes duplicates message size and callback id checks. 
nfnetlink_find_client and nfnetlink_rcv_msg already do such checks. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 67ca396606432aae3b747d5e6bb61d0c297eb782 Author: Pablo Neira Ayuso Date: Wed Mar 14 16:40:38 2007 -0700 [NETFILTER]: nfnetlink: remove early debugging messages from nfnetlink Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 010c7d6f867e98c86723f420d485583464fbab45 Author: Patrick McHardy Date: Wed Mar 14 16:40:10 2007 -0700 [NETFILTER]: nf_conntrack: uninline notifier registration functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 73c361862c2be2e4ed6019da283fe1b422107f16 Author: Patrick McHardy Date: Wed Mar 14 16:39:45 2007 -0700 [NETFILTER]: nfnetlink: use netlink_run_queue() Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit a3c5029cf7a96da3acdf6884a21581b5bef310c3 Author: Patrick McHardy Date: Wed Mar 14 16:39:25 2007 -0700 [NETFILTER]: nfnetlink: use mutex instead of semaphore Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit c6a1e615d1ba942b9e783079d53f741e4a8e1c89 Author: Patrick McHardy Date: Wed Mar 14 16:39:07 2007 -0700 [NETFILTER]: nf_conntrack: simplify l4 protocol array allocation The retrying after an allocation failure is not necessary anymore since we're holding the mutex the entire time, for the same reason the double allocation race can't happen anymore. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 0661cca9c216322e77dca7f47df107c02ce4e70c Author: Patrick McHardy Date: Wed Mar 14 16:38:48 2007 -0700 [NETFILTER]: nf_conntrack: simplify protocol locking Now that we don't use nf_conntrack_lock anymore but a single mutex for all protocol handling, no need to release and grab it again for sysctl registration. Signed-off-by: Patrick McHardy Signed-off-by: David S. 
Miller commit ac5357ebac43e191003c2cd0722377dccfa01a84 Author: Patrick McHardy Date: Wed Mar 14 16:38:25 2007 -0700 [NETFILTER]: nf_conntrack: remove ugly hack in l4proto registration Remove ugly special-casing of nf_conntrack_l4proto_generic, all it wants is its sysctl tables registered, so do that explicitly in an init function and move the remaining protocol initialization and cleanup code to nf_conntrack_proto.c as well. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit b19caa0ca071dce76b0e81e957e7eb7c03d72cf5 Author: Patrick McHardy Date: Wed Mar 14 16:37:52 2007 -0700 [NETFILTER]: nf_conntrack: switch protocol registration/unregistration to mutex The protocol lookups done by nf_conntrack are already protected by RCU, there is no need to keep taking nf_conntrack_lock for registration and unregistration. Switch to a mutex. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 587aa64163bb14f70098f450abab9410787fce9d Author: Patrick McHardy Date: Wed Mar 14 16:37:25 2007 -0700 [NETFILTER]: Remove IPv4 only connection tracking/NAT Remove the obsolete IPv4 only connection tracking/NAT as scheduled in feature-removal-schedule. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit ce18afe57bf53477f133208856dd2b7e6b5db5e3 Author: Tobias Klauser Date: Wed Mar 14 16:36:16 2007 -0700 [NETFILTER]: x_tables: remove duplicate of xt_prefix Remove xt_proto_prefix array which duplicates xt_prefix and change all users of xt_proto_prefix to xt_prefix. Signed-off-by: Tobias Klauser Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller commit 239254fedcbc6ff79bcf5696fe94723f7a5d0782 Author: David S. Miller Date: Thu Apr 19 19:55:44 2007 -0700 [IPV4] xfrm4_mode_beet: Use skb_transport_header(). Signed-off-by: David S. 
Miller commit 9c70220b73908f64792422a2c39c593c4792f2c5 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 18:04:18 2007 -0700 [SK_BUFF]: Introduce skb_transport_header(skb) For the places where we need a pointer to the transport header, it is still legal to touch skb->h.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit a27ef749e7be3b06fb58df53d94eb97a21f18707 Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 17:17:10 2007 -0300 [SCTP]: Eliminate some pointer attributions to the skb layer headers Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit bd82393ca23324d103b21aae43160728da6e6c9c Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 17:10:43 2007 -0300 [SK_BUFF]: More skb_reset_transport_header conversions These are a bit more subtle, they are of this type: - skb->h.raw = payload; __skb_pull(skb, payload - skb->data); + skb_reset_transport_header(skb); __skb_pull results in: skb->data = skb->data + payload - skb->data; skb->data = payload; So after __skb_pull we have skb->data pointing to payload and we can just call skb_reset_transport_header(skb), that will do: skb->h.raw = payload; The others are similar, allowing us to get rid of some more cases where a pointer was being attributed to the layer headers. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 39b89160df691045d1449cbaef43c02084c7543a Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 21:06:25 2007 -0700 [SK_BUFF]: Introduce ipipv6_hdr(), remove skb->h.ipv6h Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit b0061ce49c83657563b64ffcf1ec137110230d93 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 18:02:22 2007 -0700 [SK_BUFF]: Introduce ipip_hdr(), remove skb->h.ipiph Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit aa8223c7bb0b05183e1737881ed21827aa5b9e73 Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 21:04:22 2007 -0700 [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit ab6a5bb6b28a970104a34f0f6959b73cf61bdc72 Author: Arnaldo Carvalho de Melo Date: Sun Mar 18 17:43:48 2007 -0700 [TCP]: Introduce tcp_hdrlen() and tcp_optlen() The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to avoid the longer, open coded equivalent. Ditched a no-op in bnx2 in the process. I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()... Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 88c7664f13bd1a36acb8566b93892a4c58759ac6 Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 14:43:18 2007 -0300 [SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 4bedb45203eab92a87b4c863fe2d0cded633427f Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 14:28:48 2007 -0300 [SK_BUFF]: Introduce udp_hdr(), remove skb->h.uh Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit d9edf9e2be0f7661558984c32bd53867a7037fd3 Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 14:19:23 2007 -0300 [SK_BUFF]: Introduce igmp_hdr() & friends, remove skb->h.igmph Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit cc70ab261c9f997589546100ddec5da6bfd89c4e Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 14:03:22 2007 -0300 [ICMP6]: Introduce icmp6_hdr() For consistency with all the other skb->h.raw accessors. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 2c0fd387b00a6758550b5ca1aae4408374483fe7 Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 13:59:32 2007 -0300 [SCTP]: Introduce sctp_hdr() For consistency with all the other skb->h.raw accessors. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit 967b05f64e27d04a4c8879addd0e1c52137e2c9e Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 13:51:52 2007 -0300 [SK_BUFF]: Introduce skb_set_transport_header For the cases where the transport header is being set to a offset from skb->data. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit ea2ae17d6443abddc79480dc9f7af8feacabddc4 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 17:55:53 2007 -0700 [SK_BUFF]: Introduce skb_transport_offset() For the quite common 'skb->h.raw - skb->data' sequence. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit badff6d01a8589a1c828b0bf118903ca38627f4e Author: Arnaldo Carvalho de Melo Date: Tue Mar 13 13:06:52 2007 -0300 [SK_BUFF]: Introduce skb_reset_transport_header(skb) For the common, open coded 'skb->h.raw = skb->data' operation, so that we can later turn skb->h.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple cases: skb->h.raw = skb->data; skb->h.raw = {skb_push|[__]skb_pull}() The next ones will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0660e03f6b18f19b6bbafe7583265a51b90daf36 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 17:54:47 2007 -0700 [SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h Now the skb->nh union has just one member, .raw, i.e. it is just like the skb->mac union, strange, no? I'm just leaving it like that till the transport layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or ->mac_header_offset?), ditto for ->{h,nh}. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit d0a92be05ed4aea7d35c2b257e3f9173565fe4eb Author: Arnaldo Carvalho de Melo Date: Mon Mar 12 20:56:31 2007 -0300 [SK_BUFF]: Introduce arp_hdr(), remove skb->nh.arph Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit fd74e6ccd522e2f26163eb5ac1abebcab2bd017c Author: Stephen Hemminger Date: Mon Mar 12 16:25:32 2007 -0700 [BRIDGE]: faster compare for link local addresses Use logic operations rather than memcmp() to compare destination address with link local multicast addresses. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit eddc9ec53be2ecdbf4efe0efd4a83052594f0ac0 Author: Arnaldo Carvalho de Melo Date: Fri Apr 20 22:47:35 2007 -0700 [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit e023dd643798c4f06c16466af90b4d250e4b8bd7 Author: Arnaldo Carvalho de Melo Date: Mon Mar 12 20:09:36 2007 -0300 [IPMR]: Fix bug introduced when converting to skb_network_reset_header Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit c9bdd4b5257406b0608385d19c40b5511decf4f6 Author: Arnaldo Carvalho de Melo Date: Mon Mar 12 20:09:15 2007 -0300 [IP]: Introduce ip_hdrlen() For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open coded skb->nh.iph uses, now to go after the rest... Just out of curiosity, here are the idioms found to get the same result: skb->nh.iph->ihl << 2 skb->nh.iph->ihl<<2 skb->nh.iph->ihl * 4 skb->nh.iph->ihl*4 (skb->nh.iph)->ihl * sizeof(u32) Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0272ffc46f81a4bbbf302ba093c737e969c5bb55 Author: Arnaldo Carvalho de Melo Date: Mon Mar 12 20:05:39 2007 -0300 [SK_BUFF] ipmr: Missed one conversion to skb_network_header() We can't access skb->nh.raw directly anymore, it will become an offset. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0e1256ffd1ec654b35e023c66f6b262d4cba91e9 Author: Stephen Hemminger Date: Mon Mar 12 14:35:37 2007 -0700 [NET]: show bound packet types Show what protocols are bound to what packet types in /proc/net/ptype Uses kallsyms to decode function pointers if possible. 
Example: Type Device Function ALL eth1 packet_rcv_spkt+0x0 0800 ip_rcv+0x0 0806 arp_rcv+0x0 86dd :ipv6:ipv6_rcv+0x0 Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit f690808e17925fc45217eb22e8670902ecee5c1b Author: Stephen Hemminger Date: Mon Mar 12 14:34:29 2007 -0700 [NET]: make seq_operations const The seq_file operations stuff can be marked constant to get it out of dirty cache. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 6b2bedc3a659ba228a93afc8e3f008e152abf18a Author: Stephen Hemminger Date: Mon Mar 12 14:33:50 2007 -0700 [NET]: network dev read_mostly For Eric, mark packet type and network device watermarks as read mostly. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c14d2450cb7fe1786e2ec325172baf66922bf597 Author: Arnaldo Carvalho de Melo Date: Sun Mar 11 22:39:41 2007 -0300 [SK_BUFF]: Introduce skb_set_network_header For the cases where the network header is being set to a offset from skb->data. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 878c814500b123dd61a5e211879a32e5fd932713 Author: Arnaldo Carvalho de Melo Date: Sun Mar 11 22:38:29 2007 -0300 [SK_BUFF] ipmr: Another skb_push related conversion to skb_reset_network_header Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit d56f90a7c96da5187f0cdf07ee7434fe6aa78bbc Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 20:50:43 2007 -0700 [SK_BUFF]: Introduce skb_network_header() For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit bbe735e4247dba32568a305553b010081c8dea99 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 22:16:10 2007 -0300 [SK_BUFF]: Introduce skb_network_offset() For the quite common 'skb->nh.raw - skb->data' sequence. 
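A minimal userspace sketch of what such a helper wraps, assuming a toy structure: struct toy_skb and the toy_* functions below are hypothetical names used for illustration only, not the kernel's struct sk_buff or its real accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct sk_buff; field names are illustrative only. */
struct toy_skb {
	unsigned char *head;   /* start of the allocated buffer */
	unsigned char *data;   /* start of the current payload */
	unsigned char *nh_raw; /* network header, kept as a pointer here */
};

/* Pointer to the network header (models skb->nh.raw). */
static unsigned char *toy_network_header(const struct toy_skb *skb)
{
	return skb->nh_raw;
}

/* The open coded 'skb->nh.raw - skb->data' sequence behind one name. */
static int toy_network_offset(const struct toy_skb *skb)
{
	return (int)(toy_network_header(skb) - skb->data);
}

#ifdef TOY_SKB_DEMO
int main(void)
{
	unsigned char buf[64];
	struct toy_skb skb = { .head = buf, .data = buf + 14, .nh_raw = buf + 14 };

	printf("%d\n", toy_network_offset(&skb)); /* prints 0: header is at data */
	skb.data += 20;                           /* step past a 20 byte header */
	printf("%d\n", toy_network_offset(&skb)); /* prints -20 */
	return 0;
}
#endif
```

Building with -DTOY_SKB_DEMO enables the small demo. Because callers go through the helper, the representation of the header field could later change from a pointer to an offset without touching them.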
Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit e7dd65dafda5737a983c04d652a69ab8da78ee3f Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 20:09:45 2007 -0300 [SK_BUFF] bonding: Set skb->nh.raw relative to skb->mac.raw Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 7f5c0cb05f158ee91414e1f99d3fe18349a80371 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:59:16 2007 -0300 [SK_BUFF] xfrm4: use skb_reset_network_header Setting it to skb->h.raw, which is valid, in the (to become) old pointer based world order and in the new world of offset based layer headers. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 1ced98e81d1c2f1ce965ecf8d0032e02ffa07bf0 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:57:15 2007 -0300 [SK_BUFF] ipv6: More skb_reset_network_header conversions related to skb_pull Now related to this form: skb->nh.ipv6h = (struct ipv6hdr *)skb_put(skb, length); That, as the others, is done when skb->tail is still equal to skb->data, making the conversion to skb_reset_network_header possible. Also one more case equivalent to skb->nh.raw = skb->data, of this form: iph = (struct ipv6hdr *)skb->data; skb->nh.ipv6h = iph; Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 8856dfa3e9b71ac2177016f66ace3a8978afecc1 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:40:39 2007 -0300 [SK_BUFF]: Use skb_reset_network_header after skb_push Some more cases where skb->nh.iph was being set that were converted to using skb_reset_network_header. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 04b964dbad25cbd6edd8ecbeca2efb40c9860865 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:27:27 2007 -0300 [SK_BUFF] ipconfig: Another conversion to skb_reset_network_header related to skb_put boot_pkt->iph is the first member, that is at skb->data, so just use skb_reset_network_header(). 
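The reasoning can be checked with a small userspace model. This is only a sketch under stated assumptions: struct toy_boot_pkt, struct toy_skb and the toy_* helpers are hypothetical stand-ins, not the real ipconfig or sk_buff definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical headers, sized like IPv4 and UDP headers without options. */
struct toy_iphdr  { uint8_t bytes[20]; };
struct toy_udphdr { uint8_t bytes[8]; };

/* Hypothetical packet layout: the IP header is the first member. */
struct toy_boot_pkt {
	struct toy_iphdr  iph;  /* offset 0 */
	struct toy_udphdr udph;
};

/* Toy stand-in for struct sk_buff. */
struct toy_skb {
	unsigned char *data;
	unsigned char *tail;
	unsigned char *nh_raw;
};

/* Models skb_put(): grow the tail, return the old tail. */
static unsigned char *toy_skb_put(struct toy_skb *skb, size_t len)
{
	unsigned char *old_tail = skb->tail;

	skb->tail += len;
	return old_tail;
}

/* Models skb_reset_network_header(): header starts at skb->data. */
static void toy_reset_network_header(struct toy_skb *skb)
{
	skb->nh_raw = skb->data;
}

/* Returns 1 when the old assignment and the reset helper agree. */
static int toy_conversion_is_equivalent(void)
{
	static unsigned char buf[128];
	struct toy_skb skb = { .data = buf, .tail = buf, .nh_raw = NULL };
	struct toy_boot_pkt *b;
	unsigned char *old_style;

	/* Fresh buffer: tail == data, so skb_put() returns skb->data. */
	b = (struct toy_boot_pkt *)toy_skb_put(&skb, sizeof(*b));
	old_style = (unsigned char *)&b->iph; /* old: nh = &b->iph */
	toy_reset_network_header(&skb);       /* new: nh = data    */
	return old_style == skb.nh_raw;
}
```

The equivalence holds only while skb->tail is still equal to skb->data and the header is the first member of the structure; if either condition changes, the two forms stop agreeing.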
Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 2ca9e6f2c2a4117d21947e911ae1f5e5306b0df0 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:15:25 2007 -0300 [SK_BUFF]: Some more skb_put cases converted to skb_reset_network_header Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 31c7711b509d470ab1e175e7bb98ea66a82aa916 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 19:04:55 2007 -0300 [SK_BUFF]: Some more simple skb_reset_network_header conversions This time of the type: skb->nh.iph = (struct iphdr *)skb->data; That is completely equivalent to: skb->nh.raw = skb->data; Wonder why people love casts... :-) Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 4209fb601c0a0e0a9d90c0008f350dd345c8b7de Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 18:42:03 2007 -0300 [SK_BUFF]: Use skb_reset_network_header where the return of __pskb_pull was being used It returns skb->data, so we can just use skb_reset_network_header after it. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 7e28ecc282574a7d72ace365fc9bc86e27ba880f Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 18:40:59 2007 -0300 [SK_BUFF]: Use skb_reset_network_header where the skb_pull return was being used But only in the cases where its a newly allocated skb, i.e. one where skb->tail is equal to skb->data, or just after skb_reserve, where this requirement is maintained. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit e2d1bca7e6134671bcb19810d004a252aa6a644d Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 20:46:21 2007 -0700 [SK_BUFF]: Use skb_reset_network_header in skb_push cases skb_push updates and returns skb->data, so we can just call skb_reset_network_header after the call to skb_push. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit c1d2bbe1cd6c7bbdc6d532cefebb66c7efb789ce Author: Arnaldo Carvalho de Melo Date: Tue Apr 10 20:45:18 2007 -0700 [SK_BUFF]: Introduce skb_reset_network_header(skb) For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 57effc70a5be9f7804e9a99964eb7265367effca Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 16:21:45 2007 -0300 [IPV6]: Use skb->nh.ipv6h instead of casting skb->nh.raw nh.ipv6h is there exactly for this reason! Use it while it exists ;-) Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit a16aeb36239ce612699ed64a75a03c88cbc657e8 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 16:07:19 2007 -0300 [BONDING]: Introduce arp_pkt() For consistency with all the other skb->nh.raw accessors. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 797659fb4a4a511649cd71028141c32ad1698a12 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 15:56:08 2007 -0300 [PPPOE]: Introduce pppoe_hdr() For consistency with all the other skb->nh.raw accessors. Also do some really obvious simplifications in pppoe_recvmsg, well the kfree_skb one is not so obvious, but free() and kfree() have the same behaviour (hint :-) ). Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 37e6636669b0b996681586facee8034f7f674f6a Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 15:34:36 2007 -0300 [LLC]: Kill llc_set_pdu_hdr We'll have skb_reset_network_header soon. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit 98e399f82ab3a6d863d1d4a7ea48925cc91c830e Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 15:33:04 2007 -0700 [SK_BUFF]: Introduce skb_mac_header() For the places where we need a pointer to the mac header, it is still legal to touch skb->mac.raw directly if just adding to, subtracting from or setting it to another layer header. This one also converts some more cases to skb_reset_mac_header() that my regex missed as it had no spaces before nor after '=', ugh. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 31713c333ddbb66d694829082620b69b71c4b09a Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:48:37 2007 -0300 [TCP]: Use skb_set_mac_header in tcp_collapse Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit c51957dafa6f960c5c6372aa3da6c8fa71c13730 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:47:22 2007 -0300 [TCP]: Do the layer header setting in tcp_collapse relative to skb->data That is equal to skb->head before skb_reserve, to help in the layer header changes. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 39f69c6f922fbfb51e1ff24c9e196584a79f1484 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:40:27 2007 -0300 [SK_BUFF] xfrm: Use skb_set_mac_header in the memmove cases Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 48d49d0ccdaa9caff4636ef9c3410973d28131b5 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:30:58 2007 -0300 [SK_BUFF]: Introduce skb_set_mac_header() For the cases where we want to set skb->mac.raw to an offset from skb->data. Simple cases first, the memmove ones and specially pktgen will be left for later. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit f64955eb117ad62480b858fd69a11e6f9e74f60b Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:17:29 2007 -0300 [LLC]: Use skb_reset_mac_header in llc_mac_hdr_init skb_push updates and returns skb->data, so we can just call skb_reset_mac_header after the call to skb_push. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0a1b0ad9ae27f918fd935c6da101083e11446f09 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 12:14:56 2007 -0300 [LLC]: Use skb_reset_mac_header in llc_alloc_frame skb->head is equal to skb->data after alloc_skb, so reset the mac header while this is true, i.e. before skb_reserve. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 459a98ed881802dee55897441bc7f77af614368e Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 15:30:44 2007 -0700 [SK_BUFF]: Introduce skb_reset_mac_header(skb) For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 4c13eb6657fe9ef7b4dc8f1a405c902e9e5234e0 Author: Arnaldo Carvalho de Melo Date: Wed Apr 25 17:40:23 2007 -0700 [ETH]: Make eth_type_trans set skb->dev like the other *_type_trans One less thing for drivers writers to worry about. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 029720f15dcd3c6c16824177cfc486083b229411 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 11:20:07 2007 -0300 [AOE]: Introduce aoe_hdr() For consistency with other skb->mac.raw users. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller commit 4839fccea04b5f4d2b3ce01585d6bdbcbc24002c Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 11:13:59 2007 -0300 [QETH]: Use eth_hdr() Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 0a4f23fbbff70c268b0f2f5e0b87301c132fb305 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 10:57:13 2007 -0300 [HIPPI/FDDI]: Make {hippi,fddi}_type_trans set skb->dev Now all the _type_trans routines are consistent in this regard. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit c8fb7948dc1aeff0515b2912b564d4236f6c0ebd Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 15:29:16 2007 -0700 [TR]: Make tr_type_trans set skb->dev Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit c1a4b86e396b6870b420d23e4d49c7b685aef0a4 Author: Arnaldo Carvalho de Melo Date: Mon Mar 19 15:27:07 2007 -0700 [TR]: Use tr_hdr() were appropriate Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 7c81fd8bfbaa9732eca142350de5154da6919411 Author: Arnaldo Carvalho de Melo Date: Sat Mar 10 00:39:35 2007 -0300 [SOCKET]: Export __sock_recv_timestamp Kernel: arch/x86_64/boot/bzImage is ready (#2) MODPOST 1816 modules WARNING: "__sock_recv_timestamp" [net/sctp/sctp.ko] undefined! WARNING: "__sock_recv_timestamp" [net/packet/af_packet.ko] undefined! WARNING: "__sock_recv_timestamp" [net/key/af_key.ko] undefined! WARNING: "__sock_recv_timestamp" [net/ipv6/ipv6.ko] undefined! WARNING: "__sock_recv_timestamp" [net/atm/atm.ko] undefined! make[2]: *** [__modpost] Error 1 make[1]: *** [modules] Error 2 make: *** [_all] Error 2 Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 92f37fd2ee805aa77925c1e64fd56088b46094fc Author: Eric Dumazet Date: Sun Mar 25 22:14:49 2007 -0700 [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support Now that network timestamps use ktime_t infrastructure, we can add a new SOL_SOCKET sockopt SO_TIMESTAMPNS. 
This command is similar to SO_TIMESTAMP, but permits transmission of a 'timespec struct' instead of a 'timeval struct' control message. (nanosecond resolution instead of microsecond) Control message is labelled SCM_TIMESTAMPNS instead of SCM_TIMESTAMP A socket cannot mix SO_TIMESTAMP and SO_TIMESTAMPNS : the two modes are mutually exclusive. sock_recv_timestamp() became too big to be fully inlined so I added a __sock_recv_timestamp() helper function. Signed-off-by: Eric Dumazet CC: linux-arch@vger.kernel.org Signed-off-by: David S. Miller commit c7a3c5da35055e2fa97ed4f0da3eec4bd0ef4c38 Author: Arnaldo Carvalho de Melo Date: Fri Mar 9 13:51:54 2007 -0800 [UDP]: Use __skb_pull since we have checked it won't fail with pskb_may_pull Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller commit 6dea649a8a4c4b086227018c919298f988c34b30 Author: Eric Dumazet Date: Thu Mar 8 22:36:37 2007 -0800 [NET]: New sysctls should use __read_mostly tags net_msg_warn should be placed in the read_mostly section, to avoid performance problems on SMP Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit e5268f12f26f1f51590cd1ed26547e21c46b08f2 Author: YOSHIFUJI Hideaki Date: Thu Mar 8 20:48:23 2007 -0800 [IPV6]: Ensure to truncate result and return full length for sticky options. Bug noticed by Chris Wright . Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 4c6510a738c71ca6b4b7b624a7d0a00acebfd7fb Author: YOSHIFUJI Hideaki Date: Sun Mar 18 17:35:57 2007 -0700 [IPV6]: Return correct result for sticky options. We returned incorrect result with IPV6_RTHDRDSTOPTS, IPV6_RTHDR and IPV6_DSTOPTS. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 3fbe070a4293e8ab2d2edb1bc23f1e5220ce61af Author: Stephen Hemminger Date: Thu Mar 8 20:46:41 2007 -0800 [UDP]: deinline A couple of functions are exported or used indirectly so it is pointless to mark them as inline. Signed-off-by: Stephen Hemminger Signed-off-by: David S. 
Miller commit 6f05f629716a71d4c9c82813f45d3e9a6e90d146 Author: Stephen Hemminger Date: Thu Mar 8 20:46:03 2007 -0800 [NET]: deinline some functions Several functions are marked inline or forced inline, but it would be better to let the compiler decide. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 2de979bd7da9c8b39cc0aabb0ab5aa1516d929eb Author: Stephen Hemminger Date: Thu Mar 8 20:45:19 2007 -0800 [TCP]: whitespace cleanup Add whitespace around keywords. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 132adf54639cf7dd9315e8df89c2faa59f6e46d9 Author: Stephen Hemminger Date: Thu Mar 8 20:44:43 2007 -0800 [IPV4]: cleanup Add whitespace around keywords. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 1ac58ee37f439044eb09381f33c97ce0e7f2643b Author: Stephen Hemminger Date: Thu Mar 8 20:43:49 2007 -0800 [WIRELESS]: use ARRAY_SIZE() Use ARRAY_SIZE() macro now. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit e71a4783aae059931f63b2d4e7013e36529badef Author: Stephen Hemminger Date: Tue Apr 10 20:10:33 2007 -0700 [NET] core: whitespace cleanup Fix whitespace around keywords. Fix indentation especially of switch statements. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit add459aa1afe05472abc96f6a29aefd0c84e73d6 Author: Stephen Hemminger Date: Thu Mar 8 20:42:35 2007 -0800 [UDP]: ipv6 style cleanup Fix whitespace around keywords. Eliminate unnecessary ()'s on return statements. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 6516c65573fde5e421c6c92c4b180bbe2245b23b Author: Stephen Hemminger Date: Thu Mar 8 20:41:55 2007 -0800 [UDP]: ipv4 whitespace cleanup Fix whitespace around keywords. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit a2a316fd068c455c609ecc155dcfaa7e208d29fe Author: Stephen Hemminger Date: Thu Mar 8 20:41:08 2007 -0800 [NET]: Replace CONFIG_NET_DEBUG with sysctl. 
Convert network warning messages from a compile-time to a runtime choice. Removes kernel config option and replaces it with new /proc/sys/net/core/warnings. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit ae40eb1ef30ab4120bd3c8b7e3da99ee53d27a23 Author: Eric Dumazet Date: Sun Mar 18 17:33:16 2007 -0700 [NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution Now that network timestamps use ktime_t infrastructure, we can add a new ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'. User programs can thus access nanosecond resolution. Signed-off-by: Eric Dumazet CC: Stephen Hemminger Signed-off-by: David S. Miller commit cb69cc52364690d7789940c480b3a9490784b680 Author: Adrian Bunk Date: Wed Mar 7 19:33:52 2007 -0800 [TCP/DCCP/RANDOM]: Remove unused exports. This patch removes the following unused or no longer used exports: - drivers/char/random.c: secure_tcp_sequence_number - net/dccp/options.c: sysctl_dccp_feat_sequence_window - net/netlink/af_netlink.c: netlink_set_err Signed-off-by: Adrian Bunk Signed-off-by: David S. Miller commit fe067e8ab5e0dc5ca3c54634924c628da92090b4 Author: David S. Miller Date: Wed Mar 7 12:12:44 2007 -0800 [TCP]: Abstract out all write queue operations. This allows the write queue implementation to be changed, for example, to one which allows fast interval searching. Signed-off-by: David S. Miller commit 02ea4923b4997d7e1310c027081f46d584b9d714 Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:21:31 2007 +0900 [NET] TIPC: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit b6d9bcb0697e60d5424e2f395fe950f0e22f4418 Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:21:20 2007 +0900 [NET] SCHED: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 8f05ce91c8b801af106611ad83b1d8d7429b9b46 Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:21:00 2007 +0900 [NET] NETFILTER: Use htonl() where appropriate.
Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 4412ec494868160d57da6e436a92b0696f40b19d Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:19:10 2007 +0900 [NET] IPV4: Use hton{s,l}() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 1c9e8ef7f731c2548414644e5bf540c38c85aff0 Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:19:05 2007 +0900 [NET] IEEE80211: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit f576e24ffaf2c6b01af389e3bad3342681a8b84f Author: YOSHIFUJI Hideaki Date: Wed Mar 7 14:19:03 2007 +0900 [NET] ETHERNET: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 724800d61b8bc574a364707b6a6c6a6252e8cdb4 Author: YOSHIFUJI Hideaki Date: Sun Mar 25 20:13:04 2007 -0700 [NET] CORE: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit aca3192cc60d2bf193c2252e45563c32e3117289 Author: YOSHIFUJI Hideaki Date: Sun Mar 25 20:12:50 2007 -0700 [NET] BLUETOOTH: Use cpu_to_le{16,32}() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit acde4855bb8f5fba8bb065d35ff6ac8a94b3dfa8 Author: YOSHIFUJI Hideaki Date: Sun Mar 25 20:12:32 2007 -0700 [NET] ATM: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit b93b7eebd328d5c1d171896fb823267539d4a0f6 Author: YOSHIFUJI Hideaki Date: Sun Mar 25 20:12:18 2007 -0700 [NET] 8021Q: Use htons() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 2953fd246845f4d00af3717163f37b2ff4c5ce29 Author: YOSHIFUJI Hideaki Date: Sun Mar 25 20:11:55 2007 -0700 [NET] 802: Use hton{s,l}() where appropriate. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. 
Miller commit 759e5d006462d53fb708daa8284b4ad909415da1 Author: Herbert Xu Date: Sun Mar 25 20:10:56 2007 -0700 [UDP]: Clean up UDP-Lite receive checksum This patch eliminates some duplicate code for the verification of receive checksums between UDP-Lite and UDP. It does this by introducing __skb_checksum_complete_head which is identical to __skb_checksum_complete apart from the fact that it takes a length parameter rather than computing the first skb->len bytes. As a result UDP-Lite will be able to use hardware checksum offload for packets which do not use partial coverage checksums. It also means that UDP-Lite loopback no longer does unnecessary checksum verification. If any NICs start supporting UDP-Lite this would also start working automatically. This patch removes the assumption that msg_flags has MSG_TRUNC clear upon entry in recvmsg. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit 1ab6eb62b02e0949a392fb19bf31ba59ae1022b1 Author: Herbert Xu Date: Tue Mar 6 20:29:58 2007 -0800 [UDP6]: Restore sk_filter optimisation This reverts the changeset [IPV6]: UDPv6 checksum. We always need to check UDPv6 checksum because it is mandatory. The sk_filter optimisation has nothing to do with whether we verify the checksum. It simply postpones it to the point when the user calls recv or poll. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller commit 243bbcaa09e8482aa28065cbc2eb99f0ca2fc8d6 Author: Eric Dumazet Date: Tue Mar 6 20:23:10 2007 -0800 [IPV4]: Optimize inet_getpeer() 1) Some sysctl vars are declared __read_mostly 2) We can avoid updating stack[] when doing an AVL lookup only. lookup() macro is extended to receive a second parameter, that may be NULL in case of a pure lookup (no need to save the AVL path). This removes unnecessary instructions, because compiler knows if this _stack parameter is NULL or not. text size of net/ipv4/inetpeer.o is 2063 bytes instead of 2107 on x86_64 Signed-off-by: Eric Dumazet Signed-off-by: David S.
Miller commit 43e683926f808cec9802466c27cee7499eda3d11 Author: Stephen Hemminger Date: Tue Mar 6 20:21:20 2007 -0800 [TCP] TCP Yeah: cleanup Eliminate need for full 6/4/64 divide to compute queue. Variable maxqueue was really a constant. Fix indentation. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit c5f5877c043ca471c3a607fa2c864848b19bc49a Author: Stephen Hemminger Date: Sun Mar 25 20:21:15 2007 -0700 [TCP] tcp_cubic: faster cube root The Newton-Raphson method is quadratically convergent so only a small fixed number of steps are necessary. Therefore it is faster to unroll the loop. Since div64_64 is no longer inline it won't cause code explosion. Also fixes a bug that can occur if x^2 was bigger than 32 bits. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 8570419fb7be0af84085ac8f13307392a748482c Author: YOSHIFUJI Hideaki Date: Tue Mar 6 20:19:26 2007 -0800 [ATM] ENI: Convert to struct timeval to ktime_t. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit fc910a27839584209726537698b596576940add4 Author: David S. Miller Date: Sun Mar 25 20:27:59 2007 -0700 [NETLINK]: Limit NLMSG_GOODSIZE to 8K. Signed-off-by: David S. Miller commit ca043569390c528de4cd5ec9e07502f2bf4ecd1f Author: YOSHIFUJI Hideaki Date: Wed Feb 28 23:13:20 2007 +0900 [IPV6] ADDRCONF: Fix possible inet6_ifaddr leakage with CONFIG_OPTIMISTIC_DAD. The inet6_ifaddr for source address of RS is leaked if the address is not an optimistic address. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 95c385b4d5a71b8ad552aecaa968ea46d7da2f6a Author: Neil Horman Date: Wed Apr 25 17:08:10 2007 -0700 [IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support. Nominally an autoconfigured IPv6 address is added to an interface in the Tentative state (as per RFC 2462). 
Addresses in this state remain in this state while the Duplicate Address Detection process operates on them to determine their uniqueness on the network. During this period, these tentative addresses may not be used for communication, increasing the time before a node may be able to communicate on a network. Using Optimistic Duplicate Address Detection, autoconfigured addresses may be used immediately for communication on the network, as long as certain rules are followed to avoid conflicts with other nodes during the Duplicate Address Detection process. Signed-off-by: Neil Horman Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 502b093569e48db264831be7966e1c447de2f52f Author: Yasuyuki Kozakai Date: Thu Nov 30 14:43:28 2006 +0900 [IPV6] IP6TUNNEL: Enable to control the handled inner protocol. ip6_tunnel before supporting IPv4/IPv6 tunnel allows only IPPROTO_IPV6 in configurations from userland. This allows userland to set IPPROTO_IPIP and 0(wildcard). ip6_tunnel only handles allowed inner protocols. Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 3144581cb0b4b1ef897470195128cc1c8dc037b6 Author: Yasuyuki Kozakai Date: Sat Feb 10 00:30:33 2007 +0900 [IPV6] IP6TUNNEL: Rename functions ip6ip6_* to ip6_tnl_*. Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit c4d3efafcc933fd2ffd169d7dc4f980393a13796 Author: Yasuyuki Kozakai Date: Thu Feb 15 00:43:16 2007 +0900 [IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel. Some notes - Protocol number IPPROTO_IPIP is used for IPv4 over IPv6 packets. - If IP6_TNL_F_USE_ORIG_TCLASS is set, TOS in IPv4 header is copied to Traffic Class in outer IPv6 header on xmit. - IP6_TNL_F_USE_ORIG_FLOWLABEL is ignored on xmit of IPv4 packets, because IPv4 header does not have flow label. - Kernel sends ICMP error if IPv4 packet is too big on xmit, even if DF flag is not set. 
Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 61ec2aec28ba8de09f76a558a5d6d3893b1d2e47 Author: Yasuyuki Kozakai Date: Sun Nov 5 22:56:45 2006 +0900 [IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_xmit(). This makes it possible to add IPv4/IPv6 specific handling later. Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 8359925be8bb5960f614e3f25454f3ef7cc9df65 Author: Yasuyuki Kozakai Date: Fri Nov 3 09:39:14 2006 +0900 [IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_rcv(). This makes it possible to add IPv4/IPv6 specific handling later. Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit e490d1d85cf5e191791979e5f260d32eb4f703a8 Author: Yasuyuki Kozakai Date: Tue Oct 31 23:11:25 2006 +0900 [IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_err(). This makes it possible to add IPv4/IPv6 specific error handling later. Signed-off-by: Yasuyuki Kozakai Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller commit 7159039a128fa0a73ca7b532f6e1d30d9885277f Author: YOSHIFUJI Hideaki Date: Thu Feb 22 22:05:40 2007 +0900 [IPV6]: Decentralize EXPORT_SYMBOLs. Signed-off-by: YOSHIFUJI Hideaki commit b558ff799977a4eda8b3823d1cf6c1c33becb671 Author: David S. Miller Date: Tue Mar 6 17:02:35 2007 -0800 [NETLINK]: Mirror UDP MSG_TRUNC semantics. If the user passes MSG_TRUNC in via msg_flags, return the full packet size not the truncated size. Idea from Herbert Xu and Thomas Graf. Signed-off-by: David S. Miller commit b7aa0bf70c4afb9e38be25f5c0922498d0f8684c Author: Eric Dumazet Date: Thu Apr 19 16:16:32 2007 -0700 [NET]: convert network timestamps to ktime_t We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock. This has some drawbacks: - Fixed resolution of one microsecond.
- Waste of space on 64bit platforms where sizeof(struct timeval)=16 I suggest using ktime_t that is a nice abstraction of high resolution time services, currently capable of nanosecond resolution. As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits an 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...) Once this ktime infrastructure is adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS) Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0) Signed-off-by: Eric Dumazet CC: Stephen Hemminger CC: John find Signed-off-by: David S. Miller commit 3927f2e8f9afa3424bb51ca81f7abac01ffd0005 Author: Stephen Hemminger Date: Sun Mar 25 19:54:23 2007 -0700 [NET]: div64_64 consolidate (rev3) Here is the current version of the 64 bit divide common code. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller commit 9d729f72dca9406025bcfa9c1f660d71d9ef0ff5 Author: James Morris Date: Sun Mar 4 16:12:44 2007 -0800 [NET]: Convert xtime.tv_sec to get_seconds() Where appropriate, convert references to xtime.tv_sec to the get_seconds() helper function. Signed-off-by: James Morris Signed-off-by: David S. Miller commit 39df232f1a9ba48d41c68ee7d4046756e709cf91 Author: Stephen Hemminger Date: Sun Mar 4 16:11:51 2007 -0800 [PKTGEN]: fix device name handling Since devices can change name and other weirdness, don't hold onto a copy of device name, instead use pointer to output device. Fix a couple of leaks in error handling path as well. Signed-off-by: Stephen Hemminger Signed-off-by: Robert Olsson Signed-off-by: David S.
Miller commit d5f1ce9a5e80fb315c86b036a89b1237fdf11938 Author: Stephen Hemminger Date: Sun Mar 4 16:08:08 2007 -0800 [PKTGEN]: don't use __constant_htonl() The existing htonl() macro is smart enough to do the same code as using __constant_htonl() and it looks cleaner. Signed-off-by: Stephen Hemminger Signed-off-by: Robert Olsson Signed-off-by: David S. Miller commit 5fa6fc76f55c5c42fff52ae1d57a685b9373fcdc Author: Stephen Hemminger Date: Sun Mar 4 16:07:28 2007 -0800 [PKTGEN]: use random32 Can use random32() now. Signed-off-by: Stephen Hemminger Signed-off-by: Robert Olsson Signed-off-by: David S. Miller commit 25c4e53a4c9bfe45be52821f54ec5ce957519db2 Author: Stephen Hemminger Date: Sun Mar 4 16:06:47 2007 -0800 [PKTGEN]: use pr_debug Remove private debug macro and replace with standard version Signed-off-by: Stephen Hemminger Signed-off-by: Robert Olsson Signed-off-by: David S. Miller commit fa438ccfdfd3f6db02c13b61b21454eb81cd6a13 Author: Eric Dumazet Date: Sun Mar 4 16:05:44 2007 -0800 [NET]: Keep sk_backlog near sk_lock sk_backlog is a critical field of struct sock. (known famous words) It is (ab)used in hot paths, in particular in release_sock(), tcp_recvmsg(), tcp_v4_rcv(), sk_receive_skb(). It really makes sense to place it next to sk_lock, because sk_backlog is only used after sk_lock locked (and thus memory cache line in L1 cache). This should reduce cache misses and sk_lock acquisition time. (In theory, we could only move the head pointer near sk_lock, and leaving tail far away, because 'tail' is normally not so hot, but keep it simple :) ) Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit e317f6f69cb95527799d308a9421b7dc1252989a Author: Ilpo Järvinen Date: Fri Mar 2 13:34:19 2007 -0800 [TCP]: FRTO undo response falls back to ratehalving one if ECEd Undoing ssthresh is disabled in fastretrans_alert whenever FLAG_ECE is set by clearing prior_ssthresh. The clearing does not protect FRTO because FRTO operates before fastretrans_alert. 
Moving the clearing of prior_ssthresh earlier seems to be a suboptimal solution to the FRTO case because then FLAG_ECE will cause a second ssthresh reduction in try_to_open (the first occurred when FRTO was entered). So instead, FRTO falls back immediately to the rate halving response, which switches TCP to CA_CWR state preventing the latter reduction of ssthresh. If the first ECE arrived before the ACK after which FRTO is able to decide RTO as spurious, prior_ssthresh is already cleared. Thus no undoing for ssthresh occurs. Besides, FLAG_ECE should be set also in the following ACKs resulting in rate halving response that sees TCP is already in CA_CWR, which again prevents an extra ssthresh reduction on that round-trip. If the first ECE arrived before RTO, ssthresh has already been adapted and prior_ssthresh remains cleared on entry because TCP is in CA_CWR (the same applies also to a case where FRTO is entered more than once and ECE comes in the middle). High_seq must not be touched after tcp_enter_cwr because CWR round-trip calculation depends on it. I believe that after this patch, FRTO should be ECN-safe and even able to take advantage of synergy benefits. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit e01f9d7793be82e6c252efbd52c399d3eb65abe4 Author: Ilpo Järvinen Date: Fri Mar 2 13:27:25 2007 -0800 [TCP]: Complete icsk-to-local-variable change (in tcp_enter_cwr) A local variable for icsk was created but this change was missing. Spotted by Jarek Poplawski. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 89808060b7a71376cc2ba8092d43b2010da465b6 Author: Ilpo Järvinen Date: Tue Feb 27 10:10:55 2007 -0800 [TCP] Sysctl documentation: tcp_frto_response In addition, fixed minor things in tcp_frto sysctl. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. 
Miller commit 3cfe3baaf07c9e40a75f9a70662de56df1c246a8 Author: Ilpo Järvinen Date: Tue Feb 27 10:09:49 2007 -0800 [TCP]: Add two new spurious RTO responses to FRTO New sysctl tcp_frto_response is added to select amongst these responses: - Rate halving based; reuses CA_CWR state (default) - Very conservative; used to be the only one available (=1) - Undo cwr; undoes ssthresh and cwnd reductions (=2) The response with rate halving requires a new parameter to tcp_enter_cwr because FRTO has already reduced ssthresh and doing a second reduction there has to be prevented. In addition, to keep things nice on 80 cols screen, a local variable was added. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit c5e7af0df5d7234afd8596560d9f570cfc6c18bf Author: Ilpo Järvinen Date: Fri Feb 23 16:22:06 2007 -0800 [TCP]: Correct reordering detection change (no FRTO case) The reordering detection must work also when FRTO has not been used at all which was the original intention of mine, just the expression of the idea was flawed. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit e0ef57cc56c3c96493f9b0d6c77bb9608eeaa173 Author: David S. Miller Date: Thu Feb 22 22:52:59 2007 -0800 [TCP]: Make snd_cwnd_clamp a u32. Signed-off-by: David S. Miller commit 54287cc178cf85dbae0decec8b4dc190bff757ad Author: Eric Dumazet Date: Thu Feb 22 03:20:44 2007 -0800 [TCP]: Keep copied_seq, rcv_wup and rcv_next together. I noticed in oprofile study a cache miss in tcp_rcv_established() to read copied_seq. ffffffff80400a80 : /* tcp_rcv_established total: 4034293   2.0400 */  55493  0.0281 :ffffffff80400bc9:   mov    0x4c8(%r12),%eax copied_seq 543103  0.2746 :ffffffff80400bd1:   cmp    0x3e0(%r12),%eax   rcv_nxt     if (tp->copied_seq == tp->rcv_nxt &&         len - tcp_header_len <= tp->ucopy.len) { In this function, the cache line 0x4c0 -> 0x500 is used only for this reading 'copied_seq' field. 
rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...) As you suggested, I changed tcp_create_openreq_child() so that these fields are changed together, to avoid adding a new store buffer stall. Patch is 64bit friendly (no new hole because of alignment constraints) Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller commit cf4c6bf83d0fa070f60b1ba8124dfe0e65fbfbcc Author: Ilpo Järvinen Date: Thu Feb 22 01:13:58 2007 -0800 [TCP]: struct *sock argument renamed: sp -> sk In general, TCP code uses "sk" for struct sock pointer. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 886236c1247ab5e2ad9c73f6e9a652e3ae3c8b07 Author: John Heffner Date: Sun Mar 25 19:21:45 2007 -0700 [TCP]: Add RFC3742 Limited Slow-Start, controlled by variable sysctl_tcp_max_ssthresh. Signed-off-by: John Heffner Signed-off-by: David S. Miller commit 5ef814753eb810d900fbd77af7c87f6d04f0e551 Author: Angelo P. Castellani Date: Thu Feb 22 00:23:05 2007 -0800 [TCP] YeAH-TCP: algorithm implementation YeAH-TCP is a sender-side high-speed enabled TCP congestion control algorithm, which uses a mixed loss/delay approach to compute the congestion window. Its design goals target high efficiency, internal, RTT and Reno fairness, resilience to link loss while keeping network elements load as low as possible. For further details look here: http://wil.cs.caltech.edu/pfldnet2007/paper/YeAH_TCP.pdf Signed-off-by: Angelo P. Castellani Signed-off-by: David S. Miller commit 127af0c44fc916908abd145914d65b9fe598bcd7 Author: Ilpo Järvinen Date: Wed Feb 21 23:16:38 2007 -0800 [TCP] FRTO: Sysctl documentation for SACK enhanced version The description is overly verbose to avoid ambiguity between "SACK enabled" and "SACK enhanced FRTO" Signed-off-by: Ilpo Järvinen Signed-off-by: David S.
Miller commit 4dc2665e3634d720a62bd27128fc8781fcdad2dc Author: Ilpo Järvinen Date: Wed Feb 21 23:16:11 2007 -0800 [TCP]: SACK enhanced FRTO Implements the SACK-enhanced FRTO given in RFC4138 using the variant given in Appendix B. RFC4138, Appendix B: "This means that in order to declare timeout spurious, the TCP sender must receive an acknowledgment for non-retransmitted segment between SND.UNA and RecoveryPoint in algorithm step 3. RecoveryPoint is defined in conservative SACK-recovery algorithm [RFC3517]" The basic version of the FRTO algorithm can still be used also when SACK is enabled. To enable the SACK-enhanced version, set the tcp_frto sysctl to 2. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 288035f915686a9a9e85e0358c5392bb5d7ae58d Author: Ilpo Järvinen Date: Wed Feb 21 23:14:42 2007 -0800 [TCP]: Prevent reordering adjustments during FRTO To be honest, I'm not too sure how the reord stuff works in the first place but this seems necessary. When FRTO has been active, the one and only retransmission could be unnecessary but the state and sending order might not be what the sacktag code expects it to be (to work correctly). Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 66e93e45c09affa407750cc06398492e8b897848 Author: Ilpo Järvinen Date: Wed Feb 21 23:13:47 2007 -0800 [TCP] FRTO: Fake cwnd for ssthresh callback TCP without FRTO would be in Loss state with small cwnd. FRTO, however, leaves cwnd (typically) to a larger value which causes ssthresh to become too large in case RTO is triggered again compared to what conventional recovery would do. Because consecutive RTOs result in only a single ssthresh reduction, RTO+cumulative ACK+RTO pattern is required to trigger this event. A large comment is included for congestion control module writers trying to figure out what CA_EVENT_FRTO handler should do because there exists a remote possibility of incompatibility between FRTO and module defined ssthresh functions.
Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit d1a54c6a0a3f9c2c4ef71982d89b8571bd9eaa51 Author: Ilpo Järvinen Date: Wed Feb 21 23:11:57 2007 -0800 [TCP] FRTO: Reverse RETRANS bit clearing logic Previously RETRANS bits were cleared on the entry to FRTO. We postpone that into tcp_enter_frto_loss, which is really the place where the clearing should be done anyway. This allows simplification of the logic from a clearing loop to the head skb clearing only. Besides, the other changes made in the previous patches to tcp_use_frto made it impossible for the non-SACKed FRTO to be entered if other than the head has been rexmitted. With SACK-enhanced FRTO (and Appendix B), however, there can be a number of retransmissions in flight when RTO expires (same thing could happen before this patchset also with non-SACK FRTO). To not introduce any jumpiness into the packet counting during FRTO, instead of clearing RETRANS bits from skbs during entry, do it later on. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 46d0de4ed92650b95f27acae09914996bbe624e7 Author: Ilpo Järvinen Date: Wed Feb 21 23:10:39 2007 -0800 [TCP] FRTO: Entry is allowed only during (New)Reno like recovery This interpretation comes from RFC4138: "If the sender implements some loss recovery algorithm other than Reno or NewReno [FHG04], the F-RTO algorithm SHOULD NOT be entered when earlier fast recovery is underway." I think the RFC means to say (especially in the light of Appendix B) that ...recovery is underway (not just fast recovery) or was underway when it was interrupted by an earlier (F-)RTO that hasn't yet been resolved (snd_una has not advanced enough). Thus, my interpretation is that whenever TCP has ever retransmitted other than head, basic version cannot be used because then the order assumptions which are used as FRTO basis do not hold. NewReno has only the head segment retransmitted at a time.
Therefore, walk up to the segment that has not been SACKed, if that segment is not retransmitted nor anything before it, we know for sure, that nothing after the non-SACKed segment should be either. This assumption is valid because TCPCB_EVER_RETRANS does not leave holes but each non-SACKed segment is rexmitted in-order. Check for retrans_out > 1 avoids more expensive walk through the skb list, as we can know the result beforehand: F-RTO will not be allowed. SACKed skb can turn into non-SACked only in the extremely rare case of SACK reneging, in this case we might fail to detect retransmissions if there were them for any other than head. To get rid of that feature, whole rexmit queue would have to be walked (always) or FRTO should be prevented when SACK reneging happens. Of course RTO should still trigger after reneging which makes this issue even less likely to show up. And as long as the response is as conservative as it's now, nothing bad happens even then. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 7c9a4a5b67926dd186d427bc5b9fce6ccbde154c Author: Ilpo Järvinen Date: Wed Feb 21 23:08:34 2007 -0800 [TCP]: Prevent unrelated cwnd adjustment while using FRTO FRTO controls cwnd when it still processes the ACK input or it has just reverted back to conventional RTO recovery; the normal rules apply when FRTO has reverted to standard congestion control. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 94d0ea7786714d78d7cb73144bb850254dd0bb78 Author: Ilpo Järvinen Date: Wed Feb 21 23:07:27 2007 -0800 [TCP] FRTO: frto_counter modulo-op converted to two assignments Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 52c63f1e86ebb18ef4b710b5b647e552a041e5ca Author: Ilpo Järvinen Date: Wed Feb 21 23:06:52 2007 -0800 [TCP]: Don't enter to fast recovery while using FRTO Because TCP is not in Loss state during FRTO recovery, fast recovery could be triggered by accident. 
Non-SACK FRTO is more robust than the not yet included SACK-enhanced version (that can receive a high number of duplicate ACKs with SACK blocks during FRTO), at least with unidirectional transfers, but under extraordinary patterns fast recovery can be incorrectly triggered (e.g., data loss + ACK losses => cumulative ACK with enough SACK blocks to meet sacked_out >= dupthresh condition). Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit aa8b6a7ad147dfbaaf10368ff15df9418b670d8b Author: Ilpo Järvinen Date: Wed Feb 21 23:06:03 2007 -0800 [TCP] FRTO: Response should reset also snd_cwnd_cnt Since purpose is to reduce CWND, we prevent immediate growth. This is not a major issue nor is "the correct way" specified anywhere. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 95c4922bf9330eb2c71b752359dd89c4e166f3c5 Author: Ilpo Järvinen Date: Wed Feb 21 23:05:18 2007 -0800 [TCP] FRTO: fixes fallback to conventional recovery The FRTO detection did not care how the ACK pattern affects the cwnd calculation of the conventional recovery. This caused incorrect setting of cwnd when the fallback becomes necessary. The knowledge tcp_process_frto() has about the incoming ACK is now passed on to tcp_enter_frto_loss() in allowed_segments parameter that gives the number of segments that must be added to packets-in-flight while calculating the new cwnd. Instead of snd_una we use FLAG_DATA_ACKED in duplicate ACK detection because RFC4138 states (in Section 2.2): If the first acknowledgment after the RTO retransmission does not acknowledge all of the data that was retransmitted in step 1, the TCP sender reverts to the conventional RTO recovery. Otherwise, a malicious receiver acknowledging partial segments could cause the sender to declare the timeout spurious in a case where data was lost. If the next ACK after RTO is duplicate, we do not retransmit anything, which is equal to what conservative conventional recovery does in such case.
Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 6408d206c7484615ecae54bf6474a02c94e9e862 Author: Ilpo Järvinen Date: Wed Feb 21 23:04:11 2007 -0800 [TCP] FRTO: Ignore some uninteresting ACKs Handles RFC4138 shortcoming (in step 2); it should also have case c) which ignores ACKs that are not duplicates nor advance window (opposite dir data, winupdate). Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 7b0eb22b1d3b049306813a4aaa52966650f7491c Author: Ilpo Järvinen Date: Wed Feb 21 23:03:35 2007 -0800 [TCP] FRTO: Use Disorder state during operation instead of Open Retransmission counter assumptions are to be changed. Forcing reason to do this exist: Using sysctl in check would be racy as soon as FRTO starts to ignore some ACKs (doing that in the following patches). Userspace may disable it at any moment giving nice oops if timing is right. frto_counter would be inaccessible from userspace, but with SACK enhanced FRTO retrans_out can include other than head, and possibly leaving it non-zero after spurious RTO, boom again. Luckily, solution seems rather simple: never go directly to Open state but use Disorder instead. This does not really change much, since TCP could anyway change its state to Disorder during FRTO using path tcp_fastretrans_alert -> tcp_try_to_open (e.g., when a SACK block makes ACK dubious). Besides, Disorder seems to be the state where TCP should be if not recovering (in Recovery or Loss state) while having some retransmissions in-flight (see tcp_try_to_open), which is exactly what happens with FRTO. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. 
Miller commit 7487c48c4fd15d1e2542be1183b783562cfe10bc Author: Ilpo Järvinen Date: Wed Feb 21 23:02:30 2007 -0800 [TCP] FRTO: Consecutive RTOs keep prior_ssthresh and ssthresh In case a latency spike causes more than one RTO, the latter should not cause the already reduced ssthresh to propagate into the prior_ssthresh since FRTO declares all such RTOs spurious at once or none of them. In the treatment of ssthresh, we mimic what tcp_enter_loss() does. The previous state (in frto_counter) must be available until we have checked it in tcp_enter_frto(), and also ACK information flag in process_frto(). Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 30935cf4f915c3178ce63331d6ff4c82163e26af Author: Ilpo Järvinen Date: Wed Feb 21 23:01:36 2007 -0800 [TCP] FRTO: Comment cleanup & improvement Moved comments out from the body of process_frto() to the head (preferred way; see Documentation/CodingStyle). Bonus: it's much easier to read in this compacted form. FRTO algorithm and implementation is described in greater detail. For interested reader, more information is available in RFC4138. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit bdaae17da81db79b9aa4dfbf43305cfeef64f6a8 Author: Ilpo Järvinen Date: Wed Feb 21 22:59:58 2007 -0800 [TCP] FRTO: Moved tcp_use_frto from tcp.h to tcp_input.c In addition, removed inline. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 9ead9a1d385ae2c52a6dcf2828d84ce66be04fc2 Author: Ilpo Järvinen Date: Wed Feb 21 22:56:19 2007 -0800 [TCP] FRTO: Separated response from FRTO detection algorithm FRTO spurious RTO detection algorithm (RFC4138) does not include response to a detected spurious RTO but can use different response algorithms. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 522e7548a9bd40305df41c0beae69448b7620d6b Author: Ilpo Järvinen Date: Wed Feb 21 22:54:52 2007 -0800 [TCP] FRTO: Incorrectly clears TCPCB_EVER_RETRANS bit FRTO was slightly too brave...
Should only clear TCPCB_SACKED_RETRANS bit. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller commit 1912ffbb88efe872eb8fa8113dfb3cb0b7238764 Author: Joachim Fenkes Date: Mon Apr 23 18:20:27 2007 +0200 IB: Set class_dev->dev in core for nice device symlink All RDMA drivers except ehca set class_dev->dev to their dma_device value (ehca leaves this unset). dma_device is the only value that makes any sense, so move this assignment to core/sysfs.c. This reduces the duplicated code in the rest of the drivers and gives ehca a nice /sys/class/infiniband/ehcaX/device symlink. Signed-off-by: Joachim Fenkes Signed-off-by: Roland Dreier commit c4ed790dfd4b2182c76e0fcd79d4aa85ab02eccf Author: Joachim Fenkes Date: Tue Apr 24 17:44:31 2007 +0200 IB/ehca: Implement modify_port Add "Modify Port" verb support to eHCA driver. The IB communication manager needs this to set the IsCM port capability bit when initializing. Signed-off-by: Joachim Fenkes Signed-off-by: Roland Dreier commit bd8031b49a9b05933fb1ec1c36620ed4e1e67793 Author: Hal Rosenstock Date: Tue Apr 24 21:30:38 2007 -0700 IB/umad: Clarify documentation of transaction ID Signed-off-by: Hal Rosenstock Signed-off-by: Roland Dreier commit 37aebbde7023d75bf09fbadb6796276d0a65a068 Author: Roland Dreier Date: Tue Apr 24 21:30:37 2007 -0700 IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements There are quite a few places in ipoib_cm.c where we know IRQs are enabled because we do something that sleeps in the same function, so we can convert several occurrences of spin_lock_irqsave() to a plain spin_lock_irq().
This cleans up the source a little and makes the code smaller too: add/remove: 0/0 grow/shrink: 1/5 up/down: 3/-51 (-48) function old new delta ipoib_cm_tx_reap 403 406 +3 ipoib_cm_stale_task 146 145 -1 ipoib_cm_dev_stop 173 172 -1 ipoib_cm_tx_handler 964 956 -8 ipoib_cm_rx_handler 956 937 -19 ipoib_cm_skb_reap 212 190 -22 Signed-off-by: Roland Dreier commit de493d47d8b4738827d8914a4dc94058c58f4249 Author: Hal Rosenstock Date: Mon Apr 2 11:24:07 2007 -0400 IB/mad: Change SMI to use enums rather than magic return codes Clarify code by changing return values from SMI functions to named enum values instead of magic 0/1 values. Signed-off-by: Hal Rosenstock Signed-off-by: Roland Dreier commit aeba84a9251968a51fc6faae846518aac4e77565 Author: Sean Hefty Date: Thu Apr 5 11:49:21 2007 -0700 IB/umad: Implement GRH handling for sent/received MADs We need to set the SGID index for routed MADs and pass received GRH information to userspace when a MAD is received. Signed-off-by: Sean Hefty commit 46f1b3d7aff99ef4c1e729e023b9c8ee51de5973 Author: Sean Hefty Date: Thu Apr 5 11:50:11 2007 -0700 IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr To support destinations that are not on the local IB subnet, IPoIB should include the GRH information when constructing an address handle. Using the existing ib_init_ah_from_path() call will do this for us. Signed-off-by: Sean Hefty commit d0e7bb141837db620f24406ca8b4667424138d42 Author: Sean Hefty Date: Thu Apr 5 10:51:10 2007 -0700 IB/sa: Set src_path_bits correctly in ib_init_ah_from_path() src_path_bits needs to mask off the base LID value. Signed-off-by: Sean Hefty commit 9d41b7fdeadb76bd4d06c16803daffd9fcf8dc7f Author: Sean Hefty Date: Thu Apr 5 10:51:05 2007 -0700 IB/ucm: Simplify ib_ucm_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. 
Signed-off-by: Sean Hefty commit d92f76448c1a3e40ff3df96a653ecd83aeac6ee7 Author: Sean Hefty Date: Thu Apr 5 10:49:51 2007 -0700 RDMA/ucma: Simplify ucma_get_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. Signed-off-by: Sean Hefty commit 30c00986f3a610cdcee2602b8254c3ffa6cddc04 Author: Roland Dreier Date: Tue Apr 24 16:31:11 2007 -0700 IB/mthca: Simplify CQ cleaning in mthca_free_qp() mthca_free_qp() already has local variables to hold the QP's send_cq and recv_cq, so we can slightly clean up the calls to mthca_cq_clean() by using those local variables instead of expressions like to_mcq(qp->ibqp.send_cq). Also, by cleaning the recv_cq first, we can avoid worrying about whether the QP is attached to an SRQ for the second call, because we would only clean send_cq if send_cq is not equal to recv_cq, and that means send_cq cannot have any receive completions from the QP being destroyed. All this work even improves the generated code a bit: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mthca_free_qp 510 505 -5 Signed-off-by: Roland Dreier commit 532c3b581725e2c6480a20c845fff920690286f1 Author: Roland Dreier Date: Tue Apr 24 16:31:04 2007 -0700 IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory Commit b2875d4c ("IB/mthca: Always fill MTTs from CPU") causes a crash in mthca_write_mtt() with non-memfree HCAs that have their memory hidden (that is, have only two PCI BARs instead of having a third BAR that allows access to the RAM attached to the HCA) on 64-bit architectures. This is because the commit just before, c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems") makes dev->mr_table.fmr_mtt_buddy equal to &dev->mr_table.mtt_buddy and hence mthca_write_mtt() tries to write directly into the HCA's MTT table. However, since that table is in the HCA's memory, this is impossible without the PCI BAR that gives access to that memory. 
This causes a crash because mthca_tavor_write_mtt_seg() basically tries to dereference some offset of a NULL pointer. Fix this by adding a test of MTHCA_FLAG_FMR in mthca_write_mtt() so that we always use the WRITE_MTT firmware command rather than writing directly if FMRs are not enabled. Signed-off-by: Roland Dreier commit 3f114853d4f7c1746389f26e1d500887294da8fd Author: Roland Dreier Date: Wed Apr 18 20:21:02 2007 -0700 IB/mthca: Update HCA firmware revisions Update the driver's list of current firmware versions with Mellanox's latest releases. Signed-off-by: Roland Dreier commit 40b90430ecac40cc9adb26b808cc12a3d569da5d Author: Robert Walsh Date: Thu Mar 15 14:45:17 2007 -0700 IB/ipath: Fix WC format drift between user and kernel space The kernel ib_wc structure now uses a QP pointer, but the user space equivalent uses a QP number instead. This means we can no longer use a simple structure copy to copy stuff into user space. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 6ce73b07db7aa05d4a30716d6a99c832b6d9db4a Author: Robert Walsh Date: Thu Mar 15 14:45:16 2007 -0700 IB/ipath: Check that a UD work request's address handle is valid Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 0d6172a4284b21e4762e8638a4d693ef52f63bfe Author: Robert Walsh Date: Thu Mar 15 14:45:15 2007 -0700 IB/ipath: Remove duplicate stuff from ipath_verbs.h Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 253fb3902008353831525ab711909abdd5ee191f Author: Robert Walsh Date: Thu Mar 15 14:45:14 2007 -0700 IB/ipath: Check reserved memory keys Don't let userspace use the direct-physical-map L_key or R_key. Signed-off-by: Ralph Campbell Signed-off-by: Roland Dreier commit f0810daf74c564a3615eba5002cc11c21a0949ba Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:13 2007 -0700 IB/ipath: Fix unit selection when all CPU affinity bits set At some point things changed so that all the affinity bits can be set, but cpus_full() macro is not true. 
This caused problems with the unit selection logic on multi-unit (board) configurations. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 662af5813be9aadf95ca310b7b6d1d37070c9922 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:12 2007 -0700 IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times Signed-off-by: Robert Walsh Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 53c1d2c943a67fb129ed2797182305a4633531fb Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:11 2007 -0700 IB/ipath: Disable IB link earlier in shutdown sequence Move the code that shuts down the IB link earlier in the unload process, to be sure no new packets can arrive while we are unloading. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 490462c2686df6e35c21d1efe935e0b4a3bddb39 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:10 2007 -0700 IB/ipath: Prevent random program use of diags interface To prevent random utility reads and writes of the diag interface to the chip, we first require a handshake of reading from offset 0 and writing to offset 0 before any other reads or writes can be done through the diags device. Otherwise chip errors can be triggered. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit f5408ac7ccec0a7edd2b6add0da82735375a37a0 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:09 2007 -0700 IB/ipath: On unrecoverable errors, force link down, LEDs off If the chip is no longer usable, LEDs should be turned off so system can be found easily in the cluster. 
Also some minor reorganizing so both chips print hardware error message at same point and only if there were unrecovered errors. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 27b044a815df7d4530bc68560796680ed588070c Author: Michael Albaugh Date: Thu Mar 15 14:45:08 2007 -0700 IB/ipath: Fix driver crash (in interrupt or during unload) after chip reset Re-init of the kernel structures after a chip reset was leaving the portdata structure for port zero in an inconsistent state, and a pointer to it either stale (in re-init code) or NULL (in devdata). Fixing the order of operations on this struct, and the condition for interrupt access, prevents the crashes. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 9783ab405844202b452ac673677e6c8f8c9a6a99 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:07 2007 -0700 IB/ipath: Improve handling and reporting of parity errors Mostly cleanup. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 820054b7ca7a54ba94d89db4b3c53a24d2d66633 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:06 2007 -0700 IB/ipath: Print better error messages if kernel is misconfigured Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 569b87b47f906d65ee35d6ecc4767f20a6390b9b Author: Arthur Jones Date: Thu Mar 15 14:45:05 2007 -0700 IB/ipath: Force PIOAvail update entry point Due to a chip bug, the PIOAvail register is not always updated to memory. This patch allows userspace to force an update. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 7b196e2ff3953063b656212ff517f6115a1477b2 Author: Arthur Jones Date: Thu Mar 15 14:45:04 2007 -0700 IB/ipath: Call free_irq() on chip specific initialization failure In initialization, if we bailed at chip specific initialization, we forgot to clean up the irq we had requested.
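The class of bug described in the free_irq() commit above, an error path that returns without releasing what was already acquired, is commonly avoided with goto-based unwinding. The following is a minimal userspace sketch of that pattern, not the ipath driver code: fake_request_irq(), fake_free_irq(), fake_chip_init() and device_init() are hypothetical stand-ins introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for request_irq()/free_irq(): they only count
 * how many IRQs are currently held, so that a leak becomes observable. */
static int irqs_held;

static int fake_request_irq(void)
{
	irqs_held++;
	return 0;
}

static void fake_free_irq(void)
{
	/* Guard against releasing an IRQ that was never requested. */
	assert(irqs_held > 0);
	irqs_held--;
}

/* Hypothetical chip-specific init step that can be made to fail. */
static int fake_chip_init(bool fail)
{
	return fail ? -1 : 0;
}

/* Resources acquired before a failing step are released on the way
 * out, in reverse order of acquisition. */
static int device_init(bool chip_init_fails)
{
	int ret;

	ret = fake_request_irq();
	if (ret)
		goto out;

	ret = fake_chip_init(chip_init_fails);
	if (ret)
		goto err_free_irq;	/* without this jump the IRQ would leak */

	return 0;

err_free_irq:
	fake_free_irq();
out:
	return ret;
}
```

Calling device_init(true) returns the failure code and leaves irqs_held at 0; calling device_init(false) succeeds and leaves the IRQ held for the lifetime of the device.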
Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 5a7d4eea9185c20275307fcd1077d6f9dfdab48a Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:03 2007 -0700 IB/ipath: Discard multicast packets without a GRH This patch fixes a bug where multicast packets without a GRH were not being dropped as per the IB spec. Signed-off-by: Ralph Campbell Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 0ed3c594e3878274787810422760dc7c51e0ee72 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:02 2007 -0700 IB/ipath: Fix calculation for number of kernel PIO buffers If the module parameter "kpiobufs" is set too high, the calculation to reset it to a sane value was incorrect. Signed-off-by: Ralph Campbell Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit c8c6f5d496fe794cbb52fe5a08c2bd839eecaa07 Author: Bryan O'Sullivan Date: Thu Mar 15 14:45:01 2007 -0700 IB/ipath: Remove unused ipath_read_kreg64_port() Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit dd5190b6be0f3e27b6a4933a6a6d2d59957fc748 Author: Ralph Campbell Date: Thu Mar 15 14:45:00 2007 -0700 IB/ipath: Fix RDMA reads of length zero and error handling Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to allow a zero length response. RDMA read responses which don't match the expected length or occur in response to some other operation should generate a completion queue error (see table 56, ch. 9.9.2.3 in the IB spec). Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit c7e29ff11f23ec78b3caf691789c2b791bb596bf Author: Mark Debbage Date: Thu Mar 15 14:44:59 2007 -0700 IB/ipath: Allow receive ports mapped into userspace to be shared Improve port-sharing performance by allowing any process to receive packets from the shared hardware port under a spin lock for mutual exclusion. 
Previously, one process was nominated as the master and that process was responsible for receiving all packets from the shared hardware port and either consuming them or forwarding them to their destination. This led to starvation problems for other processes when the master process was busy in computation phases. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 0a5a83cffc03592c2102ad07b7532b596a16f8cd Author: Ralph Campbell Date: Thu Mar 15 14:44:58 2007 -0700 IB/ipath: Fix port sharing on powerpc The port sharing feature mixed kernel virtual addresses as well as physical addresses for the offset used to describe the mmap address to map the InfiniPath hardware into user space. This had a conflict on powerpc. The new scheme converts it to a physical address so it doesn't conflict with chip addresses and yet still fits in 40/44 bits so it isn't truncated by 32-bit applications calling mmap64(). Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 041eab9136d8325c332429df71d05ba3e0ea8ebc Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:57 2007 -0700 IB/ipath: Fix CQ flushing when QP is modified to error state If a receive work request has been removed from the queue but has not had a CQ entry generated for it and the QP is modified to the error state, the completion entry generated is incorrect. This patch fixes the problem. Signed-off-by: Ralph Campbell Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 614d49a21e96737f84b13f644ac813f8eb6d297a Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:56 2007 -0700 IB/ipath: Fix bad argument to clear_bit() Code was converted from a &= ~mask to clear_bit, but the bit was left shifted instead of being used directly, so we were either trashing memory several pages away, or sometimes taking a kernel page fault on an invalid page. 
Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 8ec1077b35359c973f4b1de7c516be570a6df495 Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:55 2007 -0700 IB/ipath: Change packet problems vs chip errors handling and reporting Some types of packet errors are moderately common with longer IB cables and large clusters, and are not reported with prints by other IB HCA drivers. This suppresses those messages unless the new __IPATH_ERRPKTDBG bit is set in ipath_debug. Reporting of temporarily disabled frequent error interrupts was also made clearer. We also distinguish between chip errors and bad packets sent or received in the wording of the messages. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 6f5c407460bba332d6bee52e19f2305539395511 Author: Ralph Campbell Date: Thu Mar 15 14:44:54 2007 -0700 IB/ipath: Fix PSN update for RC retries This patch fixes a number of bugs with updating the PSN for retries of RC requests. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 0434d271fddaabd65aaa4dbd0145112d6e8aa388 Author: Ralph Campbell Date: Thu Mar 15 14:44:53 2007 -0700 IB/ipath: Fix QP error completion queue entries When switching to the QP error state, the completion queue entries (error or flush) were not being generated correctly. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 39c0d0b919ae5080163bd2d41c0271cda250d382 Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:52 2007 -0700 IB/ipath: Fix up some debug messages ipath_dbg doesn't need the same prefixes that printk does. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 3859e39d75b72f35f7d38c618fbbacb39a440c22 Author: Ralph Campbell Date: Thu Mar 15 14:44:51 2007 -0700 IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC and IB_QP_MAX_QP_RD_ATOMIC This patch adds support for multiple RDMA reads and atomics to be sent before an ACK is required to be seen by the requester.
Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 7b21d26ddad6912bf345e8e88a51a5ce98a036ad Author: Ralph Campbell Date: Thu Mar 15 14:44:50 2007 -0700 IB/ipath: NMI cpu lockup if local loopback used If a post send is done in loopback and there is no receive queue entry, the sending QP is put on a timeout list for a while so the receiver has a chance to post a receive buffer. If another post send is done, the code incorrectly tried to put the QP on the timeout list again and corrupted the timeout list. This eventually leads to a spin lock deadlock NMI due to the timer function looping forever with the lock held. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 9f9630d5e12a51f38513de0d64320a55ab6f02d5 Author: Ralph Campbell Date: Thu Mar 15 14:44:49 2007 -0700 IB/ipath: Fix SRQ limit event causing dropped CQ entry A silly programming error causes a CQ entry to not be generated if a SRQ limit event is generated. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 947d7617a1d876c2c93f73017a734e070c64d43b Author: Ralph Campbell Date: Thu Mar 15 14:44:48 2007 -0700 IB/ipath: Don't initialize port memory for subports A recent change was made to allocate memory for a port after CPU affinity is set. That change didn't account for subports and was trying to allocate memory for the port twice. Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 19085745598ec254fd814411b675b52380c3bac0 Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:47 2007 -0700 IB/ipath: Definitions of two RXE parity err bits were reversed The chip documentation on the expected TID vs eager TID parity error bits was reversed from what was implemented in the RTL, for both chips. This corrects the definitions.
Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 165c552c35052284e8ec4f7e9c027dfd33490e2c Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:46 2007 -0700 IB/ipath: Fix user memory region creation when IOMMU present The loop which initializes the user memory region from an array of pages was using the wrong limit for the array. This worked OK when dma_map_sg() returned the same number as the number of pages. This patch fixes the problem. Signed-off-by: Ralph Campbell Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit 946db67fbf836af30835d610b914cdde0cf467f8 Author: Bryan O'Sullivan Date: Thu Mar 15 14:44:45 2007 -0700 IB/ipath: Add ability to set and clear IB local loopback This is a sticky state. It is useful for diagnosing problems with boards versus cable/switch problems. Signed-off-by: Dave Olson Signed-off-by: Bryan O'Sullivan Signed-off-by: Roland Dreier commit a89875fc7e23ec91561bc3742df3bd5d12b376b4 Author: Roland Dreier Date: Wed Apr 18 20:20:53 2007 -0700 IPoIB: Remove pointless opcode field from debugging output There's no point in printing the opcode field in the completion handling debugging output, since the type of completion is already printed at the beginning of the line. In fact the opcode field is not even defined for completions with a status other than success. Signed-off-by: Roland Dreier commit 9a4b65e35714516980c863bfb7edc5f232b8b458 Author: Hal Rosenstock Date: Mon Apr 2 12:45:16 2007 -0400 IB/umad: Fix declaration of dev_map[] The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in dev_map[]. We shouldn't declare it to be twice as big. 
Pointed-out-by: Roland Dreier Signed-off-by: Hal Rosenstock commit 9b620d2a16814e5f2a063359c953c41f804e091a Author: Roland Dreier Date: Wed Apr 18 20:20:53 2007 -0700 IB: Remove reference to obsolete CONFIG_IPATH_CORE Since commit b1c1b6a3 ("IB/ipath: merge ipath_core and ib_ipath drivers"), CONFIG_IPATH_CORE no longer exists, so there's no reason to have a line for it in drivers/Makefile. Pointed out by Robert P. J. Day. Signed-off-by: Roland Dreier