From clameter@sgi.com Wed Jun 20 10:59:57 2007
Message-Id: <20070620175927.667715964@sgi.com>
User-Agent: quilt/0.46-1
Date: Wed, 20 Jun 2007 10:59:27 -0700
From: clameter@sgi.com
To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Mel Gorman, William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky
Subject: [00/37] Large Blocksize Support V4

V3->V4
- More page cache cleanup using a set of functions
- Disable bouncing when the gfp mask is set up.
- Disable mmap directly in mm/filemap.c to avoid filesystem changes
  while we have no mmap support for higher order pages.

RFC V2->V3
- More restructuring
- It actually works!
- Add XFS support
- Fix up UP support
- Work out the direct I/O issues
- Add CONFIG_LARGE_BLOCKSIZE. Off by default, which makes the inlines
  revert back to constants. Disabled for 32bit and HIGHMEM
  configurations. This also allows a gradual migration to the new page
  cache inline functions. LARGE_BLOCKSIZE capabilities can be added
  gradually, and if there is a problem then we can disable a subsystem.

RFC V1->V2
- Some ext2 support
- Some block layer, fs layer support etc.
- Better page cache macros
- Use macros to clean up code.

This patchset modifies the Linux kernel so that larger block sizes than
page size can be supported. Larger block sizes are handled by using
compound pages of an arbitrary order for the page cache instead of
single pages with order 0 (see the illustrative sketch at the end of
this mail).

Rationales:

1. We have problems supporting devices with a higher blocksize than
   page size. This is for example important to support CDs and DVDs
   that can only read and write 32k or 64k blocks. We currently have a
   shim layer in there to deal with this situation, which limits the
   speed of I/O. The developers are currently looking for ways to
   completely bypass the page cache because of this deficiency.

2. 32/64k blocksize is also used in flash devices. Same issues.

3. Future harddisks will support bigger block sizes than Linux can
   currently handle, since we are limited to PAGE_SIZE. Granted, the
   on-board cache may buffer this for us, but what is the point of
   handling smaller page sizes than the drive supports?

4. Reduce fsck times. Larger block sizes mean faster file system
   checking.

5. Performance. If we look at IA64 vs. x86_64 then it seems that the
   faster interrupt handling on x86_64 compensates for the speed loss
   due to a smaller page size (4k vs 16k on IA64). Supporting larger
   block sizes on all platforms allows a significant reduction in I/O
   overhead and increases the size of I/O that can be performed by
   hardware in a single request, since the number of scatter/gather
   entries per request is typically limited. This is going to become
   increasingly important with ever growing memory sizes, since we may
   otherwise have to handle excessively large numbers of 4k requests
   for data sizes that may become common soon. For example, to write a
   1 terabyte file the kernel would have to handle 256 million 4k
   chunks.

6. Cross-arch compatibility: It is currently not possible to mount a
   16k blocksize ext2 filesystem created on IA64 on an x86_64 system.
   With this patch this becomes possible.

How to make this work:

1. Apply this patchset to 2.6.22-rc4-mm2
2. Configure LARGE_BLOCKSIZE support
3. Compile the kernel
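As an aside for reviewers: the following is a minimal, userspace-only
sketch of the index/offset arithmetic that a higher-order page cache
has to perform (patch 01 introduces the in-kernel page_cache_xxx
helpers for this). The order value of 2 (16k units on a 4k base page)
and the pc_index/pc_offset helper names are made up for illustration
only and are not part of the patchset.

	#include <stdio.h>

	#define BASE_PAGE_SHIFT 12	/* 4k base pages */

	/* page number that contains byte position pos */
	static unsigned long pc_index(int order, unsigned long long pos)
	{
		return pos >> (BASE_PAGE_SHIFT + order);
	}

	/* byte offset of pos within its (compound) page */
	static unsigned int pc_offset(int order, unsigned long long pos)
	{
		return pos & ((1ULL << (BASE_PAGE_SHIFT + order)) - 1);
	}

	int main(void)
	{
		unsigned long long pos = 100000;	/* arbitrary file position */
		int order = 2;				/* 16k pages */

		/* prints index=6 offset=1696 for the values above */
		printf("index=%lu offset=%u\n",
		       pc_index(order, pos), pc_offset(order, pos));
		return 0;
	}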
Tested file systems:

Filesystem	Max Blocksize	Changes
Reiserfs	8k		Page size functions
Ext2		64k		Page size functions
XFS		64k		Page size functions / Remove PAGE_SIZE check
RAMFS		MAX_ORDER	Parameter to specify order

Todo:
- Antifragmentation in mm does address some fragmentation issues
  (typically works up to 32k blocksize). However, large orders lead to
  fragmentation of the movable sections. It seems that we need Mel's
  memory compaction to support even larger orders.
- Remove the PAGE_CACHE_xxx constants after using the page_cache_xxx
  functions everywhere. But that will have to wait until merging
  becomes possible. For now certain subsystems (e.g. shmem) are not
  using these functions. They will only use order 0 pages.
- Support for non-harddisk-based filesystems. Remove the pktdvd etc.
  layers that are needed because the VM currently does not support
  sufficiently large blocksizes for these devices. Look for other
  places in the kernel where we have similar issues.
- Mmap read support
- Full mmap support

--

From clameter@sgi.com Wed Jun 20 10:59:57 2007
Message-Id: <20070620175957.407825225@sgi.com>
References: <20070620175927.667715964@sgi.com>
User-Agent: quilt/0.46-1
Date: Wed, 20 Jun 2007 10:59:28 -0700
From: clameter@sgi.com
To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Mel Gorman, William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky
Subject: [01/37] Define functions for page cache handling
Content-Disposition: inline; filename=vps_page_cache_functions

We use the macros PAGE_CACHE_SIZE, PAGE_CACHE_SHIFT, PAGE_CACHE_MASK
and PAGE_CACHE_ALIGN in various places in the kernel. Many times common
operations like calculating the offset or the index are coded using
shifts and adds. This patch provides inline functions to get the
calculations accomplished in a consistent way.

All functions take an address_space pointer. The address space pointer
will be used in the future to eventually support a variable size page
cache. Information reachable via the mapping may then determine the
page size.

New function			Related base page constant
---------------------------------------------------------
page_cache_shift(a)		PAGE_CACHE_SHIFT
page_cache_size(a)		PAGE_CACHE_SIZE
page_cache_mask(a)		PAGE_CACHE_MASK
page_cache_index(a, pos)	Calculate page number from position
page_cache_next(a, pos)		Page number of next page
page_cache_pos(a, index, offset)
				Form position based on page number
				and an offset
page_cache_offset(a, pos)	Calculate offset into a page

This provides a basis that would allow the conversion of all page cache
handling in the kernel and ultimately allow the removal of the
PAGE_CACHE_* constants.

Signed-off-by: Christoph Lameter

---
 include/linux/pagemap.h |   54 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

Index: vps/include/linux/pagemap.h
===================================================================
--- vps.orig/include/linux/pagemap.h	2007-06-08 10:57:49.000000000 -0700
+++ vps/include/linux/pagemap.h	2007-06-08 11:01:37.000000000 -0700
@@ -52,12 +52,66 @@ static inline void mapping_set_gfp_mask(
  * space in smaller chunks for same flexibility).
  *
  * Or rather, it _will_ be done in larger chunks.
+ *
+ * The following constants can be used if a filesystem only supports a single
+ * page size.
  */
 #define PAGE_CACHE_SHIFT	PAGE_SHIFT
 #define PAGE_CACHE_SIZE		PAGE_SIZE
 #define PAGE_CACHE_MASK		PAGE_MASK
 #define PAGE_CACHE_ALIGN(addr)	(((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
 
+/*
+ * Functions that are currently setup for a fixed PAGE_SIZE. The use of
+ * these will allow a variable page size pagecache in the future.
+ */
+static inline int mapping_order(struct address_space *a)
+{
+	return 0;
+}
+
+static inline int page_cache_shift(struct address_space *a)
+{
+	return PAGE_SHIFT;
+}
+
+static inline unsigned int page_cache_size(struct address_space *a)
+{
+	return PAGE_SIZE;
+}
+
+static inline loff_t page_cache_mask(struct address_space *a)
+{
+	return (loff_t)PAGE_MASK;
+}
+
+static inline unsigned int page_cache_offset(struct address_space *a,
+		loff_t pos)
+{
+	return pos & ~PAGE_MASK;
+}
+
+static inline pgoff_t page_cache_index(struct address_space *a,
+		loff_t pos)
+{
+	return pos >> page_cache_shift(a);
+}
+
+/*
+ * Index of the page starting on or after the given position.
+ */
+static inline pgoff_t page_cache_next(struct address_space *a,
+		loff_t pos)
+{
+	return page_cache_index(a, pos + page_cache_size(a) - 1);
+}
+
+static inline loff_t page_cache_pos(struct address_space *a,
+		pgoff_t index, unsigned long offset)
+{
+	return ((loff_t)index << page_cache_shift(a)) + offset;
+}
+
 #define page_cache_get(page)		get_page(page)
 #define page_cache_release(page)	put_page(page)
 void release_pages(struct page **pages, int nr, int cold);

--

From clameter@sgi.com Wed Jun 20 10:59:57 2007
Message-Id: <20070620175957.587800058@sgi.com>
References: <20070620175927.667715964@sgi.com>
User-Agent: quilt/0.46-1
Date: Wed, 20 Jun 2007 10:59:29 -0700
From: clameter@sgi.com
To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Mel Gorman, William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky
Subject: [02/37] Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user
Content-Disposition: inline; filename=zero_user_segments

Simplify page cache zeroing of segments of pages through three
functions:

zero_user_segments(page, start1, end1, start2, end2)
	Zeros two segments of the page. It takes the positions where
	zeroing starts and ends, which avoids length calculations.

zero_user_segment(page, start, end)
	Same for a single segment.

zero_user(page, start, length)
	Length variant for the case where we know the length.

We remove the zero_user_page macro. Issues:

1. It's a macro. Inline functions are preferable.

2. The KM_USER0 macro is only defined for HIGHMEM. Having to treat this
   special case everywhere makes the code needlessly complex. The
   parameter for zeroing is always KM_USER0 except in one single case
   that we open code.

Avoiding KM_USER0 means a lot of code no longer has to deal with the
special casing for HIGHMEM. Dealing with kmap is only necessary for
HIGHMEM configurations. In those configurations we use KM_USER0 like we
do for a series of other functions defined in highmem.h.

Since KM_USER0 depends on HIGHMEM, the existing zero_user_page could
not be an inline function. The zero_user_* functions introduced here
can be, because that constant is not used when these functions are
called.

Also extract the flushing of the caches to be outside of the kmap.
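As a usage illustration (a sketch only, mirroring one of the
conversions in the diff below): where __block_prepare_write used to
open-code a kmap_atomic/memset/flush_dcache_page/kunmap_atomic sequence
for the bytes outside of the written range [from, to), it now collapses
to a single call:

	/* zero the parts of the page outside of [from, to) */
	if (block_end > to || block_start < from)
		zero_user_segments(page,
			to, block_end,		/* tail segment */
			block_start, from);	/* head segment */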
Signed-off-by: Christoph Lameter --- fs/affs/file.c | 2 - fs/buffer.c | 47 +++++++++-------------------- fs/direct-io.c | 4 +- fs/ecryptfs/mmap.c | 5 +-- fs/ext3/inode.c | 4 +- fs/gfs2/bmap.c | 2 - fs/libfs.c | 19 +++--------- fs/mpage.c | 7 +--- fs/nfs/read.c | 10 +++--- fs/nfs/write.c | 2 - fs/ntfs/aops.c | 18 ++++++----- fs/ntfs/file.c | 32 +++++++++----------- fs/ocfs2/aops.c | 2 - fs/reiser4/plugin/file/cryptcompress.c | 8 +---- fs/reiser4/plugin/file/file.c | 2 - fs/reiser4/plugin/item/ctail.c | 2 - fs/reiser4/plugin/item/extent_file_ops.c | 4 +- fs/reiser4/plugin/item/tail.c | 3 - fs/reiserfs/inode.c | 4 +- fs/xfs/linux-2.6/xfs_lrw.c | 2 - include/linux/highmem.h | 49 +++++++++++++++++++------------ mm/filemap_xip.c | 2 - mm/truncate.c | 2 - 23 files changed, 107 insertions(+), 125 deletions(-) Index: vps/include/linux/highmem.h =================================================================== --- vps.orig/include/linux/highmem.h 2007-06-11 22:33:01.000000000 -0700 +++ vps/include/linux/highmem.h 2007-06-11 22:33:07.000000000 -0700 @@ -124,28 +124,41 @@ static inline void clear_highpage(struct kunmap_atomic(kaddr, KM_USER0); } -/* - * Same but also flushes aliased cache contents to RAM. - * - * This must be a macro because KM_USER0 and friends aren't defined if - * !CONFIG_HIGHMEM - */ -#define zero_user_page(page, offset, size, km_type) \ - do { \ - void *kaddr; \ - \ - BUG_ON((offset) + (size) > PAGE_SIZE); \ - \ - kaddr = kmap_atomic(page, km_type); \ - memset((char *)kaddr + (offset), 0, (size)); \ - flush_dcache_page(page); \ - kunmap_atomic(kaddr, (km_type)); \ - } while (0) +static inline void zero_user_segments(struct page *page, + unsigned start1, unsigned end1, + unsigned start2, unsigned end2) +{ + void *kaddr = kmap_atomic(page, KM_USER0); + + BUG_ON(end1 > PAGE_SIZE || + end2 > PAGE_SIZE); + + if (end1 > start1) + memset(kaddr + start1, 0, end1 - start1); + + if (end2 > start2) + memset(kaddr + start2, 0, end2 - start2); + + kunmap_atomic(kaddr, KM_USER0); + flush_dcache_page(page); +} + +static inline void zero_user_segment(struct page *page, + unsigned start, unsigned end) +{ + zero_user_segments(page, start, end, 0, 0); +} + +static inline void zero_user(struct page *page, + unsigned start, unsigned size) +{ + zero_user_segments(page, start, start + size, 0, 0); +} static inline void __deprecated memclear_highpage_flush(struct page *page, unsigned int offset, unsigned int size) { - zero_user_page(page, offset, size, KM_USER0); + zero_user(page, offset, size); } #ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE Index: vps/fs/buffer.c =================================================================== --- vps.orig/fs/buffer.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/buffer.c 2007-06-11 22:49:08.000000000 -0700 @@ -1792,7 +1792,7 @@ void page_zero_new_buffers(struct page * start = max(from, block_start); size = min(to, block_end) - start; - zero_user_page(page, start, size, KM_USER0); + zero_user(page, start, size); set_buffer_uptodate(bh); } @@ -1855,19 +1855,10 @@ static int __block_prepare_write(struct mark_buffer_dirty(bh); continue; } - if (block_end > to || block_start < from) { - void *kaddr; - - kaddr = kmap_atomic(page, KM_USER0); - if (block_end > to) - memset(kaddr+to, 0, - block_end-to); - if (block_start < from) - memset(kaddr+block_start, - 0, from-block_start); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); - } + if (block_end > to || block_start < from) + zero_user_segments(page, + to, block_end, + block_start, from); continue; } } @@ -2095,8 
+2086,7 @@ int block_read_full_page(struct page *pa SetPageError(page); } if (!buffer_mapped(bh)) { - zero_user_page(page, i * blocksize, blocksize, - KM_USER0); + zero_user(page, i * blocksize, blocksize); if (!err) set_buffer_uptodate(bh); continue; @@ -2209,7 +2199,7 @@ int cont_expand_zero(struct file *file, &page, &fsdata); if (err) goto out; - zero_user_page(page, zerofrom, len, KM_USER0); + zero_user(page, zerofrom, len); err = pagecache_write_end(file, mapping, curpos, len, len, page, fsdata); if (err < 0) @@ -2236,7 +2226,7 @@ int cont_expand_zero(struct file *file, &page, &fsdata); if (err) goto out; - zero_user_page(page, zerofrom, len, KM_USER0); + zero_user(page, zerofrom, len); err = pagecache_write_end(file, mapping, curpos, len, len, page, fsdata); if (err < 0) @@ -2350,7 +2340,6 @@ int nobh_prepare_write(struct page *page unsigned block_in_page; unsigned block_start; sector_t block_in_file; - char *kaddr; int nr_reads = 0; int i; int ret = 0; @@ -2390,13 +2379,8 @@ int nobh_prepare_write(struct page *page if (PageUptodate(page)) continue; if (buffer_new(&map_bh) || !buffer_mapped(&map_bh)) { - kaddr = kmap_atomic(page, KM_USER0); - if (block_start < from) - memset(kaddr+block_start, 0, from-block_start); - if (block_end > to) - memset(kaddr + to, 0, block_end - to); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); + zero_user_segments(page, block_start, from, + to, block_end); continue; } if (buffer_uptodate(&map_bh)) @@ -2462,7 +2446,7 @@ failed: * Error recovery is pretty slack. Clear the page and mark it dirty * so we'll later zero out any blocks which _were_ allocated. */ - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); set_page_dirty(page); return ret; @@ -2531,7 +2515,7 @@ int nobh_writepage(struct page *page, ge * the page size, the remaining memory is zeroed when mapped, and * writes to that region are not written out to the file." */ - zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0); + zero_user_segment(page, offset, PAGE_CACHE_SIZE); out: ret = mpage_writepage(page, get_block, wbc); if (ret == -EAGAIN) @@ -2565,8 +2549,7 @@ int nobh_truncate_page(struct address_sp to = (offset + blocksize) & ~(blocksize - 1); ret = a_ops->prepare_write(NULL, page, offset, to); if (ret == 0) { - zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, - KM_USER0); + zero_user_segment(page, offset, PAGE_CACHE_SIZE); /* * It would be more correct to call aops->commit_write() * here, but this is more efficient. @@ -2645,7 +2628,7 @@ int block_truncate_page(struct address_s goto unlock; } - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); mark_buffer_dirty(bh); err = 0; @@ -2691,7 +2674,7 @@ int block_write_full_page(struct page *p * the page size, the remaining memory is zeroed when mapped, and * writes to that region are not written out to the file." 
*/ - zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0); + zero_user_segment(page, offset, PAGE_CACHE_SIZE); return __block_write_full_page(inode, page, get_block, wbc); } Index: vps/fs/libfs.c =================================================================== --- vps.orig/fs/libfs.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/libfs.c 2007-06-11 22:49:09.000000000 -0700 @@ -340,13 +340,10 @@ int simple_prepare_write(struct file *fi unsigned from, unsigned to) { if (!PageUptodate(page)) { - if (to - from != PAGE_CACHE_SIZE) { - void *kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr, 0, from); - memset(kaddr + to, 0, PAGE_CACHE_SIZE - to); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); - } + if (to - from != PAGE_CACHE_SIZE) + zero_user_segments(page, + 0, from, + to, PAGE_CACHE_SIZE); } return 0; } @@ -396,12 +393,8 @@ int simple_write_end(struct file *file, unsigned from = pos & (PAGE_CACHE_SIZE - 1); /* zero the stale part of the page if we did a short copy */ - if (copied < len) { - void *kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + from + copied, 0, len - copied); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); - } + if (copied < len) + zero_user(page, from + copied, len); simple_commit_write(file, page, from, from+copied); Index: vps/fs/affs/file.c =================================================================== --- vps.orig/fs/affs/file.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/affs/file.c 2007-06-11 22:33:07.000000000 -0700 @@ -628,7 +628,7 @@ static int affs_prepare_write_ofs(struct return err; } if (to < PAGE_CACHE_SIZE) { - zero_user_page(page, to, PAGE_CACHE_SIZE - to, KM_USER0); + zero_user_segment(page, to, PAGE_CACHE_SIZE); if (size > offset + to) { if (size < offset + PAGE_CACHE_SIZE) tmp = size & ~PAGE_CACHE_MASK; Index: vps/fs/mpage.c =================================================================== --- vps.orig/fs/mpage.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/mpage.c 2007-06-11 22:49:08.000000000 -0700 @@ -284,9 +284,7 @@ do_mpage_readpage(struct bio *bio, struc } if (first_hole != blocks_per_page) { - zero_user_page(page, first_hole << blkbits, - PAGE_CACHE_SIZE - (first_hole << blkbits), - KM_USER0); + zero_user_segment(page, first_hole << blkbits, PAGE_CACHE_SIZE); if (first_hole == 0) { SetPageUptodate(page); unlock_page(page); @@ -579,8 +577,7 @@ page_is_mapped: if (page->index > end_index || !offset) goto confused; - zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, - KM_USER0); + zero_user_segment(page, offset, PAGE_CACHE_SIZE); } /* Index: vps/fs/ntfs/aops.c =================================================================== --- vps.orig/fs/ntfs/aops.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/ntfs/aops.c 2007-06-11 22:33:07.000000000 -0700 @@ -87,13 +87,17 @@ static void ntfs_end_buffer_async_read(s /* Check for the current buffer head overflowing. 
*/ if (unlikely(file_ofs + bh->b_size > init_size)) { int ofs; + void *kaddr; ofs = 0; if (file_ofs < init_size) ofs = init_size - file_ofs; local_irq_save(flags); - zero_user_page(page, bh_offset(bh) + ofs, - bh->b_size - ofs, KM_BIO_SRC_IRQ); + kaddr = kmap_atomic(page, KM_BIO_SRC_IRQ); + memset(kaddr + bh_offset(bh) + ofs, 0, + bh->b_size - ofs); + flush_dcache_page(page); + kunmap_atomic(kaddr, KM_BIO_SRC_IRQ); local_irq_restore(flags); } } else { @@ -334,7 +338,7 @@ handle_hole: bh->b_blocknr = -1UL; clear_buffer_mapped(bh); handle_zblock: - zero_user_page(page, i * blocksize, blocksize, KM_USER0); + zero_user(page, i * blocksize, blocksize); if (likely(!err)) set_buffer_uptodate(bh); } while (i++, iblock++, (bh = bh->b_this_page) != head); @@ -451,7 +455,7 @@ retry_readpage: * ok to ignore the compressed flag here. */ if (unlikely(page->index > 0)) { - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); goto done; } if (!NInoAttr(ni)) @@ -780,8 +784,7 @@ lock_retry_remap: if (err == -ENOENT || lcn == LCN_ENOENT) { bh->b_blocknr = -1; clear_buffer_dirty(bh); - zero_user_page(page, bh_offset(bh), blocksize, - KM_USER0); + zero_user(page, bh_offset(bh), blocksize); set_buffer_uptodate(bh); err = 0; continue; @@ -1406,8 +1409,7 @@ retry_writepage: if (page->index >= (i_size >> PAGE_CACHE_SHIFT)) { /* The page straddles i_size. */ unsigned int ofs = i_size & ~PAGE_CACHE_MASK; - zero_user_page(page, ofs, PAGE_CACHE_SIZE - ofs, - KM_USER0); + zero_user_segment(page, ofs, PAGE_CACHE_SIZE); } /* Handle mst protected attributes. */ if (NInoMstProtected(ni)) Index: vps/fs/reiserfs/inode.c =================================================================== --- vps.orig/fs/reiserfs/inode.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiserfs/inode.c 2007-06-11 22:33:07.000000000 -0700 @@ -2151,7 +2151,7 @@ int reiserfs_truncate_file(struct inode /* if we are not on a block boundary */ if (length) { length = blocksize - length; - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); if (buffer_mapped(bh) && bh->b_blocknr != 0) { mark_buffer_dirty(bh); } @@ -2375,7 +2375,7 @@ static int reiserfs_write_full_page(stru unlock_page(page); return 0; } - zero_user_page(page, last_offset, PAGE_CACHE_SIZE - last_offset, KM_USER0); + zero_user_segment(page, last_offset, PAGE_CACHE_SIZE); } bh = head; block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits); Index: vps/mm/truncate.c =================================================================== --- vps.orig/mm/truncate.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/mm/truncate.c 2007-06-11 22:54:27.000000000 -0700 @@ -47,7 +47,7 @@ void do_invalidatepage(struct page *page static inline void truncate_partial_page(struct page *page, unsigned partial) { - zero_user_page(page, partial, PAGE_CACHE_SIZE - partial, KM_USER0); + zero_user_segment(page, partial, PAGE_CACHE_SIZE); if (PagePrivate(page)) do_invalidatepage(page, partial); } Index: vps/fs/direct-io.c =================================================================== --- vps.orig/fs/direct-io.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/direct-io.c 2007-06-11 22:33:07.000000000 -0700 @@ -887,8 +887,8 @@ do_holes: page_cache_release(page); goto out; } - zero_user_page(page, block_in_page << blkbits, - 1 << blkbits, KM_USER0); + zero_user(page, block_in_page << blkbits, + 1 << blkbits); dio->block_in_file++; block_in_page++; goto next_block; Index: vps/mm/filemap_xip.c 
=================================================================== --- vps.orig/mm/filemap_xip.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/mm/filemap_xip.c 2007-06-11 22:54:27.000000000 -0700 @@ -461,7 +461,7 @@ xip_truncate_page(struct address_space * else return PTR_ERR(page); } - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); return 0; } EXPORT_SYMBOL_GPL(xip_truncate_page); Index: vps/fs/ext3/inode.c =================================================================== --- vps.orig/fs/ext3/inode.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/ext3/inode.c 2007-06-11 22:33:07.000000000 -0700 @@ -1818,7 +1818,7 @@ static int ext3_block_truncate_page(hand */ if (!page_has_buffers(page) && test_opt(inode->i_sb, NOBH) && ext3_should_writeback_data(inode) && PageUptodate(page)) { - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); set_page_dirty(page); goto unlock; } @@ -1871,7 +1871,7 @@ static int ext3_block_truncate_page(hand goto unlock; } - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); BUFFER_TRACE(bh, "zeroed end of block"); err = 0; Index: vps/fs/ntfs/file.c =================================================================== --- vps.orig/fs/ntfs/file.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/ntfs/file.c 2007-06-11 22:33:07.000000000 -0700 @@ -607,8 +607,8 @@ do_next_page: ntfs_submit_bh_for_read(bh); *wait_bh++ = bh; } else { - zero_user_page(page, bh_offset(bh), - blocksize, KM_USER0); + zero_user(page, bh_offset(bh), + blocksize); set_buffer_uptodate(bh); } } @@ -683,9 +683,8 @@ map_buffer_cached: ntfs_submit_bh_for_read(bh); *wait_bh++ = bh; } else { - zero_user_page(page, - bh_offset(bh), - blocksize, KM_USER0); + zero_user(page, bh_offset(bh), + blocksize); set_buffer_uptodate(bh); } } @@ -703,8 +702,8 @@ map_buffer_cached: */ if (bh_end <= pos || bh_pos >= end) { if (!buffer_uptodate(bh)) { - zero_user_page(page, bh_offset(bh), - blocksize, KM_USER0); + zero_user(page, bh_offset(bh), + blocksize); set_buffer_uptodate(bh); } mark_buffer_dirty(bh); @@ -743,8 +742,7 @@ map_buffer_cached: if (!buffer_uptodate(bh)) set_buffer_uptodate(bh); } else if (!buffer_uptodate(bh)) { - zero_user_page(page, bh_offset(bh), blocksize, - KM_USER0); + zero_user(page, bh_offset(bh), blocksize); set_buffer_uptodate(bh); } continue; @@ -868,8 +866,8 @@ rl_not_mapped_enoent: if (!buffer_uptodate(bh)) set_buffer_uptodate(bh); } else if (!buffer_uptodate(bh)) { - zero_user_page(page, bh_offset(bh), - blocksize, KM_USER0); + zero_user(page, bh_offset(bh), + blocksize); set_buffer_uptodate(bh); } continue; @@ -1128,8 +1126,8 @@ rl_not_mapped_enoent: if (likely(bh_pos < initialized_size)) ofs = initialized_size - bh_pos; - zero_user_page(page, bh_offset(bh) + ofs, - blocksize - ofs, KM_USER0); + zero_user_segment(page, bh_offset(bh) + ofs, + blocksize); } } else /* if (unlikely(!buffer_uptodate(bh))) */ err = -EIO; @@ -1269,8 +1267,8 @@ rl_not_mapped_enoent: if (PageUptodate(page)) set_buffer_uptodate(bh); else { - zero_user_page(page, bh_offset(bh), - blocksize, KM_USER0); + zero_user(page, bh_offset(bh), + blocksize); set_buffer_uptodate(bh); } } @@ -1330,7 +1328,7 @@ err_out: len = PAGE_CACHE_SIZE; if (len > bytes) len = bytes; - zero_user_page(*pages, 0, len, KM_USER0); + zero_user(*pages, 0, len); } goto out; } @@ -1451,7 +1449,7 @@ err_out: len = PAGE_CACHE_SIZE; if (len > bytes) len = bytes; - zero_user_page(*pages, 0, len, KM_USER0); + zero_user(*pages, 0, len); } goto out; } Index: 
vps/fs/nfs/read.c =================================================================== --- vps.orig/fs/nfs/read.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/nfs/read.c 2007-06-11 22:33:07.000000000 -0700 @@ -79,7 +79,7 @@ void nfs_readdata_release(void *data) static int nfs_return_empty_page(struct page *page) { - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); unlock_page(page); return 0; @@ -103,10 +103,10 @@ static void nfs_readpage_truncate_uninit pglen = PAGE_CACHE_SIZE - base; for (;;) { if (remainder <= pglen) { - zero_user_page(*pages, base, remainder, KM_USER0); + zero_user(*pages, base, remainder); break; } - zero_user_page(*pages, base, pglen, KM_USER0); + zero_user(*pages, base, pglen); pages++; remainder -= pglen; pglen = PAGE_CACHE_SIZE; @@ -130,7 +130,7 @@ static int nfs_readpage_async(struct nfs return PTR_ERR(new); } if (len < PAGE_CACHE_SIZE) - zero_user_page(page, len, PAGE_CACHE_SIZE - len, KM_USER0); + zero_user_segment(page, len, PAGE_CACHE_SIZE); nfs_list_add_request(new, &one_request); if (NFS_SERVER(inode)->rsize < PAGE_CACHE_SIZE) @@ -538,7 +538,7 @@ readpage_async_filler(void *data, struct goto out_error; if (len < PAGE_CACHE_SIZE) - zero_user_page(page, len, PAGE_CACHE_SIZE - len, KM_USER0); + zero_user_segment(page, len, PAGE_CACHE_SIZE); nfs_pageio_add_request(desc->pgio, new); return 0; out_error: Index: vps/fs/nfs/write.c =================================================================== --- vps.orig/fs/nfs/write.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/nfs/write.c 2007-06-11 22:33:07.000000000 -0700 @@ -168,7 +168,7 @@ static void nfs_mark_uptodate(struct pag if (count != nfs_page_length(page)) return; if (count != PAGE_CACHE_SIZE) - zero_user_page(page, count, PAGE_CACHE_SIZE - count, KM_USER0); + zero_user_segment(page, count, PAGE_CACHE_SIZE); SetPageUptodate(page); } Index: vps/fs/xfs/linux-2.6/xfs_lrw.c =================================================================== --- vps.orig/fs/xfs/linux-2.6/xfs_lrw.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/xfs/linux-2.6/xfs_lrw.c 2007-06-11 22:33:07.000000000 -0700 @@ -154,7 +154,7 @@ xfs_iozero( if (status) break; - zero_user_page(page, offset, bytes, KM_USER0); + zero_user(page, offset, bytes); status = pagecache_write_end(NULL, mapping, pos, bytes, bytes, page, fsdata); Index: vps/fs/ecryptfs/mmap.c =================================================================== --- vps.orig/fs/ecryptfs/mmap.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/ecryptfs/mmap.c 2007-06-11 22:33:07.000000000 -0700 @@ -370,8 +370,7 @@ static int fill_zeros_to_end_of_page(str end_byte_in_page = i_size_read(inode) % PAGE_CACHE_SIZE; if (to > end_byte_in_page) end_byte_in_page = to; - zero_user_page(page, end_byte_in_page, - PAGE_CACHE_SIZE - end_byte_in_page, KM_USER0); + zero_user_segment(page, end_byte_in_page, PAGE_CACHE_SIZE); out: return 0; } @@ -784,7 +783,7 @@ int write_zeros(struct file *file, pgoff page_cache_release(tmp_page); goto out; } - zero_user_page(tmp_page, start, num_zeros, KM_USER0); + zero_user(tmp_page, start, num_zeros); rc = ecryptfs_commit_write(file, tmp_page, start, start + num_zeros); if (rc < 0) { ecryptfs_printk(KERN_ERR, "Error attempting to write zero's " Index: vps/fs/gfs2/bmap.c =================================================================== --- vps.orig/fs/gfs2/bmap.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/gfs2/bmap.c 2007-06-11 22:33:07.000000000 -0700 @@ -932,7 +932,7 @@ static int 
gfs2_block_truncate_page(stru if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip)) gfs2_trans_add_bh(ip->i_gl, bh, 0); - zero_user_page(page, offset, length, KM_USER0); + zero_user(page, offset, length); unlock: unlock_page(page); Index: vps/fs/ocfs2/aops.c =================================================================== --- vps.orig/fs/ocfs2/aops.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/ocfs2/aops.c 2007-06-11 22:33:07.000000000 -0700 @@ -238,7 +238,7 @@ static int ocfs2_readpage(struct file *f * XXX sys_readahead() seems to get that wrong? */ if (start >= i_size_read(inode)) { - zero_user_page(page, 0, PAGE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_SIZE); SetPageUptodate(page); ret = 0; goto out_alloc; Index: vps/fs/reiser4/plugin/file/cryptcompress.c =================================================================== --- vps.orig/fs/reiser4/plugin/file/cryptcompress.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiser4/plugin/file/cryptcompress.c 2007-06-11 22:33:07.000000000 -0700 @@ -1933,7 +1933,7 @@ static int write_hole(struct inode *inod to_pg = min_count(PAGE_CACHE_SIZE - pg_off, cl_count); lock_page(page); - zero_user_page(page, pg_off, to_pg, KM_USER0); + zero_user(page, pg_off, to_pg); SetPageUptodate(page); unlock_page(page); @@ -2169,8 +2169,7 @@ static int read_some_cluster_pages(struc off = off_to_pgoff(win->off+win->count+win->delta); if (off) { lock_page(pg); - zero_user_page(pg, off, PAGE_CACHE_SIZE - off, - KM_USER0); + zero_user_segment(pg, off, PAGE_CACHE_SIZE); unlock_page(pg); } } @@ -2217,8 +2216,7 @@ static int read_some_cluster_pages(struc offset = off_to_pgoff(win->off + win->count + win->delta); - zero_user_page(pg, offset, PAGE_CACHE_SIZE - offset, - KM_USER0); + zero_user_segment(pg, offset, PAGE_CACHE_SIZE); unlock_page(pg); /* still not uptodate */ break; Index: vps/fs/reiser4/plugin/file/file.c =================================================================== --- vps.orig/fs/reiser4/plugin/file/file.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiser4/plugin/file/file.c 2007-06-11 22:33:07.000000000 -0700 @@ -538,7 +538,7 @@ static int shorten_file(struct inode *in lock_page(page); assert("vs-1066", PageLocked(page)); - zero_user_page(page, padd_from, PAGE_CACHE_SIZE - padd_from, KM_USER0); + zero_user_segment(page, padd_from, PAGE_CACHE_SIZE); unlock_page(page); page_cache_release(page); /* the below does up(sbinfo->delete_mutex). 
Do not get confused */ Index: vps/fs/reiser4/plugin/item/ctail.c =================================================================== --- vps.orig/fs/reiser4/plugin/item/ctail.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiser4/plugin/item/ctail.c 2007-06-11 22:33:07.000000000 -0700 @@ -627,7 +627,7 @@ int do_readpage_ctail(struct inode * ino #endif case FAKE_DISK_CLUSTER: /* fill the page by zeroes */ - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); break; case PREP_DISK_CLUSTER: Index: vps/fs/reiser4/plugin/item/extent_file_ops.c =================================================================== --- vps.orig/fs/reiser4/plugin/item/extent_file_ops.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiser4/plugin/item/extent_file_ops.c 2007-06-11 22:33:07.000000000 -0700 @@ -1112,7 +1112,7 @@ int reiser4_do_readpage_extent(reiser4_e */ j = jfind(mapping, index); if (j == NULL) { - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); unlock_page(page); return 0; @@ -1127,7 +1127,7 @@ int reiser4_do_readpage_extent(reiser4_e block = *jnode_get_io_block(j); spin_unlock_jnode(j); if (block == 0) { - zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0); + zero_user(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); unlock_page(page); jput(j); Index: vps/fs/reiser4/plugin/item/tail.c =================================================================== --- vps.orig/fs/reiser4/plugin/item/tail.c 2007-06-11 22:33:01.000000000 -0700 +++ vps/fs/reiser4/plugin/item/tail.c 2007-06-11 22:33:07.000000000 -0700 @@ -392,8 +392,7 @@ static int do_readpage_tail(uf_coord_t * done: if (mapped != PAGE_CACHE_SIZE) - zero_user_page(page, mapped, PAGE_CACHE_SIZE - mapped, - KM_USER0); + zero_user_segment(page, mapped, PAGE_CACHE_SIZE); SetPageUptodate(page); out_unlock_page: unlock_page(page); -- From clameter@sgi.com Wed Jun 20 10:59:57 2007 Message-Id: <20070620175957.751816393@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:30 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [03/37] Use page_cache_xxx function in mm/filemap.c Content-Disposition: inline; filename=vps_mm_filemap Signed-off-by: Christoph Lameter --- mm/filemap.c | 76 +++++++++++++++++++++++++++++------------------------------ 1 file changed, 38 insertions(+), 38 deletions(-) Index: vps/mm/filemap.c =================================================================== --- vps.orig/mm/filemap.c 2007-06-08 10:57:37.000000000 -0700 +++ vps/mm/filemap.c 2007-06-09 21:15:04.000000000 -0700 @@ -304,8 +304,8 @@ EXPORT_SYMBOL(add_to_page_cache_lru); int sync_page_range(struct inode *inode, struct address_space *mapping, loff_t pos, loff_t count) { - pgoff_t start = pos >> PAGE_CACHE_SHIFT; - pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT; + pgoff_t start = page_cache_index(mapping, pos); + pgoff_t end = page_cache_index(mapping, pos + count - 1); int ret; if (!mapping_cap_writeback_dirty(mapping) || !count) @@ -336,8 +336,8 @@ EXPORT_SYMBOL(sync_page_range); int sync_page_range_nolock(struct inode *inode, struct address_space *mapping, loff_t pos, loff_t count) { - pgoff_t start = pos >> PAGE_CACHE_SHIFT; - pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT; + pgoff_t start = page_cache_index(mapping, pos); 
+ pgoff_t end = page_cache_index(mapping, pos + count - 1); int ret; if (!mapping_cap_writeback_dirty(mapping) || !count) @@ -366,7 +366,7 @@ int filemap_fdatawait(struct address_spa return 0; return wait_on_page_writeback_range(mapping, 0, - (i_size - 1) >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, i_size - 1)); } EXPORT_SYMBOL(filemap_fdatawait); @@ -414,8 +414,8 @@ int filemap_write_and_wait_range(struct /* See comment of filemap_write_and_wait() */ if (err != -EIO) { int err2 = wait_on_page_writeback_range(mapping, - lstart >> PAGE_CACHE_SHIFT, - lend >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, lstart), + page_cache_index(mapping, lend)); if (!err) err = err2; } @@ -881,28 +881,28 @@ void do_generic_mapping_read(struct addr int error; struct file_ra_state ra = *_ra; - index = *ppos >> PAGE_CACHE_SHIFT; + index = page_cache_index(mapping, *ppos); next_index = index; prev_index = ra.prev_index; prev_offset = ra.prev_offset; - last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT; - offset = *ppos & ~PAGE_CACHE_MASK; + last_index = page_cache_next(mapping, *ppos + desc->count); + offset = page_cache_offset(mapping, *ppos); isize = i_size_read(inode); if (!isize) goto out; - end_index = (isize - 1) >> PAGE_CACHE_SHIFT; + end_index = page_cache_index(mapping, isize - 1); for (;;) { struct page *page; unsigned long nr, ret; /* nr is the maximum number of bytes to copy from this page */ - nr = PAGE_CACHE_SIZE; + nr = page_cache_size(mapping); if (index >= end_index) { if (index > end_index) goto out; - nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1; + nr = page_cache_offset(mapping, isize - 1) + 1; if (nr <= offset) { goto out; } @@ -956,8 +956,8 @@ page_ok: */ ret = actor(desc, page, offset, nr); offset += ret; - index += offset >> PAGE_CACHE_SHIFT; - offset &= ~PAGE_CACHE_MASK; + index += page_cache_index(mapping, offset); + offset = page_cache_offset(mapping, offset); prev_offset = offset; ra.prev_offset = offset; @@ -1023,16 +1023,16 @@ readpage: * another truncate extends the file - this is desired though). 
*/ isize = i_size_read(inode); - end_index = (isize - 1) >> PAGE_CACHE_SHIFT; + end_index = page_cache_index(mapping, isize - 1); if (unlikely(!isize || index > end_index)) { page_cache_release(page); goto out; } /* nr is the maximum number of bytes to copy from this page */ - nr = PAGE_CACHE_SIZE; + nr = page_cache_size(mapping); if (index == end_index) { - nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1; + nr = page_cache_offset(mapping, isize - 1) + 1; if (nr <= offset) { page_cache_release(page); goto out; @@ -1073,7 +1073,7 @@ out: *_ra = ra; _ra->prev_index = prev_index; - *ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset; + *ppos = page_cache_pos(mapping, index, offset); if (filp) file_accessed(filp); } @@ -1291,8 +1291,8 @@ asmlinkage ssize_t sys_readahead(int fd, if (file) { if (file->f_mode & FMODE_READ) { struct address_space *mapping = file->f_mapping; - unsigned long start = offset >> PAGE_CACHE_SHIFT; - unsigned long end = (offset + count - 1) >> PAGE_CACHE_SHIFT; + unsigned long start = page_cache_index(mapping, offset); + unsigned long end = page_cache_index(mapping, offset + count - 1); unsigned long len = end - start + 1; ret = do_readahead(mapping, file, start, len); } @@ -1364,7 +1364,7 @@ struct page *filemap_fault(struct vm_are BUG_ON(!(vma->vm_flags & VM_CAN_INVALIDATE)); - size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + size = page_cache_next(mapping, i_size_read(inode)); if (fdata->pgoff >= size) goto outside_data_content; @@ -1439,7 +1439,7 @@ retry_find: goto page_not_uptodate; /* Must recheck i_size under page lock */ - size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + size = page_cache_next(mapping, i_size_read(inode)); if (unlikely(fdata->pgoff >= size)) { unlock_page(page); goto outside_data_content; @@ -1930,8 +1930,8 @@ int pagecache_write_begin(struct file *f pagep, fsdata); } else { int ret; - pgoff_t index = pos >> PAGE_CACHE_SHIFT; - unsigned offset = pos & (PAGE_CACHE_SIZE - 1); + pgoff_t index = page_cache_index(mapping, pos); + unsigned offset = page_cache_offset(mapping, pos); struct inode *inode = mapping->host; struct page *page; again: @@ -1984,7 +1984,7 @@ int pagecache_write_end(struct file *fil ret = aops->write_end(file, mapping, pos, len, copied, page, fsdata); } else { - unsigned offset = pos & (PAGE_CACHE_SIZE - 1); + unsigned offset = page_cache_offset(mapping, pos); struct inode *inode = mapping->host; flush_dcache_page(page); @@ -2089,9 +2089,9 @@ static ssize_t generic_perform_write_2co unsigned long bytes; /* Bytes to write to page */ size_t copied; /* Bytes copied from user */ - offset = (pos & (PAGE_CACHE_SIZE - 1)); - index = pos >> PAGE_CACHE_SHIFT; - bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset, + offset = page_cache_offset(mapping, pos ); + index = page_cache_index(mapping, pos); + bytes = min_t(unsigned long, page_cache_size(mapping) - offset, iov_iter_count(i)); /* @@ -2267,9 +2267,9 @@ static ssize_t generic_perform_write(str size_t copied; /* Bytes copied from user */ void *fsdata; - offset = (pos & (PAGE_CACHE_SIZE - 1)); - index = pos >> PAGE_CACHE_SHIFT; - bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset, + offset = page_cache_offset(mapping, pos); + index = page_cache_index(mapping, pos); + bytes = min_t(unsigned long, page_cache_size(mapping) - offset, iov_iter_count(i)); again: @@ -2316,7 +2316,7 @@ again: * because not all segments in the iov can be copied at * once without a pagefault. 
*/ - bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset, + bytes = min_t(unsigned long, page_cache_size(mapping) - offset, iov_iter_single_seg_count(i)); goto again; } @@ -2459,8 +2459,8 @@ __generic_file_aio_write_nolock(struct k if (err == 0) { written = written_buffered; invalidate_mapping_pages(mapping, - pos >> PAGE_CACHE_SHIFT, - endbyte >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, pos), + page_cache_index(mapping, endbyte)); } else { /* * We don't know how much we wrote, so just return @@ -2547,7 +2547,7 @@ generic_file_direct_IO(int rw, struct ki */ if (rw == WRITE) { write_len = iov_length(iov, nr_segs); - end = (offset + write_len - 1) >> PAGE_CACHE_SHIFT; + end = page_cache_index(mapping, offset + write_len - 1); if (mapping_mapped(mapping)) unmap_mapping_range(mapping, offset, write_len, 0); } @@ -2564,7 +2564,7 @@ generic_file_direct_IO(int rw, struct ki */ if (rw == WRITE && mapping->nrpages) { retval = invalidate_inode_pages2_range(mapping, - offset >> PAGE_CACHE_SHIFT, end); + page_cache_index(mapping, offset), end); if (retval) goto out; } @@ -2582,7 +2582,7 @@ generic_file_direct_IO(int rw, struct ki */ if (rw == WRITE && mapping->nrpages) { int err = invalidate_inode_pages2_range(mapping, - offset >> PAGE_CACHE_SHIFT, end); + page_cache_index(mapping, offset), end); if (err && retval >= 0) retval = err; } -- From clameter@sgi.com Wed Jun 20 10:59:58 2007 Message-Id: <20070620175957.892501882@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:31 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [04/37] Use page_cache_xxx in mm/page-writeback.c Content-Disposition: inline; filename=vps_mm_page_writeback Signed-off-by: Christoph Lameter --- mm/page-writeback.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Index: vps/mm/page-writeback.c =================================================================== --- vps.orig/mm/page-writeback.c 2007-06-07 17:01:04.000000000 -0700 +++ vps/mm/page-writeback.c 2007-06-09 21:34:24.000000000 -0700 @@ -626,8 +626,8 @@ int write_cache_pages(struct address_spa index = mapping->writeback_index; /* Start from prev offset */ end = -1; } else { - index = wbc->range_start >> PAGE_CACHE_SHIFT; - end = wbc->range_end >> PAGE_CACHE_SHIFT; + index = page_cache_index(mapping, wbc->range_start); + end = page_cache_index(mapping, wbc->range_end); if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) range_whole = 1; scanned = 1; @@ -829,7 +829,7 @@ int __set_page_dirty_nobuffers(struct pa WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page)); if (mapping_cap_account_dirty(mapping)) { __inc_zone_page_state(page, NR_FILE_DIRTY); - task_io_account_write(PAGE_CACHE_SIZE); + task_io_account_write(page_cache_size(mapping)); } radix_tree_tag_set(&mapping->page_tree, page_index(page), PAGECACHE_TAG_DIRTY); -- From clameter@sgi.com Wed Jun 20 10:59:58 2007 Message-Id: <20070620175958.058961893@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:32 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [05/37] Use page_cache_xxx in mm/truncate.c Content-Disposition: inline; filename=vps_mm_truncate Signed-off-by: 
Christoph Lameter --- mm/truncate.c | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) Index: vps/mm/truncate.c =================================================================== --- vps.orig/mm/truncate.c 2007-06-09 20:35:19.000000000 -0700 +++ vps/mm/truncate.c 2007-06-09 21:39:47.000000000 -0700 @@ -45,9 +45,10 @@ void do_invalidatepage(struct page *page (*invalidatepage)(page, offset); } -static inline void truncate_partial_page(struct page *page, unsigned partial) +static inline void truncate_partial_page(struct address_space *mapping, + struct page *page, unsigned partial) { - zero_user_segment(page, partial, PAGE_CACHE_SIZE); + zero_user_segment(page, partial, page_cache_size(mapping)); if (PagePrivate(page)) do_invalidatepage(page, partial); } @@ -95,7 +96,7 @@ truncate_complete_page(struct address_sp if (page->mapping != mapping) return; - cancel_dirty_page(page, PAGE_CACHE_SIZE); + cancel_dirty_page(page, page_cache_size(mapping)); if (PagePrivate(page)) do_invalidatepage(page, 0); @@ -157,9 +158,9 @@ invalidate_complete_page(struct address_ void truncate_inode_pages_range(struct address_space *mapping, loff_t lstart, loff_t lend) { - const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT; + const pgoff_t start = page_cache_next(mapping, lstart); pgoff_t end; - const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1); + const unsigned partial = page_cache_offset(mapping, lstart); struct pagevec pvec; pgoff_t next; int i; @@ -167,8 +168,9 @@ void truncate_inode_pages_range(struct a if (mapping->nrpages == 0) return; - BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1)); - end = (lend >> PAGE_CACHE_SHIFT); + BUG_ON(page_cache_offset(mapping, lend) != + page_cache_size(mapping) - 1); + end = page_cache_index(mapping, lend); pagevec_init(&pvec, 0); next = start; @@ -194,8 +196,8 @@ void truncate_inode_pages_range(struct a } if (page_mapped(page)) { unmap_mapping_range(mapping, - (loff_t)page_index<index<index, 0), + page_cache_size(mapping), 0); } if (page->index > next) next = page->index; @@ -421,9 +423,8 @@ int invalidate_inode_pages2_range(struct * Zap the rest of the file in one hit. 
*/ unmap_mapping_range(mapping, - (loff_t)page_index< References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:33 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [06/37] Use page_cache_xxx in mm/rmap.c Content-Disposition: inline; filename=vps_mm_rmap Signed-off-by: Christoph Lameter --- mm/rmap.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/rmap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/rmap.c 2007-06-14 10:35:45.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/rmap.c 2007-06-14 10:49:29.000000000 -0700 @@ -210,9 +210,14 @@ static inline unsigned long vma_address(struct page *page, struct vm_area_struct *vma) { - pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + pgoff_t pgoff; unsigned long address; + if (PageAnon(page)) + pgoff = page->index; + else + pgoff = page->index << mapping_order(page->mapping); + address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT); if (unlikely(address < vma->vm_start || address >= vma->vm_end)) { /* page should be within any vma from prio_tree_next */ @@ -357,7 +362,7 @@ { unsigned int mapcount; struct address_space *mapping = page->mapping; - pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT); struct vm_area_struct *vma; struct prio_tree_iter iter; int referenced = 0; @@ -469,7 +474,7 @@ static int page_mkclean_file(struct address_space *mapping, struct page *page) { - pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT); struct vm_area_struct *vma; struct prio_tree_iter iter; int ret = 0; @@ -885,7 +890,7 @@ static int try_to_unmap_file(struct page *page, int migration) { struct address_space *mapping = page->mapping; - pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT); struct vm_area_struct *vma; struct prio_tree_iter iter; int ret = SWAP_AGAIN; -- From clameter@sgi.com Wed Jun 20 10:59:58 2007 Message-Id: <20070620175958.392767583@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:34 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [07/37] Use page_cache_xxx in mm/filemap_xip.c Content-Disposition: inline; filename=vps_mm_filemap_xip Signed-off-by: Christoph Lameter --- mm/filemap_xip.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) Index: vps/mm/filemap_xip.c =================================================================== --- vps.orig/mm/filemap_xip.c 2007-06-09 21:52:40.000000000 -0700 +++ vps/mm/filemap_xip.c 2007-06-09 21:58:11.000000000 -0700 @@ -60,24 +60,24 @@ do_xip_mapping_read(struct address_space BUG_ON(!mapping->a_ops->get_xip_page); - index = *ppos >> PAGE_CACHE_SHIFT; - offset = *ppos & ~PAGE_CACHE_MASK; + index = page_cache_index(mapping, *ppos); + offset = page_cache_offset(mapping, *ppos); isize = i_size_read(inode); if (!isize) goto out; - end_index = (isize - 1) >> PAGE_CACHE_SHIFT; + end_index 
= page_cache_index(mapping, isize - 1); for (;;) { struct page *page; unsigned long nr, ret; /* nr is the maximum number of bytes to copy from this page */ - nr = PAGE_CACHE_SIZE; + nr = page_cache_size(mapping); if (index >= end_index) { if (index > end_index) goto out; - nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1; + nr = page_cache_next(mapping, size - 1) + 1; if (nr <= offset) { goto out; } @@ -116,8 +116,8 @@ do_xip_mapping_read(struct address_space */ ret = actor(desc, page, offset, nr); offset += ret; - index += offset >> PAGE_CACHE_SHIFT; - offset &= ~PAGE_CACHE_MASK; + index += page_cache_index(mapping, offset); + offset = page_cache_offset(mapping, offset); if (ret == nr && desc->count) continue; @@ -130,7 +130,7 @@ no_xip_page: } out: - *ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset; + *ppos = page_cache_pos(mapping, index, offset); if (filp) file_accessed(filp); } @@ -242,7 +242,7 @@ static struct page *xip_file_fault(struc /* XXX: are VM_FAULT_ codes OK? */ - size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + size = page_cache_next(mapping, i_size_read(inode)); if (fdata->pgoff >= size) { fdata->type = VM_FAULT_SIGBUS; return NULL; @@ -320,9 +320,9 @@ __xip_file_write(struct file *filp, cons size_t copied; char *kaddr; - offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */ - index = pos >> PAGE_CACHE_SHIFT; - bytes = PAGE_CACHE_SIZE - offset; + offset = page_cache_offset(mapping, pos); /* Within page */ + index = page_cache_index(mapping, pos); + bytes = page_cache_size(mapping) - offset; if (bytes > count) bytes = count; @@ -433,8 +433,8 @@ EXPORT_SYMBOL_GPL(xip_file_write); int xip_truncate_page(struct address_space *mapping, loff_t from) { - pgoff_t index = from >> PAGE_CACHE_SHIFT; - unsigned offset = from & (PAGE_CACHE_SIZE-1); + pgoff_t index = page_cache_index(mapping, from); + unsigned offset = page_cache_offset(mapping, from); unsigned blocksize; unsigned length; struct page *page; -- From clameter@sgi.com Wed Jun 20 10:59:58 2007 Message-Id: <20070620175958.571046469@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:35 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [08/37] Use page_cache_xxx in mm/migrate.c Content-Disposition: inline; filename=vps_mm_migrate Signed-off-by: Christoph Lameter --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: vps/mm/migrate.c =================================================================== --- vps.orig/mm/migrate.c 2007-06-11 15:56:37.000000000 -0700 +++ vps/mm/migrate.c 2007-06-11 22:05:16.000000000 -0700 @@ -196,7 +196,7 @@ static void remove_file_migration_ptes(s struct vm_area_struct *vma; struct address_space *mapping = page_mapping(new); struct prio_tree_iter iter; - pgoff_t pgoff = new->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + pgoff_t pgoff = new->index << mapping_order(mapping); if (!mapping) return; -- From clameter@sgi.com Wed Jun 20 10:59:58 2007 Message-Id: <20070620175958.749029245@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:36 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [09/37] Use page_cache_xxx in fs/libfs.c 
Content-Disposition: inline; filename=vps_fs_libfs Signed-off-by: Christoph Lameter --- fs/libfs.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) Index: vps/fs/libfs.c =================================================================== --- vps.orig/fs/libfs.c 2007-06-11 21:39:09.000000000 -0700 +++ vps/fs/libfs.c 2007-06-11 22:08:13.000000000 -0700 @@ -16,7 +16,8 @@ int simple_getattr(struct vfsmount *mnt, { struct inode *inode = dentry->d_inode; generic_fillattr(inode, stat); - stat->blocks = inode->i_mapping->nrpages << (PAGE_CACHE_SHIFT - 9); + stat->blocks = inode->i_mapping->nrpages << + (page_cache_shift(inode->i_mapping) - 9); return 0; } @@ -340,10 +341,10 @@ int simple_prepare_write(struct file *fi unsigned from, unsigned to) { if (!PageUptodate(page)) { - if (to - from != PAGE_CACHE_SIZE) + if (to - from != page_cache_size(file->f_mapping)) zero_user_segments(page, 0, from, - to, PAGE_CACHE_SIZE); + to, page_cache_size(file->f_mapping)); } return 0; } @@ -356,8 +357,8 @@ int simple_write_begin(struct file *file pgoff_t index; unsigned from; - index = pos >> PAGE_CACHE_SHIFT; - from = pos & (PAGE_CACHE_SIZE - 1); + index = page_cache_index(mapping, pos); + from = page_cache_offset(mapping, pos); page = __grab_cache_page(mapping, index); if (!page) @@ -371,8 +372,9 @@ int simple_write_begin(struct file *file int simple_commit_write(struct file *file, struct page *page, unsigned from, unsigned to) { - struct inode *inode = page->mapping->host; - loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; + loff_t pos = page_cache_pos(mapping, page->index, to); if (!PageUptodate(page)) SetPageUptodate(page); @@ -390,7 +392,7 @@ int simple_write_end(struct file *file, loff_t pos, unsigned len, unsigned copied, struct page *page, void *fsdata) { - unsigned from = pos & (PAGE_CACHE_SIZE - 1); + unsigned from = page_cache_offset(mapping, pos); /* zero the stale part of the page if we did a short copy */ if (copied < len) -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175958.911619816@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:37 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [10/37] Use page_cache_xxx in fs/sync. 
Content-Disposition: inline; filename=vps_fs_sync Signed-off-by: Christoph Lameter --- fs/sync.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: vps/fs/sync.c =================================================================== --- vps.orig/fs/sync.c 2007-06-04 17:57:25.000000000 -0700 +++ vps/fs/sync.c 2007-06-09 21:17:45.000000000 -0700 @@ -252,8 +252,8 @@ int do_sync_mapping_range(struct address ret = 0; if (flags & SYNC_FILE_RANGE_WAIT_BEFORE) { ret = wait_on_page_writeback_range(mapping, - offset >> PAGE_CACHE_SHIFT, - endbyte >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, offset), + page_cache_index(mapping, endbyte)); if (ret < 0) goto out; } @@ -267,8 +267,8 @@ int do_sync_mapping_range(struct address if (flags & SYNC_FILE_RANGE_WAIT_AFTER) { ret = wait_on_page_writeback_range(mapping, - offset >> PAGE_CACHE_SHIFT, - endbyte >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, offset), + page_cache_index(mapping, endbyte)); } out: return ret; -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175959.086959094@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:38 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [11/37] Use page_cache_xxx in fs/buffer.c Content-Disposition: inline; filename=vps_fs_buffer Signed-off-by: Christoph Lameter --- fs/buffer.c | 99 +++++++++++++++++++++++++++++++++--------------------------- 1 file changed, 56 insertions(+), 43 deletions(-) Index: vps/fs/buffer.c =================================================================== --- vps.orig/fs/buffer.c 2007-06-11 22:33:07.000000000 -0700 +++ vps/fs/buffer.c 2007-06-11 22:34:34.000000000 -0700 @@ -265,7 +265,7 @@ __find_get_block_slow(struct block_devic struct page *page; int all_mapped = 1; - index = block >> (PAGE_CACHE_SHIFT - bd_inode->i_blkbits); + index = block >> (page_cache_shift(bd_mapping) - bd_inode->i_blkbits); page = find_get_page(bd_mapping, index); if (!page) goto out; @@ -705,7 +705,7 @@ static int __set_page_dirty(struct page if (mapping_cap_account_dirty(mapping)) { __inc_zone_page_state(page, NR_FILE_DIRTY); - task_io_account_write(PAGE_CACHE_SIZE); + task_io_account_write(page_cache_size(mapping)); } radix_tree_tag_set(&mapping->page_tree, page_index(page), PAGECACHE_TAG_DIRTY); @@ -899,10 +899,11 @@ struct buffer_head *alloc_page_buffers(s { struct buffer_head *bh, *head; long offset; + unsigned int page_size = page_cache_size(page->mapping); try_again: head = NULL; - offset = PAGE_SIZE; + offset = page_size; while ((offset -= size) >= 0) { bh = alloc_buffer_head(GFP_NOFS); if (!bh) @@ -1434,7 +1435,7 @@ void set_bh_page(struct buffer_head *bh, struct page *page, unsigned long offset) { bh->b_page = page; - BUG_ON(offset >= PAGE_SIZE); + BUG_ON(offset >= page_cache_size(page->mapping)); if (PageHighMem(page)) /* * This catches illegal uses and preserves the offset: @@ -1613,6 +1614,7 @@ static int __block_write_full_page(struc struct buffer_head *bh, *head; const unsigned blocksize = 1 << inode->i_blkbits; int nr_underway = 0; + struct address_space *mapping = inode->i_mapping; BUG_ON(!PageLocked(page)); @@ -1633,7 +1635,8 @@ static int __block_write_full_page(struc * handle that here by just cleaning them. 
*/ - block = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits); + block = (sector_t)page->index << + (page_cache_shift(mapping) - inode->i_blkbits); head = page_buffers(page); bh = head; @@ -1750,7 +1753,7 @@ recover: } while ((bh = bh->b_this_page) != head); SetPageError(page); BUG_ON(PageWriteback(page)); - mapping_set_error(page->mapping, err); + mapping_set_error(mapping, err); set_page_writeback(page); do { struct buffer_head *next = bh->b_this_page; @@ -1817,8 +1820,8 @@ static int __block_prepare_write(struct struct buffer_head *bh, *head, *wait[2], **wait_bh=wait; BUG_ON(!PageLocked(page)); - BUG_ON(from > PAGE_CACHE_SIZE); - BUG_ON(to > PAGE_CACHE_SIZE); + BUG_ON(from > page_cache_size(inode->i_mapping)); + BUG_ON(to > page_cache_size(inode->i_mapping)); BUG_ON(from > to); blocksize = 1 << inode->i_blkbits; @@ -1827,7 +1830,8 @@ static int __block_prepare_write(struct head = page_buffers(page); bbits = inode->i_blkbits; - block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits); + block = (sector_t)page->index << + (page_cache_shift(inode->i_mapping) - bbits); for(bh = head, block_start = 0; bh != head || !block_start; block++, block_start=block_end, bh = bh->b_this_page) { @@ -1942,8 +1946,8 @@ int block_write_begin(struct file *file, unsigned start, end; int ownpage = 0; - index = pos >> PAGE_CACHE_SHIFT; - start = pos & (PAGE_CACHE_SIZE - 1); + index = page_cache_index(mapping, pos); + start = page_cache_offset(mapping, pos); end = start + len; page = *pagep; @@ -1989,7 +1993,7 @@ int block_write_end(struct file *file, s struct inode *inode = mapping->host; unsigned start; - start = pos & (PAGE_CACHE_SIZE - 1); + start = page_cache_offset(mapping, pos); if (unlikely(copied < len)) { /* @@ -2065,7 +2069,8 @@ int block_read_full_page(struct page *pa create_empty_buffers(page, blocksize, 0); head = page_buffers(page); - iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits); + iblock = (sector_t)page->index << + (page_cache_shift(page->mapping) - inode->i_blkbits); lblock = (i_size_read(inode)+blocksize-1) >> inode->i_blkbits; bh = head; nr = 0; @@ -2183,16 +2188,17 @@ int cont_expand_zero(struct file *file, unsigned zerofrom, offset, len; int err = 0; - index = pos >> PAGE_CACHE_SHIFT; - offset = pos & ~PAGE_CACHE_MASK; + index = page_cache_index(mapping, pos); + offset = page_cache_offset(mapping, pos); - while (index > (curidx = (curpos = *bytes)>>PAGE_CACHE_SHIFT)) { - zerofrom = curpos & ~PAGE_CACHE_MASK; + while (index > (curidx = page_cache_index(mapping, + (curpos = *bytes)))) { + zerofrom = page_cache_offset(mapping, curpos); if (zerofrom & (blocksize-1)) { *bytes |= (blocksize-1); (*bytes)++; } - len = PAGE_CACHE_SIZE - zerofrom; + len = page_cache_size(mapping) - zerofrom; err = pagecache_write_begin(file, mapping, curpos, len, AOP_FLAG_UNINTERRUPTIBLE, @@ -2210,7 +2216,7 @@ int cont_expand_zero(struct file *file, /* page covers the boundary, find the boundary offset */ if (index == curidx) { - zerofrom = curpos & ~PAGE_CACHE_MASK; + zerofrom = page_cache_offset(mapping, curpos); /* if we will expand the thing last block will be filled */ if (offset <= zerofrom) { goto out; @@ -2256,7 +2262,7 @@ int cont_write_begin(struct file *file, if (err) goto out; - zerofrom = *bytes & ~PAGE_CACHE_MASK; + zerofrom = page_cache_offset(mapping, *bytes); if (pos+len > *bytes && zerofrom & (blocksize-1)) { *bytes |= (blocksize-1); (*bytes)++; @@ -2289,8 +2295,9 @@ int block_commit_write(struct page *page int generic_commit_write(struct file *file, 
struct page *page, unsigned from, unsigned to) { - struct inode *inode = page->mapping->host; - loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; + loff_t pos = page_cache_pos(mapping, page->index, to); __block_commit_write(inode,page,from,to); /* * No need to use i_size_read() here, the i_size @@ -2332,6 +2339,7 @@ static void end_buffer_read_nobh(struct int nobh_prepare_write(struct page *page, unsigned from, unsigned to, get_block_t *get_block) { + struct address_space *mapping = page->mapping; struct inode *inode = page->mapping->host; const unsigned blkbits = inode->i_blkbits; const unsigned blocksize = 1 << blkbits; @@ -2339,6 +2347,7 @@ int nobh_prepare_write(struct page struct buffer_head *read_bh[MAX_BUF_PER_PAGE]; unsigned block_in_page; unsigned block_start; + unsigned page_size = page_cache_size(mapping); sector_t block_in_file; int nr_reads = 0; int i; @@ -2348,7 +2357,8 @@ int nobh_prepare_write(struct page if (PageMappedToDisk(page)) return 0; - block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits); + block_in_file = (sector_t)page->index << + (page_cache_shift(mapping) - blkbits); map_bh.b_page = page; /* @@ -2357,7 +2367,7 @@ int nobh_prepare_write(struct page * page is fully mapped-to-disk. */ for (block_start = 0, block_in_page = 0; - block_start < PAGE_CACHE_SIZE; + block_start < page_size; block_in_page++, block_start += blocksize) { unsigned block_end = block_start + blocksize; int create; @@ -2446,7 +2456,7 @@ failed: * Error recovery is pretty slack. Clear the page and mark it dirty * so we'll later zero out any blocks which _were_ allocated. */ - zero_user(page, 0, PAGE_CACHE_SIZE); + zero_user(page, 0, page_size); SetPageUptodate(page); set_page_dirty(page); return ret; @@ -2460,8 +2470,9 @@ EXPORT_SYMBOL(nobh_prepare_write); int nobh_commit_write(struct file *file, struct page *page, unsigned from, unsigned to) { - struct inode *inode = page->mapping->host; - loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; + loff_t pos = page_cache_pos(mapping, page->index, to); SetPageUptodate(page); set_page_dirty(page); @@ -2481,9 +2492,10 @@ EXPORT_SYMBOL(nobh_commit_write); int nobh_writepage(struct page *page, get_block_t *get_block, struct writeback_control *wbc) { - struct inode * const inode = page->mapping->host; + struct address_space *mapping = page->mapping; + struct inode * const inode = mapping->host; loff_t i_size = i_size_read(inode); - const pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT; + const pgoff_t end_index = page_cache_index(mapping, i_size); unsigned offset; int ret; @@ -2492,7 +2504,7 @@ int nobh_writepage(struct page *page, ge goto out; /* Is the page fully outside i_size? (truncate in progress) */ - offset = i_size & (PAGE_CACHE_SIZE-1); + offset = page_cache_offset(mapping, i_size); if (page->index >= end_index+1 || !offset) { /* * The page may have dirty, unmapped buffers. For example, @@ -2515,7 +2527,7 @@ int nobh_writepage(struct page *page, ge * the page size, the remaining memory is zeroed when mapped, and * writes to that region are not written out to the file."
*/ - zero_user_segment(page, offset, PAGE_CACHE_SIZE); + zero_user_segment(page, offset, page_cache_size(mapping)); out: ret = mpage_writepage(page, get_block, wbc); if (ret == -EAGAIN) @@ -2531,8 +2543,8 @@ int nobh_truncate_page(struct address_sp { struct inode *inode = mapping->host; unsigned blocksize = 1 << inode->i_blkbits; - pgoff_t index = from >> PAGE_CACHE_SHIFT; - unsigned offset = from & (PAGE_CACHE_SIZE-1); + pgoff_t index = page_cache_index(mapping, from); + unsigned offset = page_cache_offset(mapping, from); unsigned to; struct page *page; const struct address_space_operations *a_ops = mapping->a_ops; @@ -2549,7 +2561,7 @@ int nobh_truncate_page(struct address_sp to = (offset + blocksize) & ~(blocksize - 1); ret = a_ops->prepare_write(NULL, page, offset, to); if (ret == 0) { - zero_user_segment(page, offset, PAGE_CACHE_SIZE); + zero_user_segment(page, offset, page_cache_size(mapping)); /* * It would be more correct to call aops->commit_write() * here, but this is more efficient. @@ -2567,8 +2579,8 @@ EXPORT_SYMBOL(nobh_truncate_page); int block_truncate_page(struct address_space *mapping, loff_t from, get_block_t *get_block) { - pgoff_t index = from >> PAGE_CACHE_SHIFT; - unsigned offset = from & (PAGE_CACHE_SIZE-1); + pgoff_t index = page_cache_index(mapping, from); + unsigned offset = page_cache_offset(mapping, from); unsigned blocksize; sector_t iblock; unsigned length, pos; @@ -2585,8 +2597,8 @@ int block_truncate_page(struct address_s return 0; length = blocksize - length; - iblock = (sector_t)index << (PAGE_CACHE_SHIFT - inode->i_blkbits); - + iblock = (sector_t)index << + (page_cache_shift(mapping) - inode->i_blkbits); page = grab_cache_page(mapping, index); err = -ENOMEM; if (!page) @@ -2645,9 +2657,10 @@ out: int block_write_full_page(struct page *page, get_block_t *get_block, struct writeback_control *wbc) { - struct inode * const inode = page->mapping->host; + struct address_space *mapping = page->mapping; + struct inode * const inode = mapping->host; loff_t i_size = i_size_read(inode); - const pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT; + const pgoff_t end_index = page_cache_index(mapping, i_size); unsigned offset; /* Is the page fully inside i_size? */ @@ -2655,7 +2668,7 @@ int block_write_full_page(struct page *p return __block_write_full_page(inode, page, get_block, wbc); /* Is the page fully outside i_size? (truncate in progress) */ - offset = i_size & (PAGE_CACHE_SIZE-1); + offset = page_cache_offset(mapping, i_size); if (page->index >= end_index+1 || !offset) { /* * The page may have dirty, unmapped buffers. For example, @@ -2674,7 +2687,7 @@ int block_write_full_page(struct page *p * the page size, the remaining memory is zeroed when mapped, and * writes to that region are not written out to the file." */ - zero_user_segment(page, offset, PAGE_CACHE_SIZE); + zero_user_segment(page, offset, page_cache_size(mapping)); return __block_write_full_page(inode, page, get_block, wbc); } @@ -2928,7 +2941,7 @@ int try_to_free_buffers(struct page *pag * dirty bit from being lost. 
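A note on the pattern, which repeats in every hunk of this patch and the ones that follow: the helpers are drop-in replacements for the old open-coded arithmetic. A minimal sketch of the intended equivalences (for an order-0 mapping each helper reduces to the constant form it replaces):

	pgoff_t index = page_cache_index(mapping, pos);    /* was pos >> PAGE_CACHE_SHIFT */
	unsigned offset = page_cache_offset(mapping, pos); /* was pos & (PAGE_CACHE_SIZE - 1) */
	loff_t back = page_cache_pos(mapping, index, offset);
	                   /* was ((loff_t)index << PAGE_CACHE_SHIFT) + offset */
	BUG_ON(back != pos);       /* page_cache_pos() inverts index/offset */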
*/ if (ret) - cancel_dirty_page(page, PAGE_CACHE_SIZE); + cancel_dirty_page(page, page_cache_size(mapping)); spin_unlock(&mapping->private_lock); out: if (buffers_to_free) { -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175959.337726295@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:39 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [12/37] Use page_cache_xxx in mm/mpage.c Content-Disposition: inline; filename=vps_fs_mpage Signed-off-by: Christoph Lameter --- fs/mpage.c | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) Index: vps/fs/mpage.c =================================================================== --- vps.orig/fs/mpage.c 2007-06-11 22:33:07.000000000 -0700 +++ vps/fs/mpage.c 2007-06-11 22:37:24.000000000 -0700 @@ -133,7 +133,8 @@ mpage_alloc(struct block_device *bdev, static void map_buffer_to_page(struct page *page, struct buffer_head *bh, int page_block) { - struct inode *inode = page->mapping->host; + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; struct buffer_head *page_bh, *head; int block = 0; @@ -142,9 +143,9 @@ map_buffer_to_page(struct page *page, st * don't make any buffers if there is only one buffer on * the page and the page just needs to be set up to date */ - if (inode->i_blkbits == PAGE_CACHE_SHIFT && + if (inode->i_blkbits == page_cache_shift(mapping) && buffer_uptodate(bh)) { - SetPageUptodate(page); + SetPageUptodate(page); return; } create_empty_buffers(page, 1 << inode->i_blkbits, 0); @@ -177,9 +178,10 @@ do_mpage_readpage(struct bio *bio, struc sector_t *last_block_in_bio, struct buffer_head *map_bh, unsigned long *first_logical_block, get_block_t get_block) { - struct inode *inode = page->mapping->host; + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; const unsigned blkbits = inode->i_blkbits; - const unsigned blocks_per_page = PAGE_CACHE_SIZE >> blkbits; + const unsigned blocks_per_page = page_cache_size(mapping) >> blkbits; const unsigned blocksize = 1 << blkbits; sector_t block_in_file; sector_t last_block; @@ -196,7 +198,7 @@ do_mpage_readpage(struct bio *bio, struc if (page_has_buffers(page)) goto confused; - block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits); + block_in_file = (sector_t)page->index << (page_cache_shift(mapping) - blkbits); last_block = block_in_file + nr_pages * blocks_per_page; last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits; if (last_block > last_block_in_file) @@ -284,7 +286,8 @@ do_mpage_readpage(struct bio *bio, struc } if (first_hole != blocks_per_page) { - zero_user_segment(page, first_hole << blkbits, PAGE_CACHE_SIZE); + zero_user_segment(page, first_hole << blkbits, + page_cache_size(mapping)); if (first_hole == 0) { SetPageUptodate(page); unlock_page(page); @@ -462,7 +465,7 @@ static int __mpage_writepage(struct page struct inode *inode = page->mapping->host; const unsigned blkbits = inode->i_blkbits; unsigned long end_index; - const unsigned blocks_per_page = PAGE_CACHE_SIZE >> blkbits; + const unsigned blocks_per_page = page_cache_size(mapping) >> blkbits; sector_t last_block; sector_t block_in_file; sector_t blocks[MAX_BUF_PER_PAGE]; @@ -531,7 +534,8 @@ static int __mpage_writepage(struct page * The page has no buffers: map it 
to disk */ BUG_ON(!PageUptodate(page)); - block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits); + block_in_file = (sector_t)page->index << + (page_cache_shift(mapping) - blkbits); last_block = (i_size - 1) >> blkbits; map_bh.b_page = page; for (page_block = 0; page_block < blocks_per_page; ) { @@ -563,7 +567,7 @@ static int __mpage_writepage(struct page first_unmapped = page_block; page_is_mapped: - end_index = i_size >> PAGE_CACHE_SHIFT; + end_index = page_cache_index(mapping, i_size); if (page->index >= end_index) { /* * The page straddles i_size. It must be zeroed out on each @@ -573,11 +577,11 @@ page_is_mapped: * is zeroed when mapped, and writes to that region are not * written out to the file." */ - unsigned offset = i_size & (PAGE_CACHE_SIZE - 1); + unsigned offset = page_cache_offset(mapping, i_size); if (page->index > end_index || !offset) goto confused; - zero_user_segment(page, offset, PAGE_CACHE_SIZE); + zero_user_segment(page, offset, page_cache_size(mapping)); } /* -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175959.412855214@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:40 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [13/37] Use page_cache_xxx in mm/fadvise.c Content-Disposition: inline; filename=vps_fs_fadvise Signed-off-by: Christoph Lameter --- mm/fadvise.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: vps/mm/fadvise.c =================================================================== --- vps.orig/mm/fadvise.c 2007-06-04 17:57:25.000000000 -0700 +++ vps/mm/fadvise.c 2007-06-09 21:32:46.000000000 -0700 @@ -79,8 +79,8 @@ asmlinkage long sys_fadvise64_64(int fd, } /* First and last PARTIAL page! */ - start_index = offset >> PAGE_CACHE_SHIFT; - end_index = endbyte >> PAGE_CACHE_SHIFT; + start_index = page_cache_index(mapping, offset); + end_index = page_cache_index(mapping, endbyte); /* Careful about overflow on the "+1" */ nrpages = end_index - start_index + 1; @@ -100,8 +100,8 @@ asmlinkage long sys_fadvise64_64(int fd, filemap_flush(mapping); /* First and last FULL page! 
*/ - start_index = (offset+(PAGE_CACHE_SIZE-1)) >> PAGE_CACHE_SHIFT; - end_index = (endbyte >> PAGE_CACHE_SHIFT); + start_index = page_cache_next(mapping, offset); + end_index = page_cache_index(mapping, endbyte); if (end_index >= start_index) invalidate_mapping_pages(mapping, start_index, -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175959.591009368@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:41 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [14/37] Use page_cache_xxx in fs/splice.c Content-Disposition: inline; filename=vps_fs_splice Signed-off-by: Christoph Lameter --- fs/splice.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) Index: vps/fs/splice.c =================================================================== --- vps.orig/fs/splice.c 2007-06-09 22:18:02.000000000 -0700 +++ vps/fs/splice.c 2007-06-09 22:22:08.000000000 -0700 @@ -282,9 +282,9 @@ __generic_file_splice_read(struct file * .ops = &page_cache_pipe_buf_ops, }; - index = *ppos >> PAGE_CACHE_SHIFT; - loff = *ppos & ~PAGE_CACHE_MASK; - nr_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + index = page_cache_index(mapping, *ppos); + loff = page_cache_offset(mapping, *ppos); + nr_pages = page_cache_next(mapping, len + loff); if (nr_pages > PIPE_BUFFERS) nr_pages = PIPE_BUFFERS; @@ -345,7 +345,7 @@ __generic_file_splice_read(struct file * * Now loop over the map and see if we need to start IO on any * pages, fill in the partial map, etc. */ - index = *ppos >> PAGE_CACHE_SHIFT; + index = page_cache_index(mapping, *ppos); nr_pages = spd.nr_pages; spd.nr_pages = 0; for (page_nr = 0; page_nr < nr_pages; page_nr++) { @@ -357,7 +357,8 @@ __generic_file_splice_read(struct file * /* * this_len is the max we'll use from this page */ - this_len = min_t(unsigned long, len, PAGE_CACHE_SIZE - loff); + this_len = min_t(unsigned long, len, + page_cache_size(mapping) - loff); page = pages[page_nr]; if (PageReadahead(page)) @@ -416,7 +417,7 @@ __generic_file_splice_read(struct file * * i_size must be checked after ->readpage(). 
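Note the rounding difference between the two helpers used in the fadvise and splice conversions above: page_cache_index() truncates to the page containing a byte, while page_cache_next() rounds up to the first page boundary at or after it. A short sketch, with a 4k order-0 mapping assumed for the example values:

	loff_t pos = 5000;
	pgoff_t first = page_cache_index(mapping, pos); /* 1: page containing byte 5000 */
	pgoff_t next = page_cache_next(mapping, pos);   /* 2: was (pos + PAGE_CACHE_SIZE - 1)
	                                                       >> PAGE_CACHE_SHIFT */

This is why the partial-page pass uses page_cache_index() for both ends, while the full-page invalidation starts at page_cache_next().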
*/ isize = i_size_read(mapping->host); - end_index = (isize - 1) >> PAGE_CACHE_SHIFT; + end_index = page_cache_index(mapping, isize - 1); if (unlikely(!isize || index > end_index)) break; @@ -425,7 +426,8 @@ __generic_file_splice_read(struct file * * the length and stop */ if (end_index == index) { - loff = PAGE_CACHE_SIZE - (isize & ~PAGE_CACHE_MASK); + loff = page_cache_size(mapping) + - page_cache_offset(mapping, isize); if (total_len + loff > isize) break; /* @@ -557,6 +559,7 @@ static int pipe_to_file(struct pipe_inod struct page *page; void *fsdata; int ret; + int pagesize = page_cache_size(mapping); /* * make sure the data in this buffer is uptodate @@ -565,11 +568,11 @@ static int pipe_to_file(struct pipe_inod if (unlikely(ret)) return ret; - offset = sd->pos & ~PAGE_CACHE_MASK; + offset = page_cache_offset(mapping, sd->pos); this_len = sd->len; - if (this_len + offset > PAGE_CACHE_SIZE) - this_len = PAGE_CACHE_SIZE - offset; + if (this_len + offset > pagesize) + this_len = pagesize - offset; ret = pagecache_write_begin(file, mapping, sd->pos, sd->len, AOP_FLAG_UNINTERRUPTIBLE, &page, &fsdata); -- From clameter@sgi.com Wed Jun 20 10:59:59 2007 Message-Id: <20070620175959.779069814@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:42 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [15/37] Use page_cache_xxx functions in fs/ext2 Content-Disposition: inline; filename=vps_fs_ext2 Signed-off-by: Christoph Lameter --- fs/ext2/dir.c | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/ext2/dir.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext2/dir.c 2007-06-15 17:35:32.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext2/dir.c 2007-06-18 18:57:34.000000000 -0700 @@ -45,7 +45,8 @@ static inline void ext2_put_page(struct static inline unsigned long dir_pages(struct inode *inode) { - return (inode->i_size+PAGE_CACHE_SIZE-1)>>PAGE_CACHE_SHIFT; + return (inode->i_size+page_cache_size(inode->i_mapping)-1)>> + page_cache_shift(inode->i_mapping); } /* @@ -56,10 +57,11 @@ static unsigned ext2_last_byte(struct inode *inode, unsigned long page_nr) { unsigned last_byte = inode->i_size; + struct address_space *mapping = inode->i_mapping; - last_byte -= page_nr << PAGE_CACHE_SHIFT; - if (last_byte > PAGE_CACHE_SIZE) - last_byte = PAGE_CACHE_SIZE; + last_byte -= page_nr << page_cache_shift(mapping); + if (last_byte > page_cache_size(mapping)) + last_byte = page_cache_size(mapping); return last_byte; } @@ -88,18 +90,19 @@ static int ext2_commit_chunk(struct page static void ext2_check_page(struct page *page) { - struct inode *dir = page->mapping->host; + struct address_space *mapping = page->mapping; + struct inode *dir = mapping->host; struct super_block *sb = dir->i_sb; unsigned chunk_size = ext2_chunk_size(dir); char *kaddr = page_address(page); u32 max_inumber = le32_to_cpu(EXT2_SB(sb)->s_es->s_inodes_count); unsigned offs, rec_len; - unsigned limit = PAGE_CACHE_SIZE; + unsigned limit = page_cache_size(mapping); ext2_dirent *p; char *error; - if ((dir->i_size >> PAGE_CACHE_SHIFT) == page->index) { - limit = dir->i_size & ~PAGE_CACHE_MASK; + if (page_cache_index(mapping, dir->i_size) == page->index) { + limit = page_cache_offset(mapping, dir->i_size); if 
(limit & (chunk_size - 1)) goto Ebadsize; if (!limit) @@ -151,7 +154,7 @@ Einumber: bad_entry: ext2_error (sb, "ext2_check_page", "bad entry in directory #%lu: %s - " "offset=%lu, inode=%lu, rec_len=%d, name_len=%d", - dir->i_ino, error, (page->index<<PAGE_CACHE_SHIFT)+offs, + dir->i_ino, error, page_cache_pos(mapping, page->index, offs), (unsigned long) le32_to_cpu(p->inode), rec_len, p->name_len); goto fail; @@ -160,7 +163,7 @@ Eend: ext2_error (sb, "ext2_check_page", "entry in directory #%lu spans the page boundary" "offset=%lu, inode=%lu", - dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs, + dir->i_ino, page_cache_pos(mapping, page->index, offs), (unsigned long) le32_to_cpu(p->inode)); fail: SetPageChecked(page); @@ -258,8 +261,9 @@ ext2_readdir (struct file * filp, void * loff_t pos = filp->f_pos; struct inode *inode = filp->f_path.dentry->d_inode; struct super_block *sb = inode->i_sb; - unsigned int offset = pos & ~PAGE_CACHE_MASK; - unsigned long n = pos >> PAGE_CACHE_SHIFT; + struct address_space *mapping = inode->i_mapping; + unsigned int offset = page_cache_offset(mapping, pos); + unsigned long n = page_cache_index(mapping, pos); unsigned long npages = dir_pages(inode); unsigned chunk_mask = ~(ext2_chunk_size(inode)-1); unsigned char *types = NULL; @@ -280,14 +284,14 @@ ext2_readdir (struct file * filp, void * ext2_error(sb, __FUNCTION__, "bad page in #%lu", inode->i_ino); - filp->f_pos += PAGE_CACHE_SIZE - offset; + filp->f_pos += page_cache_size(mapping) - offset; return -EIO; } kaddr = page_address(page); if (unlikely(need_revalidate)) { if (offset) { offset = ext2_validate_entry(kaddr, offset, chunk_mask); - filp->f_pos = (n<<PAGE_CACHE_SHIFT) + offset; + filp->f_pos = page_cache_pos(mapping, n, offset); } filp->f_version = inode->i_version; need_revalidate = 0; @@ -310,7 +314,7 @@ ext2_readdir (struct file * filp, void * offset = (char *)de - kaddr; over = filldir(dirent, de->name, de->name_len, - (n<<PAGE_CACHE_SHIFT) | offset, + page_cache_pos(mapping, n, offset), le32_to_cpu(de->inode), d_type); if (over) { ext2_put_page(page); @@ -336,6 +340,7 @@ struct ext2_dir_entry_2 * ext2_find_entr struct dentry *dentry, struct page ** res_page) { const char *name = dentry->d_name.name; + struct address_space *mapping = dir->i_mapping; int namelen = dentry->d_name.len; unsigned reclen = EXT2_DIR_REC_LEN(namelen); unsigned long start, n; @@ -377,7 +382,7 @@ struct ext2_dir_entry_2 * ext2_find_entr if (++n >= npages) n = 0; /* next page is past the blocks we've got */ - if (unlikely(n > (dir->i_blocks >> (PAGE_CACHE_SHIFT - 9)))) { + if (unlikely(n > (dir->i_blocks >> (page_cache_shift(mapping) - 9)))) { ext2_error(dir->i_sb, __FUNCTION__, "dir %lu size %lld exceeds block count %llu", dir->i_ino, dir->i_size, @@ -448,6 +453,7 @@ void ext2_set_link(struct inode *dir, st int ext2_add_link (struct dentry *dentry, struct inode *inode) { struct inode *dir = dentry->d_parent->d_inode; + struct address_space *mapping = dir->i_mapping; const char *name = dentry->d_name.name; int namelen = dentry->d_name.len; unsigned chunk_size = ext2_chunk_size(dir); @@ -477,7 +483,7 @@ int ext2_add_link (struct dentry *dentry kaddr = page_address(page); dir_end = kaddr + ext2_last_byte(dir, n); de = (ext2_dirent *)kaddr; - kaddr += PAGE_CACHE_SIZE - reclen; + kaddr += page_cache_size(mapping) - reclen; while ((char *)de <= kaddr) { if ((char *)de == dir_end) { /* We hit i_size */ -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620175959.935997683@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:43 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel
Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [16/37] Use page_cache_xxx in fs/ext3 Content-Disposition: inline; filename=vps_fs_ext3 Signed-off-by: Christoph Lameter --- fs/ext3/dir.c | 3 ++- fs/ext3/inode.c | 36 ++++++++++++++++++------------------ 2 files changed, 20 insertions(+), 19 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/ext3/dir.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext3/dir.c 2007-06-15 17:35:33.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext3/dir.c 2007-06-18 18:59:38.000000000 -0700 @@ -137,7 +137,8 @@ static int ext3_readdir(struct file * fi &map_bh, 0, 0); if (err > 0) { pgoff_t index = map_bh.b_blocknr >> - (PAGE_CACHE_SHIFT - inode->i_blkbits); + (page_cache_shift(inode->i_mapping) + - inode->i_blkbits); if (!ra_has_index(&filp->f_ra, index)) page_cache_readahead_ondemand( sb->s_bdev->bd_inode->i_mapping, Index: linux-2.6.22-rc4-mm2/fs/ext3/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext3/inode.c 2007-06-18 18:42:45.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext3/inode.c 2007-06-18 18:59:38.000000000 -0700 @@ -1159,8 +1159,8 @@ static int ext3_write_begin(struct file pgoff_t index; unsigned from, to; - index = pos >> PAGE_CACHE_SHIFT; - from = pos & (PAGE_CACHE_SIZE - 1); + index = page_cache_index(mapping, pos); + from = page_cache_offset(mapping, pos); to = from + len; retry: @@ -1233,7 +1233,7 @@ static int ext3_ordered_write_end(struct unsigned from, to; int ret = 0, ret2; - from = pos & (PAGE_CACHE_SIZE - 1); + from = page_cache_offset(mapping, pos); to = from + len; ret = walk_page_buffers(handle, page_buffers(page), @@ -1300,7 +1300,7 @@ static int ext3_journalled_write_end(str int partial = 0; unsigned from, to; - from = pos & (PAGE_CACHE_SIZE - 1); + from = page_cache_offset(mapping, pos); to = from + len; if (copied < len) { @@ -1462,6 +1462,7 @@ static int ext3_ordered_writepage(struct handle_t *handle = NULL; int ret = 0; int err; + int pagesize = page_cache_size(inode->i_mapping); J_ASSERT(PageLocked(page)); @@ -1484,8 +1485,7 @@ static int ext3_ordered_writepage(struct (1 << BH_Dirty)|(1 << BH_Uptodate)); } page_bufs = page_buffers(page); - walk_page_buffers(handle, page_bufs, 0, - PAGE_CACHE_SIZE, NULL, bget_one); + walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bget_one); ret = block_write_full_page(page, ext3_get_block, wbc); @@ -1502,13 +1502,12 @@ static int ext3_ordered_writepage(struct * and generally junk. */ if (ret == 0) { - err = walk_page_buffers(handle, page_bufs, 0, PAGE_CACHE_SIZE, - NULL, journal_dirty_data_fn); + err = walk_page_buffers(handle, page_bufs, 0, pagesize, + NULL, journal_dirty_data_fn); if (!ret) ret = err; } - walk_page_buffers(handle, page_bufs, 0, - PAGE_CACHE_SIZE, NULL, bput_one); + walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bput_one); err = ext3_journal_stop(handle); if (!ret) ret = err; @@ -1560,6 +1559,7 @@ static int ext3_journalled_writepage(str handle_t *handle = NULL; int ret = 0; int err; + int pagesize = page_cache_size(inode->i_mapping); if (ext3_journal_current_handle()) goto no_write; @@ -1576,17 +1576,16 @@ static int ext3_journalled_writepage(str * doesn't seem much point in redirtying the page here. 
*/ ClearPageChecked(page); - ret = block_prepare_write(page, 0, PAGE_CACHE_SIZE, - ext3_get_block); + ret = block_prepare_write(page, 0, pagesize, ext3_get_block); if (ret != 0) { ext3_journal_stop(handle); goto out_unlock; } ret = walk_page_buffers(handle, page_buffers(page), 0, - PAGE_CACHE_SIZE, NULL, do_journal_get_write_access); + pagesize, NULL, do_journal_get_write_access); err = walk_page_buffers(handle, page_buffers(page), 0, - PAGE_CACHE_SIZE, NULL, write_end_fn); + pagesize, NULL, write_end_fn); if (ret == 0) ret = err; EXT3_I(inode)->i_state |= EXT3_STATE_JDATA; @@ -1801,8 +1800,8 @@ void ext3_set_aops(struct inode *inode) static int ext3_block_truncate_page(handle_t *handle, struct page *page, struct address_space *mapping, loff_t from) { - ext3_fsblk_t index = from >> PAGE_CACHE_SHIFT; - unsigned offset = from & (PAGE_CACHE_SIZE-1); + ext3_fsblk_t index = page_cache_index(mapping, from); + unsigned offset = page_cache_offset(mapping, from); unsigned blocksize, iblock, length, pos; struct inode *inode = mapping->host; struct buffer_head *bh; @@ -1810,7 +1809,8 @@ static int ext3_block_truncate_page(hand blocksize = inode->i_sb->s_blocksize; length = blocksize - (offset & (blocksize - 1)); - iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits); + iblock = index << + (page_cache_shift(mapping) - inode->i_sb->s_blocksize_bits); /* * For "nobh" option, we can only work if we don't need to @@ -2289,7 +2289,7 @@ void ext3_truncate(struct inode *inode) page = NULL; } else { page = grab_cache_page(mapping, - inode->i_size >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, inode->i_size)); if (!page) return; } -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620180000.096527573@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:44 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [17/37] Use page_cache_xxx in fs/ext4 Content-Disposition: inline; filename=vps_fs_ext4 Signed-off-by: Christoph Lameter --- fs/ext4/dir.c | 3 ++- fs/ext4/inode.c | 34 +++++++++++++++++----------------- fs/ext4/writeback.c | 35 +++++++++++++++++++---------------- 3 files changed, 38 insertions(+), 34 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/ext4/dir.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext4/dir.c 2007-06-18 19:01:00.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext4/dir.c 2007-06-18 19:01:06.000000000 -0700 @@ -136,7 +136,8 @@ static int ext4_readdir(struct file * fi err = ext4_get_blocks_wrap(NULL, inode, blk, 1, &map_bh, 0, 0); if (err > 0) { pgoff_t index = map_bh.b_blocknr >> - (PAGE_CACHE_SHIFT - inode->i_blkbits); + (page_cache_shift(inode->i_mapping) + - inode->i_blkbits); if (!ra_has_index(&filp->f_ra, index)) page_cache_readahead_ondemand( sb->s_bdev->bd_inode->i_mapping, Index: linux-2.6.22-rc4-mm2/fs/ext4/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext4/inode.c 2007-06-18 19:01:00.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext4/inode.c 2007-06-18 19:01:06.000000000 -0700 @@ -1158,8 +1158,8 @@ static int ext4_write_begin(struct file pgoff_t index; unsigned from, to; - index = pos >> PAGE_CACHE_SHIFT; - from = pos & (PAGE_CACHE_SIZE - 1); + index = page_cache_index(mapping, pos); + from = page_cache_offset(mapping, pos); to
= from + len; retry: @@ -1231,7 +1231,7 @@ static int ext4_ordered_write_end(struct unsigned from, to; int ret = 0, ret2; - from = pos & (PAGE_CACHE_SIZE - 1); + from = page_cache_offset(mapping, pos); to = from + len; ret = walk_page_buffers(handle, page_buffers(page), @@ -1298,7 +1298,7 @@ static int ext4_journalled_write_end(str int partial = 0; unsigned from, to; - from = pos & (PAGE_CACHE_SIZE - 1); + from = page_cache_offset(mapping, pos); to = from + len; if (copied < len) { @@ -1460,6 +1460,7 @@ static int ext4_ordered_writepage(struct handle_t *handle = NULL; int ret = 0; int err; + int pagesize = page_cache_size(inode->i_mapping); J_ASSERT(PageLocked(page)); @@ -1482,8 +1483,7 @@ static int ext4_ordered_writepage(struct (1 << BH_Dirty)|(1 << BH_Uptodate)); } page_bufs = page_buffers(page); - walk_page_buffers(handle, page_bufs, 0, - PAGE_CACHE_SIZE, NULL, bget_one); + walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bget_one); ret = block_write_full_page(page, ext4_get_block, wbc); @@ -1500,13 +1500,12 @@ static int ext4_ordered_writepage(struct * and generally junk. */ if (ret == 0) { - err = walk_page_buffers(handle, page_bufs, 0, PAGE_CACHE_SIZE, + err = walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, jbd2_journal_dirty_data_fn); if (!ret) ret = err; } - walk_page_buffers(handle, page_bufs, 0, - PAGE_CACHE_SIZE, NULL, bput_one); + walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bput_one); err = ext4_journal_stop(handle); if (!ret) ret = err; @@ -1558,6 +1557,7 @@ static int ext4_journalled_writepage(str handle_t *handle = NULL; int ret = 0; int err; + int pagesize = page_cache_size(inode->i_mapping); if (ext4_journal_current_handle()) goto no_write; @@ -1574,17 +1574,16 @@ static int ext4_journalled_writepage(str * doesn't seem much point in redirtying the page here. 
*/ ClearPageChecked(page); - ret = block_prepare_write(page, 0, PAGE_CACHE_SIZE, - ext4_get_block); + ret = block_prepare_write(page, 0, pagesize, ext4_get_block); if (ret != 0) { ext4_journal_stop(handle); goto out_unlock; } ret = walk_page_buffers(handle, page_buffers(page), 0, - PAGE_CACHE_SIZE, NULL, do_journal_get_write_access); + pagesize, NULL, do_journal_get_write_access); err = walk_page_buffers(handle, page_buffers(page), 0, - PAGE_CACHE_SIZE, NULL, write_end_fn); + pagesize, NULL, write_end_fn); if (ret == 0) ret = err; EXT4_I(inode)->i_state |= EXT4_STATE_JDATA; @@ -1824,8 +1823,8 @@ void ext4_set_aops(struct inode *inode) int ext4_block_truncate_page(handle_t *handle, struct page *page, struct address_space *mapping, loff_t from) { - ext4_fsblk_t index = from >> PAGE_CACHE_SHIFT; - unsigned offset = from & (PAGE_CACHE_SIZE-1); + ext4_fsblk_t index = page_cache_index(mapping, from); + unsigned offset = page_cache_offset(mapping, from); unsigned blocksize, iblock, length, pos; struct inode *inode = mapping->host; struct buffer_head *bh; @@ -1839,7 +1838,8 @@ int ext4_block_truncate_page(handle_t *h blocksize = inode->i_sb->s_blocksize; length = blocksize - (offset & (blocksize - 1)); - iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits); + iblock = index << + (page_cache_shift(mapping) - inode->i_sb->s_blocksize_bits); /* * For "nobh" option, we can only work if we don't need to @@ -2325,7 +2325,7 @@ void ext4_truncate(struct inode *inode) page = NULL; } else { page = grab_cache_page(mapping, - inode->i_size >> PAGE_CACHE_SHIFT); + page_cache_index(mapping, inode->i_size)); if (!page) return; } Index: linux-2.6.22-rc4-mm2/fs/ext4/writeback.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext4/writeback.c 2007-06-18 19:01:00.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext4/writeback.c 2007-06-18 19:01:06.000000000 -0700 @@ -21,7 +21,7 @@ * MUST: * - flush dirty pages in -ENOSPC case in order to free reserved blocks * - direct I/O support - * - blocksize != PAGE_CACHE_SIZE support + * - blocksize != PAGE_SIZE support * - store last unwritten page in ext4_wb_writepages() and * continue from it in a next run * WISH: @@ -285,7 +285,7 @@ static int ext4_wb_submit_extent(struct * let's cook bios from them and start real I/O */ - BUG_ON(PAGE_CACHE_SHIFT < blkbits); + BUG_ON(page_cache_shift(inode->i_mapping) < blkbits); BUG_ON(list_empty(&wc->list)); wb_debug("cook and submit bios for %u/%u/%u for %lu/%u\n", @@ -301,8 +301,9 @@ static int ext4_wb_submit_extent(struct if (page == NULL) break; - pstart = page->index << (PAGE_CACHE_SHIFT - blkbits); - plen = PAGE_SIZE >> blkbits; + pstart = page->index << + (page_cache_shift(inode->i_mapping) - blkbits); + plen = page_cache_size(inode->i_mapping) >> blkbits; if (pstart > blk) { /* probably extent covers long space and page * to be written in the middle of it */ @@ -333,7 +334,8 @@ alloc_new_bio: /* +2 because head/tail may belong to different pages */ nr_pages = (le16_to_cpu(ex->ee_len) - (blk - le32_to_cpu(ex->ee_block))); - nr_pages = (nr_pages >> (PAGE_CACHE_SHIFT - blkbits)); + nr_pages = nr_pages >> + (page_cache_shift(inode->i_mapping) - blkbits); off = le32_to_cpu(ex->ee_start) + (blk - le32_to_cpu(ex->ee_block)); off |= (ext4_fsblk_t) @@ -832,11 +834,12 @@ int ext4_wb_writepages(struct address_sp static void ext4_wb_clear_page(struct page *page, int from, int to) { void *kaddr; + struct address_space *mapping = page_mapping(page); - if (to < PAGE_CACHE_SIZE || from
> 0) { + if (to < page_cache_size(mapping) || from > 0) { kaddr = kmap_atomic(page, KM_USER0); - if (PAGE_CACHE_SIZE > to) - memset(kaddr + to, 0, PAGE_CACHE_SIZE - to); + if (page_cache_size(mapping) > to) + memset(kaddr + to, 0, page_cache_size(mapping) - to); if (0 < from) memset(kaddr, 0, from); flush_dcache_page(page); @@ -878,7 +881,7 @@ int ext4_wb_prepare_write(struct file *f } else { /* block is already mapped, so no need to reserve */ BUG_ON(PagePrivate(page)); - if (to - from < PAGE_CACHE_SIZE) { + if (to - from < page_cache_size(inode->i_mapping)) { wb_debug("read block %u\n", (unsigned) bhw->b_blocknr); set_bh_page(bhw, page, 0); @@ -905,7 +908,7 @@ int ext4_wb_prepare_write(struct file *f int ext4_wb_commit_write(struct file *file, struct page *page, unsigned from, unsigned to) { - loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + loff_t pos = page_cache_pos(file->f_mapping, page->index, to); struct inode *inode = page->mapping->host; int err = 0; @@ -984,7 +987,7 @@ int ext4_wb_writepage(struct page *page, { struct inode *inode = page->mapping->host; loff_t i_size = i_size_read(inode); - pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT; + pgoff_t end_index = page_cache_index(inode->i_mapping, i_size); unsigned offset; void *kaddr; @@ -1017,7 +1020,7 @@ int ext4_wb_writepage(struct page *page, return ext4_wb_write_single_page(page, wbc); /* Is the page fully outside i_size? (truncate in progress) */ - offset = i_size & (PAGE_CACHE_SIZE-1); + offset = page_cache_offset(inode->i_mapping, i_size); if (page->index >= end_index + 1 || !offset) { /* * The page may have dirty, unmapped buffers. For example, @@ -1037,7 +1040,7 @@ int ext4_wb_writepage(struct page *page, * writes to that region are not written out to the file." */ kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset); + memset(kaddr + offset, 0, page_cache_size(inode->i_mapping) - offset); flush_dcache_page(page); kunmap_atomic(kaddr, KM_USER0); return ext4_wb_write_single_page(page, wbc); @@ -1086,7 +1089,7 @@ void ext4_wb_invalidatepage(struct page int ext4_wb_block_truncate_page(handle_t *handle, struct page *page, struct address_space *mapping, loff_t from) { - unsigned offset = from & (PAGE_CACHE_SIZE-1); + unsigned offset = page_cache_offset(mapping, from); struct inode *inode = mapping->host; struct buffer_head bh, *bhw = &bh; unsigned blocksize, length; @@ -1147,9 +1150,9 @@ void ext4_wb_init(struct super_block *sb if (!test_opt(sb, DELAYED_ALLOC)) return; - if (PAGE_CACHE_SHIFT != sb->s_blocksize_bits) { + if (PAGE_SHIFT != sb->s_blocksize_bits) { printk(KERN_ERR "EXT4-fs: delayed allocation isn't" - "supported for PAGE_CACHE_SIZE != blocksize yet\n"); + "supported for PAGE_SIZE != blocksize yet\n"); clear_opt (EXT4_SB(sb)->s_mount_opt, DELAYED_ALLOC); return; } -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620180000.346131312@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:45 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [18/37] Use page_cache_xxx in fs/reiserfs Content-Disposition: inline; filename=vps_fs_reiserfs Signed-off-by: Christoph Lameter --- fs/reiserfs/bitmap.c | 7 ++++++- fs/reiserfs/file.c | 5 +++-- fs/reiserfs/inode.c | 37 ++++++++++++++++++++++--------------- fs/reiserfs/ioctl.c | 2 +- 
fs/reiserfs/stree.c | 8 +++++--- fs/reiserfs/tail_conversion.c | 5 +++-- fs/reiserfs/xattr.c | 19 ++++++++++--------- 7 files changed, 50 insertions(+), 33 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/reiserfs/ioctl.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/ioctl.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/ioctl.c 2007-06-18 19:04:38.000000000 -0700 @@ -173,8 +173,8 @@ static int reiserfs_unpack(struct inode ** reiserfs_prepare_write on that page. This will force a ** reiserfs_get_block to unpack the tail for us. */ - index = inode->i_size >> PAGE_CACHE_SHIFT; mapping = inode->i_mapping; + index = page_cache_index(mapping, inode->i_size); page = grab_cache_page(mapping, index); retval = -ENOMEM; if (!page) { Index: linux-2.6.22-rc4-mm2/fs/reiserfs/stree.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/stree.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/stree.c 2007-06-18 19:04:38.000000000 -0700 @@ -1282,7 +1282,8 @@ int reiserfs_delete_item(struct reiserfs */ data = kmap_atomic(p_s_un_bh->b_page, KM_USER0); - off = ((le_ih_k_offset(&s_ih) - 1) & (PAGE_CACHE_SIZE - 1)); + off = page_cache_offset(p_s_inode->i_mapping, + le_ih_k_offset(&s_ih) - 1); memcpy(data + off, B_I_PITEM(PATH_PLAST_BUFFER(p_s_path), &s_ih), n_ret_value); @@ -1438,7 +1439,7 @@ static void unmap_buffers(struct page *p if (page) { if (page_has_buffers(page)) { - tail_index = pos & (PAGE_CACHE_SIZE - 1); + tail_index = page_cache_offset(page_mapping(page), pos); cur_index = 0; head = page_buffers(page); bh = head; @@ -1458,7 +1459,8 @@ static void unmap_buffers(struct page *p bh = next; } while (bh != head); if (PAGE_SIZE == bh->b_size) { - cancel_dirty_page(page, PAGE_CACHE_SIZE); + cancel_dirty_page(page, + page_cache_size(page_mapping(page))); } } } Index: linux-2.6.22-rc4-mm2/fs/reiserfs/tail_conversion.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/tail_conversion.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/tail_conversion.c 2007-06-18 19:21:18.000000000 -0700 @@ -128,7 +128,8 @@ int direct2indirect(struct reiserfs_tran */ if (up_to_date_bh) { unsigned pgoff = - (tail_offset + total_tail - 1) & (PAGE_CACHE_SIZE - 1); + page_cache_offset(inode->i_mapping, + tail_offset + total_tail - 1); char *kaddr = kmap_atomic(up_to_date_bh->b_page, KM_USER0); memset(kaddr + pgoff, 0, n_blk_size - total_tail); kunmap_atomic(kaddr, KM_USER0); @@ -238,7 +239,7 @@ int indirect2direct(struct reiserfs_tran ** the page was locked and this part of the page was up to date when ** indirect2direct was called, so we know the bytes are still valid */ - tail = tail + (pos & (PAGE_CACHE_SIZE - 1)); + tail = tail + page_cache_offset(p_s_inode->i_mapping, pos); PATH_LAST_POSITION(p_s_path)++; Index: linux-2.6.22-rc4-mm2/fs/reiserfs/file.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/file.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/file.c 2007-06-18 19:04:38.000000000 -0700 @@ -161,11 +161,12 @@ int reiserfs_commit_page(struct inode *i int partial = 0; unsigned blocksize; struct buffer_head *bh, *head; - unsigned long i_size_index = inode->i_size >> PAGE_CACHE_SHIFT; + unsigned long i_size_index = + page_cache_index(inode->i_mapping, inode->i_size); int new; int logit =
reiserfs_file_data_log(inode); struct super_block *s = inode->i_sb; - int bh_per_page = PAGE_CACHE_SIZE / s->s_blocksize; + int bh_per_page = page_cache_size(inode->i_mapping) / s->s_blocksize; struct reiserfs_transaction_handle th; int ret = 0; Index: linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/xattr.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c 2007-06-18 21:53:15.000000000 -0700 @@ -493,13 +493,13 @@ reiserfs_xattr_set(struct inode *inode, while (buffer_pos < buffer_size || buffer_pos == 0) { size_t chunk; size_t skip = 0; - size_t page_offset = (file_pos & (PAGE_CACHE_SIZE - 1)); - if (buffer_size - buffer_pos > PAGE_CACHE_SIZE) - chunk = PAGE_CACHE_SIZE; + size_t page_offset = page_cache_offset(mapping, file_pos); + if (buffer_size - buffer_pos > page_cache_size(mapping)) + chunk = page_cache_size(mapping); else chunk = buffer_size - buffer_pos; - page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT); + page = reiserfs_get_page(xinode, page_cache_index(mapping, file_pos)); if (IS_ERR(page)) { err = PTR_ERR(page); goto out_filp; @@ -511,8 +511,8 @@ reiserfs_xattr_set(struct inode *inode, if (file_pos == 0) { struct reiserfs_xattr_header *rxh; skip = file_pos = sizeof(struct reiserfs_xattr_header); - if (chunk + skip > PAGE_CACHE_SIZE) - chunk = PAGE_CACHE_SIZE - skip; + if (chunk + skip > page_cache_size(mapping)) + chunk = page_cache_size(mapping) - skip; rxh = (struct reiserfs_xattr_header *)data; rxh->h_magic = cpu_to_le32(REISERFS_XATTR_MAGIC); rxh->h_hash = cpu_to_le32(xahash); @@ -603,12 +603,13 @@ reiserfs_xattr_get(const struct inode *i size_t chunk; char *data; size_t skip = 0; - if (isize - file_pos > PAGE_CACHE_SIZE) - chunk = PAGE_CACHE_SIZE; + if (isize - file_pos > page_cache_size(xinode->i_mapping)) + chunk = page_cache_size(xinode->i_mapping); else chunk = isize - file_pos; - page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT); + page = reiserfs_get_page(xinode, + page_cache_index(xinode->i_mapping, file_pos)); if (IS_ERR(page)) { err = PTR_ERR(page); goto out_dput; Index: linux-2.6.22-rc4-mm2/fs/reiserfs/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/inode.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/inode.c 2007-06-18 19:20:38.000000000 -0700 @@ -337,7 +337,8 @@ static int _get_block_create_0(struct in goto finished; } // read file tail into part of page - offset = (cpu_key_k_offset(&key) - 1) & (PAGE_CACHE_SIZE - 1); + offset = page_cache_offset(inode->i_mapping, + cpu_key_k_offset(&key) - 1); fs_gen = get_generation(inode->i_sb); copy_item_head(&tmp_ih, ih); @@ -523,10 +524,10 @@ static int convert_tail_for_hole(struct return -EIO; /* always try to read until the end of the block */ - tail_start = tail_offset & (PAGE_CACHE_SIZE - 1); + tail_start = page_cache_offset(inode->i_mapping, tail_offset); tail_end = (tail_start | (bh_result->b_size - 1)) + 1; - index = tail_offset >> PAGE_CACHE_SHIFT; + index = page_cache_index(inode->i_mapping, tail_offset); /* hole_page can be zero in case of direct_io, we are sure that we cannot get here if we write with O_DIRECT into tail page */ @@ -2008,11 +2009,13 @@ static int grab_tail_page(struct inode * /* we want the page with the last byte in the file, ** not the page that will hold the next byte for appending */ - unsigned long index = (p_s_inode->i_size - 1) >>
PAGE_CACHE_SHIFT; + unsigned long index = page_cache_index(p_s_inode->i_mapping, + p_s_inode->i_size - 1); unsigned long pos = 0; unsigned long start = 0; unsigned long blocksize = p_s_inode->i_sb->s_blocksize; - unsigned long offset = (p_s_inode->i_size) & (PAGE_CACHE_SIZE - 1); + unsigned long offset = page_cache_offset(p_s_inode->i_mapping, + p_s_inode->i_size); struct buffer_head *bh; struct buffer_head *head; struct page *page; @@ -2084,7 +2087,8 @@ int reiserfs_truncate_file(struct inode { struct reiserfs_transaction_handle th; /* we want the offset for the first byte after the end of the file */ - unsigned long offset = p_s_inode->i_size & (PAGE_CACHE_SIZE - 1); + unsigned long offset = page_cache_offset(p_s_inode->i_mapping, + p_s_inode->i_size); unsigned blocksize = p_s_inode->i_sb->s_blocksize; unsigned length; struct page *page = NULL; @@ -2233,7 +2237,7 @@ static int map_block_for_writepage(struc } else if (is_direct_le_ih(ih)) { char *p; p = page_address(bh_result->b_page); - p += (byte_offset - 1) & (PAGE_CACHE_SIZE - 1); + p += page_cache_offset(inode->i_mapping, byte_offset - 1); copy_size = ih_item_len(ih) - pos_in_item; fs_gen = get_generation(inode->i_sb); @@ -2332,7 +2336,8 @@ static int reiserfs_write_full_page(stru struct writeback_control *wbc) { struct inode *inode = page->mapping->host; - unsigned long end_index = inode->i_size >> PAGE_CACHE_SHIFT; + unsigned long end_index = page_cache_index(inode->i_mapping, + inode->i_size); int error = 0; unsigned long block; sector_t last_block; @@ -2342,7 +2347,7 @@ static int reiserfs_write_full_page(stru int checked = PageChecked(page); struct reiserfs_transaction_handle th; struct super_block *s = inode->i_sb; - int bh_per_page = PAGE_CACHE_SIZE / s->s_blocksize; + int bh_per_page = page_cache_size(inode->i_mapping) / s->s_blocksize; th.t_trans_id = 0; /* no logging allowed when nonblocking or from PF_MEMALLOC */ @@ -2369,16 +2374,18 @@ static int reiserfs_write_full_page(stru if (page->index >= end_index) { unsigned last_offset; - last_offset = inode->i_size & (PAGE_CACHE_SIZE - 1); + last_offset = page_cache_offset(inode->i_mapping, inode->i_size); /* no file contents in this page */ if (page->index >= end_index + 1 || !last_offset) { unlock_page(page); return 0; } - zero_user_segment(page, last_offset, PAGE_CACHE_SIZE); + zero_user_segment(page, last_offset, + page_cache_size(inode->i_mapping)); } bh = head; - block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits); + block = page->index << (page_cache_shift(inode->i_mapping) + - s->s_blocksize_bits); last_block = (i_size_read(inode) - 1) >> inode->i_blkbits; /* first map all the buffers, logging any direct items we find */ do { @@ -2570,7 +2577,7 @@ static int reiserfs_write_begin(struct f *fsdata = (void *)(unsigned long)flags; } - index = pos >> PAGE_CACHE_SHIFT; + index = page_cache_index(mapping, pos); page = __grab_cache_page(mapping, index); if (!page) return -ENOMEM; @@ -2694,7 +2701,7 @@ static int reiserfs_write_end(struct fil else th = NULL; - start = pos & (PAGE_CACHE_SIZE - 1); + start = page_cache_offset(mapping, pos); if (unlikely(copied < len)) { if (!PageUptodate(page)) copied = 0; @@ -2774,7 +2781,7 @@ int reiserfs_commit_write(struct file *f unsigned from, unsigned to) { struct inode *inode = page->mapping->host; - loff_t pos = ((loff_t) page->index << PAGE_CACHE_SHIFT) + to; + loff_t pos = page_cache_pos(inode->i_mapping, page->index, to); int ret = 0; int update_sd = 0; struct reiserfs_transaction_handle *th = NULL; Index:
linux-2.6.22-rc4-mm2/fs/reiserfs/bitmap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/bitmap.c 2007-06-18 19:04:34.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/bitmap.c 2007-06-18 19:04:38.000000000 -0700 @@ -1249,9 +1249,14 @@ int reiserfs_can_fit_pages(struct super_ int space; spin_lock(&REISERFS_SB(sb)->bitmap_lock); + + /* + * Note the PAGE_SHIFT here. This means that the superblock + * and metadata are restricted to page size. + */ space = (SB_FREE_BLOCKS(sb) - - REISERFS_SB(sb)->reserved_blocks) >> (PAGE_CACHE_SHIFT - + REISERFS_SB(sb)->reserved_blocks) >> (PAGE_SHIFT - sb->s_blocksize_bits); spin_unlock(&REISERFS_SB(sb)->bitmap_lock); -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620180000.425249859@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:46 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [19/37] Use page_cache_xxx for fs/xfs Content-Disposition: inline; filename=vps_fs_xfs Signed-off-by: Christoph Lameter --- fs/xfs/linux-2.6/xfs_aops.c | 55 +++++++++++++++++++++++--------------------- fs/xfs/linux-2.6/xfs_lrw.c | 4 +-- 2 files changed, 31 insertions(+), 28 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_aops.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_aops.c 2007-06-18 19:05:21.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_aops.c 2007-06-18 19:07:15.000000000 -0700 @@ -74,7 +74,7 @@ xfs_page_trace( xfs_inode_t *ip; bhv_vnode_t *vp = vn_from_inode(inode); loff_t isize = i_size_read(inode); - loff_t offset = page_offset(page); + loff_t offset = page_cache_pos(page->mapping, page->index, 0); int delalloc = -1, unmapped = -1, unwritten = -1; if (page_has_buffers(page)) @@ -610,7 +610,7 @@ xfs_probe_page( break; } while ((bh = bh->b_this_page) != head); } else - ret = mapped ? 0 : PAGE_CACHE_SIZE; + ret = mapped ?
0 : page_cache_size(page->mapping); } return ret; @@ -637,7 +637,7 @@ xfs_probe_cluster( } while ((bh = bh->b_this_page) != head); /* if we reached the end of the page, sum forwards in following pages */ - tlast = i_size_read(inode) >> PAGE_CACHE_SHIFT; + tlast = page_cache_index(inode->i_mapping, i_size_read(inode)); tindex = startpage->index + 1; /* Prune this back to avoid pathological behavior */ @@ -655,14 +655,14 @@ xfs_probe_cluster( size_t pg_offset, len = 0; if (tindex == tlast) { - pg_offset = - i_size_read(inode) & (PAGE_CACHE_SIZE - 1); + pg_offset = page_cache_offset(inode->i_mapping, + i_size_read(inode)); if (!pg_offset) { done = 1; break; } } else - pg_offset = PAGE_CACHE_SIZE; + pg_offset = page_cache_size(inode->i_mapping); if (page->index == tindex && !TestSetPageLocked(page)) { len = xfs_probe_page(page, pg_offset, mapped); @@ -744,7 +744,8 @@ xfs_convert_page( int bbits = inode->i_blkbits; int len, page_dirty; int count = 0, done = 0, uptodate = 1; - xfs_off_t offset = page_offset(page); + struct address_space *map = inode->i_mapping; + xfs_off_t offset = page_cache_pos(map, page->index, 0); if (page->index != tindex) goto fail; @@ -752,7 +753,7 @@ xfs_convert_page( goto fail; if (PageWriteback(page)) goto fail_unlock_page; - if (page->mapping != inode->i_mapping) + if (page->mapping != map) goto fail_unlock_page; if (!xfs_is_delayed_page(page, (*ioendp)->io_type)) goto fail_unlock_page; @@ -764,20 +765,20 @@ xfs_convert_page( * Derivation: * * End offset is the highest offset that this page should represent. - * If we are on the last page, (end_offset & (PAGE_CACHE_SIZE - 1)) - * will evaluate non-zero and be less than PAGE_CACHE_SIZE and + * If we are on the last page, (end_offset & page_cache_mask()) + * will evaluate non-zero and be less than page_cache_size() and * hence give us the correct page_dirty count. On any other page, * it will be zero and in that case we need page_dirty to be the * count of buffers on the page. */ end_offset = min_t(unsigned long long, - (xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT, + (xfs_off_t)(page->index + 1) << page_cache_shift(map), i_size_read(inode)); len = 1 << inode->i_blkbits; - p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1), - PAGE_CACHE_SIZE); - p_offset = p_offset ? roundup(p_offset, len) : PAGE_CACHE_SIZE; + p_offset = min_t(unsigned long, page_cache_offset(map, end_offset), + page_cache_size(map)); + p_offset = p_offset ? roundup(p_offset, len) : page_cache_size(map); page_dirty = p_offset / len; bh = head = page_buffers(page); @@ -933,6 +934,8 @@ xfs_page_state_convert( int page_dirty, count = 0; int trylock = 0; int all_bh = unmapped; + struct address_space *map = inode->i_mapping; + int pagesize = page_cache_size(map); if (startio) { if (wbc->sync_mode == WB_SYNC_NONE && wbc->nonblocking) @@ -941,11 +944,11 @@ xfs_page_state_convert( /* Is this page beyond the end of the file? */ offset = i_size_read(inode); - end_index = offset >> PAGE_CACHE_SHIFT; - last_index = (offset - 1) >> PAGE_CACHE_SHIFT; + end_index = page_cache_index(map, offset); + last_index = page_cache_index(map, (offset - 1)); if (page->index >= end_index) { if ((page->index >= end_index + 1) || - !(i_size_read(inode) & (PAGE_CACHE_SIZE - 1))) { + !(page_cache_offset(map, i_size_read(inode)))) { if (startio) unlock_page(page); return 0; @@ -959,22 +962,22 @@ xfs_page_state_convert( * Derivation: * * End offset is the highest offset that this page should represent. 
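A worked instance of the derivation in this comment may help; the numbers are illustrative assumptions only (16k page cache pages, 4k blocks, i_size = 0x149000), mirroring the computation done below:

	end_offset = min((xfs_off_t)(page->index + 1) << page_cache_shift(map), i_size);
	p_offset = min(page_cache_offset(map, end_offset), page_cache_size(map));
	p_offset = p_offset ? roundup(p_offset, len) : page_cache_size(map);
	page_dirty = p_offset / len;

	/* Last page: the offset term is 0x1000, roundup(0x1000, 4096) = 0x1000,
	 * so page_dirty = 1. Any earlier page: the offset term is 0, p_offset
	 * becomes page_cache_size(map) = 0x4000, and page_dirty = 4, the
	 * number of buffers on the page. */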
- * If we are on the last page, (end_offset & (PAGE_CACHE_SIZE - 1)) - * will evaluate non-zero and be less than PAGE_CACHE_SIZE and - * hence give us the correct page_dirty count. On any other page, + * If we are on the last page, (page_cache_offset(mapping, end_offset)) + * will evaluate non-zero and be less than page_cache_size(mapping) + * and hence give us the correct page_dirty count. On any other page, * it will be zero and in that case we need page_dirty to be the * count of buffers on the page. */ end_offset = min_t(unsigned long long, - (xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT, offset); + (xfs_off_t)page_cache_pos(map, page->index + 1, 0), offset); len = 1 << inode->i_blkbits; - p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1), - PAGE_CACHE_SIZE); - p_offset = p_offset ? roundup(p_offset, len) : PAGE_CACHE_SIZE; + p_offset = min_t(unsigned long, page_cache_offset(map, end_offset), + pagesize); + p_offset = p_offset ? roundup(p_offset, len) : pagesize; page_dirty = p_offset / len; bh = head = page_buffers(page); - offset = page_offset(page); + offset = page_cache_pos(map, page->index, 0); flags = BMAPI_READ; type = IOMAP_NEW; @@ -1111,7 +1114,7 @@ xfs_page_state_convert( if (ioend && iomap_valid) { offset = (iomap.iomap_offset + iomap.iomap_bsize - 1) >> - PAGE_CACHE_SHIFT; + page_cache_shift(map); tlast = min_t(pgoff_t, offset, last_index); xfs_cluster_write(inode, page->index + 1, &iomap, &ioend, wbc, startio, all_bh, tlast); Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_lrw.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_lrw.c 2007-06-18 19:05:21.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_lrw.c 2007-06-18 19:07:15.000000000 -0700 @@ -143,8 +143,8 @@ xfs_iozero( unsigned offset, bytes; void *fsdata; - offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */ - bytes = PAGE_CACHE_SIZE - offset; + offset = page_cache_offset(mapping, pos); /* Within page */ + bytes = page_cache_size(mapping) - offset; if (bytes > count) bytes = count; -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620180000.595008697@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:47 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [20/37] Fix PAGE SIZE assumption in miscellaneous places. 
Content-Disposition: inline; filename=vps_kernel Signed-off-by: Christoph Lameter --- kernel/container.c | 4 ++-- kernel/futex.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) Index: vps/kernel/futex.c =================================================================== --- vps.orig/kernel/futex.c 2007-06-14 21:52:20.000000000 -0700 +++ vps/kernel/futex.c 2007-06-14 21:53:43.000000000 -0700 @@ -255,7 +255,7 @@ int get_futex_key(u32 __user *uaddr, str err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL); if (err >= 0) { key->shared.pgoff = - page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); + page->index << compound_order(page); put_page(page); return 0; } Index: vps/kernel/container.c =================================================================== --- vps.orig/kernel/container.c 2007-06-14 21:53:57.000000000 -0700 +++ vps/kernel/container.c 2007-06-14 21:54:13.000000000 -0700 @@ -840,8 +840,8 @@ static int container_fill_super(struct s struct dentry *root; struct containerfs_root *hroot = options; - sb->s_blocksize = PAGE_CACHE_SIZE; - sb->s_blocksize_bits = PAGE_CACHE_SHIFT; + sb->s_blocksize = PAGE_SIZE; + sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = CONTAINER_SUPER_MAGIC; sb->s_op = &container_ops; -- From clameter@sgi.com Wed Jun 20 11:00:00 2007 Message-Id: <20070620180000.773559727@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:48 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [21/37] Use page_cache_xxx in drivers/block/loop.c Content-Disposition: inline; filename=vps_drivers_block_loop Signed-off-by: Christoph Lameter --- drivers/block/loop.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) Index: linux-2.6.22-rc4-mm2/drivers/block/loop.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/drivers/block/loop.c 2007-06-19 20:50:30.000000000 -0700 +++ linux-2.6.22-rc4-mm2/drivers/block/loop.c 2007-06-20 00:34:46.000000000 -0700 @@ -215,8 +215,8 @@ static int do_lo_send_aops(struct loop_d int len, ret; mutex_lock(&mapping->host->i_mutex); - index = pos >> PAGE_CACHE_SHIFT; - offset = pos & ((pgoff_t)PAGE_CACHE_SIZE - 1); + index = page_cache_index(mapping, pos); + offset = page_cache_offset(mapping, pos); bv_offs = bvec->bv_offset; len = bvec->bv_len; while (len > 0) { @@ -226,8 +226,9 @@ static int do_lo_send_aops(struct loop_d struct page *page; void *fsdata; - IV = ((sector_t)index << (PAGE_CACHE_SHIFT - 9))+(offset >> 9); - size = PAGE_CACHE_SIZE - offset; + IV = ((sector_t)index << (page_cache_shift(mapping) - 9)) + + (offset >> 9); + size = page_cache_size(mapping) - offset; if (size > len) size = len; @@ -393,7 +394,9 @@ lo_read_actor(read_descriptor_t *desc, s struct loop_device *lo = p->lo; sector_t IV; - IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9))+(offset >> 9); + IV = ((sector_t) page->index << + (page_cache_shift(page_mapping(page)) - 9)) + + (offset >> 9); if (size > count) size = count; -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180000.944283205@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:49 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner ,
Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [22/37] Use page_cache_xxx in drivers/block/rd.c Content-Disposition: inline; filename=vps_drivers_block_rd Signed-off-by: Christoph Lameter --- drivers/block/rd.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: vps/drivers/block/rd.c =================================================================== --- vps.orig/drivers/block/rd.c 2007-06-14 21:49:09.000000000 -0700 +++ vps/drivers/block/rd.c 2007-06-14 21:50:28.000000000 -0700 @@ -121,7 +121,7 @@ static void make_page_uptodate(struct pa } } while ((bh = bh->b_this_page) != head); } else { - memset(page_address(page), 0, PAGE_CACHE_SIZE); + memset(page_address(page), 0, page_cache_size(page_mapping(page))); } flush_dcache_page(page); SetPageUptodate(page); @@ -201,9 +201,9 @@ static const struct address_space_operat static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector, struct address_space *mapping) { - pgoff_t index = sector >> (PAGE_CACHE_SHIFT - 9); + pgoff_t index = sector >> (page_cache_shift(mapping) - 9); unsigned int vec_offset = vec->bv_offset; - int offset = (sector << 9) & ~PAGE_CACHE_MASK; + int offset = page_cache_offset(mapping, (sector << 9)); int size = vec->bv_len; int err = 0; @@ -213,7 +213,7 @@ static int rd_blkdev_pagecache_IO(int rw char *src; char *dst; - count = PAGE_CACHE_SIZE - offset; + count = page_cache_size(mapping) - offset; if (count > size) count = size; size -= count; -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180001.193168025@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:50 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [23/37] compound pages: PageHead/PageTail instead of PageCompound Content-Disposition: inline; filename=compound_headtail This patch enhances the handling of compound pages in the VM. It may also be important for the antifrag patches, which need to manage a set of higher order free pages, and for other uses of compound pages. For now it simplifies accounting for SLUB pages but the groundwork here is important for the large block size patches and for allowing page migration of larger pages. With this framework we may be able to get to a point where compound pages keep their flags while they are free and Mel may avoid having special functions for determining the page order of higher order freed pages. If we can avoid the setup and teardown of higher order pages then allocation and release of compound pages will be faster. Looking at the handling of compound pages we see that the fact that a page is part of a higher order page is not that interesting. The differentiation is mainly for head pages and tail pages of higher order pages. Head pages usually need special handling to accommodate the larger size. It is usually an error if tail pages are encountered; otherwise they need to be treated like PAGE_SIZE pages. So a compound flag in the page flags is not what we need. Instead we introduce a flag for the head page and another for the tail page. The PageCompound test is preserved for backward compatibility and will test if either PageTail or PageHead has been set. After this patchset the uses of PageCompound() will be reduced significantly in the core VM. The I/O layer will still use PageCompound() for direct I/O.
However, if we at some point convert direct I/O to also support compound pages as a single unit then PageCompound() there may become unnecessary, as well as the leftover check in mm/swap.c. We may end up mostly with checks for PageTail and PageHead. This patch: Use two separate page flags for the head and tail of compound pages. PageHead() and PageTail() become more efficient. PageCompound then becomes a check for PageTail || PageHead. Over time it is expected that PageCompound will mostly go away since the head page processing will be different from tail page processing in most situations. We can remove the compound page check from set_page_refcounted since PG_reclaim is no longer overloaded. Also the check in __free_one_page can only be for PageHead. We cannot free a tail page. Signed-off-by: Christoph Lameter --- include/linux/page-flags.h | 43 ++++++++++++------------------------------- mm/internal.h | 2 +- mm/page_alloc.c | 2 +- 3 files changed, 14 insertions(+), 33 deletions(-) Index: linux-2.6.22-rc4-mm2/include/linux/page-flags.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/page-flags.h 2007-06-15 17:35:33.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/page-flags.h 2007-06-18 19:13:03.000000000 -0700 @@ -83,7 +83,6 @@ #define PG_private 11 /* If pagecache, has fs-private data */ #define PG_writeback 12 /* Page is under writeback */ -#define PG_compound 14 /* Part of a compound page */ #define PG_swapcache 15 /* Swap page: swp_entry_t in private */ #define PG_mappedtodisk 16 /* Has blocks allocated on-disk */ @@ -91,6 +90,9 @@ #define PG_buddy 19 /* Page is free, on buddy lists */ #define PG_booked 20 /* Has blocks reserved on-disk */ +#define PG_head 21 /* Page is head of a compound page */ +#define PG_tail 22 /* Page is tail of a compound page */ + /* PG_readahead is only used for file reads; PG_reclaim is only for writes */ #define PG_readahead PG_reclaim /* Reminder to do async read-ahead */ @@ -221,37 +223,16 @@ static inline void SetPageUptodate(struc #define ClearPageReclaim(page) clear_bit(PG_reclaim, &(page)->flags) #define TestClearPageReclaim(page) test_and_clear_bit(PG_reclaim, &(page)->flags) -#define PageCompound(page) test_bit(PG_compound, &(page)->flags) -#define __SetPageCompound(page) __set_bit(PG_compound, &(page)->flags) -#define __ClearPageCompound(page) __clear_bit(PG_compound, &(page)->flags) - -/* - * PG_reclaim is used in combination with PG_compound to mark the - * head and tail of a compound page - * - * PG_compound & PG_reclaim => Tail page - * PG_compound & ~PG_reclaim => Head page - */ - -#define PG_head_tail_mask ((1L << PG_compound) | (1L << PG_reclaim)) - -#define PageTail(page) ((page->flags & PG_head_tail_mask) \ == PG_head_tail_mask) - -static inline void __SetPageTail(struct page *page) -{ - page->flags |= PG_head_tail_mask; -} - -static inline void __ClearPageTail(struct page *page) -{ - page->flags &= ~PG_head_tail_mask; -} +#define PageHead(page) test_bit(PG_head, &(page)->flags) +#define __SetPageHead(page) __set_bit(PG_head, &(page)->flags) +#define __ClearPageHead(page) __clear_bit(PG_head, &(page)->flags) + +#define PageTail(page) test_bit(PG_tail, &(page)->flags) +#define __SetPageTail(page) __set_bit(PG_tail, &(page)->flags) +#define __ClearPageTail(page) __clear_bit(PG_tail, &(page)->flags) -#define PageHead(page) ((page->flags & PG_head_tail_mask) \ == (1L << PG_compound)) -#define __SetPageHead(page) __SetPageCompound(page) -#define __ClearPageHead(page)
__ClearPageCompound(page) +#define PageCompound(page) ((page)->flags & \ + ((1L << PG_head) | (1L << PG_tail))) #ifdef CONFIG_SWAP #define PageSwapCache(page) test_bit(PG_swapcache, &(page)->flags) Index: linux-2.6.22-rc4-mm2/mm/internal.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/internal.h 2007-06-15 17:35:33.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/internal.h 2007-06-18 19:13:03.000000000 -0700 @@ -24,7 +24,7 @@ static inline void set_page_count(struct */ static inline void set_page_refcounted(struct page *page) { - VM_BUG_ON(PageCompound(page) && PageTail(page)); + VM_BUG_ON(PageTail(page)); VM_BUG_ON(atomic_read(&page->_count)); set_page_count(page, 1); } Index: linux-2.6.22-rc4-mm2/mm/page_alloc.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/page_alloc.c 2007-06-18 18:42:45.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/page_alloc.c 2007-06-18 19:13:03.000000000 -0700 @@ -428,7 +428,7 @@ static inline void __free_one_page(struc int order_size = 1 << order; int migratetype = get_pageblock_migratetype(page); - if (unlikely(PageCompound(page))) + if (unlikely(PageHead(page))) destroy_compound_page(page, order); page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1); -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180001.267919130@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:51 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [24/37] compound pages: Add new support functions Content-Disposition: inline; filename=compound_functions compound_pages(page) -> Determines the number of base pages of a compound page compound_shift(page) -> Determines the page shift of a compound page compound_size(page) -> Determines the size of a compound page Signed-off-by: Christoph Lameter --- include/linux/mm.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+) Index: vps/include/linux/mm.h =================================================================== --- vps.orig/include/linux/mm.h 2007-06-11 15:56:37.000000000 -0700 +++ vps/include/linux/mm.h 2007-06-12 19:06:28.000000000 -0700 @@ -365,6 +365,21 @@ static inline void set_compound_order(st page[1].lru.prev = (void *)order; } +static inline int compound_pages(struct page *page) +{ + return 1 << compound_order(page); +} + +static inline int compound_shift(struct page *page) +{ + return PAGE_SHIFT + compound_order(page); +} + +static inline int compound_size(struct page *page) +{ + return PAGE_SIZE << compound_order(page); +} + /* * Multiple processes may "see" the same page. E.g. for untouched * mappings of /dev/null, all processes see the same page full of -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180001.436842815@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:52 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [25/37] compound pages: vmstat support Content-Disposition: inline; filename=compound_vmstat Add support for compound pages so that the inc_xxx and dec_xxx functions will adjust the ZVCs by the number of base pages of the compound page.
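For illustration, here is a minimal userspace model of that accounting rule (struct page_model and zvc_delta() are hypothetical stand-ins for struct page and the modified __inc_zone_page_state()/__dec_zone_page_state(), not kernel API):

	#include <stdio.h>

	/* Hypothetical stand-in for struct page: head flag plus order. */
	struct page_model {
		int head;			/* PageHead() */
		int order;			/* compound_order() */
	};

	/* Counter delta contributed by one page. */
	static long zvc_delta(const struct page_model *p)
	{
		if (!p->head)
			return 1;		/* ordinary order-0 page */
		return 1L << p->order;		/* compound_pages(page) */
	}

	int main(void)
	{
		struct page_model base = { 0, 0 };
		struct page_model huge = { 1, 4 };	/* order-4 compound page */

		printf("order-0 delta: %ld\n", zvc_delta(&base));	/* 1 */
		printf("order-4 delta: %ld\n", zvc_delta(&huge));	/* 16 */
		return 0;
	}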
Signed-off-by: Christoph Lameter --- include/linux/vmstat.h | 5 ++--- mm/vmstat.c | 18 +++++++++++++----- 2 files changed, 15 insertions(+), 8 deletions(-) Index: vps/include/linux/vmstat.h =================================================================== --- vps.orig/include/linux/vmstat.h 2007-06-11 15:56:37.000000000 -0700 +++ vps/include/linux/vmstat.h 2007-06-12 19:06:32.000000000 -0700 @@ -234,7 +234,7 @@ static inline void __inc_zone_state(stru static inline void __inc_zone_page_state(struct page *page, enum zone_stat_item item) { - __inc_zone_state(page_zone(page), item); + __mod_zone_page_state(page_zone(page), item, compound_pages(page)); } static inline void __dec_zone_state(struct zone *zone, enum zone_stat_item item) @@ -246,8 +246,7 @@ static inline void __dec_zone_state(stru static inline void __dec_zone_page_state(struct page *page, enum zone_stat_item item) { - atomic_long_dec(&page_zone(page)->vm_stat[item]); - atomic_long_dec(&vm_stat[item]); + __mod_zone_page_state(page_zone(page), item, -compound_pages(page)); } /* Index: vps/mm/vmstat.c =================================================================== --- vps.orig/mm/vmstat.c 2007-06-11 15:56:37.000000000 -0700 +++ vps/mm/vmstat.c 2007-06-12 19:06:32.000000000 -0700 @@ -225,7 +225,12 @@ void __inc_zone_state(struct zone *zone, void __inc_zone_page_state(struct page *page, enum zone_stat_item item) { - __inc_zone_state(page_zone(page), item); + struct zone *z = page_zone(page); + + if (likely(!PageHead(page))) + __inc_zone_state(z, item); + else + __mod_zone_page_state(z, item, compound_pages(page)); } EXPORT_SYMBOL(__inc_zone_page_state); @@ -246,7 +251,12 @@ void __dec_zone_state(struct zone *zone, void __dec_zone_page_state(struct page *page, enum zone_stat_item item) { - __dec_zone_state(page_zone(page), item); + struct zone *z = page_zone(page); + + if (likely(!PageHead(page))) + __dec_zone_state(z, item); + else + __mod_zone_page_state(z, item, -compound_pages(page)); } EXPORT_SYMBOL(__dec_zone_page_state); @@ -262,11 +272,9 @@ void inc_zone_state(struct zone *zone, e void inc_zone_page_state(struct page *page, enum zone_stat_item item) { unsigned long flags; - struct zone *zone; - zone = page_zone(page); local_irq_save(flags); - __inc_zone_state(zone, item); + __inc_zone_page_state(page, item); local_irq_restore(flags); } EXPORT_SYMBOL(inc_zone_page_state); -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180001.608310452@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:53 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [26/37] compound pages: Use new compound vmstat functions in SLUB Content-Disposition: inline; filename=compound_vmstat_slub Use the new dec/inc functions to simplify SLUB's accounting of pages. 
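The replacement is equivalent to the old explicit page count because allocate_slab() sets __GFP_COMP for s->order > 0 (see the hunk below), so the head page carries the slab's order. A small userspace sketch of that equivalence (all names hypothetical, not kernel API):

	#include <assert.h>

	struct page_model { int head; int order; };

	/* old style: the caller passes the number of base pages */
	static void mod_state(long *counter, long delta)
	{
		*counter += delta;
	}

	/* new style: the delta is derived from the compound head itself */
	static void inc_page_state(long *counter, const struct page_model *p)
	{
		mod_state(counter, p->head ? 1L << p->order : 1);
	}

	int main(void)
	{
		long a = 0, b = 0;
		struct page_model slab = { 1, 3 };	/* order-3 slab, __GFP_COMP */

		mod_state(&a, 1L << slab.order);	/* old mod_zone_page_state() call */
		inc_page_state(&b, &slab);		/* new inc_zone_page_state() call */
		assert(a == b);
		return 0;
	}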
Signed-off-by: Christoph Lameter --- mm/slub.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/slub.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/slub.c 2007-06-18 18:42:45.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/slub.c 2007-06-18 19:13:26.000000000 -0700 @@ -1052,7 +1052,6 @@ static inline void kmem_cache_open_debug static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { struct page * page; - int pages = 1 << s->order; if (s->order) flags |= __GFP_COMP; @@ -1071,10 +1070,9 @@ static struct page *allocate_slab(struct if (!page) return NULL; - mod_zone_page_state(page_zone(page), + inc_zone_page_state(page, (s->flags & SLAB_RECLAIM_ACCOUNT) ? - NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE, - pages); + NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE); return page; } @@ -1149,8 +1147,6 @@ static struct page *new_slab(struct kmem static void __free_slab(struct kmem_cache *s, struct page *page) { - int pages = 1 << s->order; - if (unlikely(SlabDebug(page))) { void *p; @@ -1159,10 +1155,9 @@ static void __free_slab(struct kmem_cach check_object(s, page, p, 0); } - mod_zone_page_state(page_zone(page), + dec_zone_page_state(page, (s->flags & SLAB_RECLAIM_ACCOUNT) ? - NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE, - - pages); + NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE); page->mapping = NULL; __free_pages(page, s->order); -- From clameter@sgi.com Wed Jun 20 11:00:01 2007 Message-Id: <20070620180001.771550661@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:54 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [27/37] compound pages: Allow use of get_page_unless_zero with compound pages Content-Disposition: inline; filename=compound_get_one_unless This will be needed by targeted slab reclaim in order to ensure that a compound page allocated by SLUB will not go away under us. It also may be needed if Mel starts to implement defragmentation. The moving of compound pages may require the establishment of a reference before the use of page migration functions. Signed-off-by: Christoph Lameter --- include/linux/mm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: vps/include/linux/mm.h =================================================================== --- vps.orig/include/linux/mm.h 2007-06-12 19:06:28.000000000 -0700 +++ vps/include/linux/mm.h 2007-06-12 19:06:41.000000000 -0700 @@ -292,7 +292,7 @@ static inline int put_page_testzero(stru */ static inline int get_page_unless_zero(struct page *page) { - VM_BUG_ON(PageCompound(page)); + VM_BUG_ON(PageTail(page)); return atomic_inc_not_zero(&page->_count); } -- From clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180001.947197522@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:55 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [28/37] compound pages: Allow freeing of compound pages via pagevec Content-Disposition: inline; filename=compound_free_via_pagevec Allow the freeing of compound pages via pagevec. 
In release_pages() we currently special case for compound pages in order to be sure to always decrement the page count of the head page and not the tail page. However that redirection to the head page is only necessary for tail pages. So use PageTail instead of PageCompound. No change therefore for the handling of tail pages. The head page of a compound page now represents the whole large page. We do the usual processing including checking if it's on the LRU and removing it (not useful right now but later when compound pages are on the LRU this will work). Then we add the compound page to the pagevec. Only head pages will end up on the pagevec, not tail pages. In __pagevec_free() we then check if we are freeing a head page and if so call the destructor for the compound page. Signed-off-by: Christoph Lameter --- mm/page_alloc.c | 13 +++++++++++-- mm/swap.c | 8 +++++++- 2 files changed, 18 insertions(+), 3 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/page_alloc.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/page_alloc.c 2007-06-18 19:13:03.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/page_alloc.c 2007-06-18 19:14:03.000000000 -0700 @@ -1746,8 +1746,17 @@ void __pagevec_free(struct pagevec *pvec { int i = pagevec_count(pvec); - while (--i >= 0) - free_hot_cold_page(pvec->pages[i], pvec->cold); + while (--i >= 0) { + struct page *page = pvec->pages[i]; + + if (PageHead(page)) { + compound_page_dtor *dtor; + + dtor = get_compound_page_dtor(page); + (*dtor)(page); + } else + free_hot_cold_page(page, pvec->cold); + } } fastcall void __free_pages(struct page *page, unsigned int order) Index: linux-2.6.22-rc4-mm2/mm/swap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/swap.c 2007-06-15 17:35:33.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/swap.c 2007-06-18 19:14:03.000000000 -0700 @@ -293,7 +293,13 @@ void release_pages(struct page **pages, for (i = 0; i < nr; i++) { struct page *page = pages[i]; - if (unlikely(PageCompound(page))) { + /* + * If we have a tail page on the LRU then we need to + * decrement the page count of the head page. There + * is no further need to do anything since tail pages + * cannot be on the LRU. + */ + if (unlikely(PageTail(page))) { if (zone) { spin_unlock_irq(&zone->lru_lock); zone = NULL; -- From clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180002.124765357@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:56 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [29/37] Large blocksize support: Fix up reclaim counters Content-Disposition: inline; filename=vps_higher_order_reclaim We now have to reclaim compound pages of arbitrary order. Adjust the counting in vmscan.c to count the number of base pages. Also change the active and inactive accounting to do the same.
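For illustration, a userspace model of the adjusted scan arithmetic (the order values below are hypothetical LRU contents; compound_pages(page) is modeled as 1 << order):

	#include <stdio.h>

	int main(void)
	{
		/* hypothetical LRU contents: compound order of each entry */
		int orders[] = { 0, 0, 2, 0, 4 };
		unsigned long scanned = 0;
		unsigned int i;

		for (i = 0; i < sizeof(orders) / sizeof(orders[0]); i++)
			scanned += 1UL << orders[i];	/* compound_pages(page) */

		/* 1 + 1 + 4 + 1 + 16 = 23 base pages for 5 LRU entries */
		printf("base pages scanned: %lu\n", scanned);
		return 0;
	}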
Signed-off-by: Christoph Lameter --- include/linux/mm_inline.h | 41 +++++++++++++++++++++++++++++++---------- mm/vmscan.c | 22 ++++++++++++---------- 2 files changed, 43 insertions(+), 20 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/vmscan.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/vmscan.c 2007-06-19 23:27:02.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/vmscan.c 2007-06-19 23:27:29.000000000 -0700 @@ -474,14 +474,14 @@ static unsigned long shrink_page_list(st VM_BUG_ON(PageActive(page)); - sc->nr_scanned++; + sc->nr_scanned += compound_pages(page); if (!sc->may_swap && page_mapped(page)) goto keep_locked; /* Double the slab pressure for mapped and swapcache pages */ if (page_mapped(page) || PageSwapCache(page)) - sc->nr_scanned++; + sc->nr_scanned += compound_pages(page); if (PageWriteback(page)) goto keep_locked; @@ -585,7 +585,7 @@ static unsigned long shrink_page_list(st free_it: unlock_page(page); - nr_reclaimed++; + nr_reclaimed += compound_pages(page); if (!pagevec_add(&freed_pvec, page)) __pagevec_release_nonlru(&freed_pvec); continue; @@ -677,22 +677,23 @@ static unsigned long isolate_lru_pages(u unsigned long nr_taken = 0; unsigned long scan; - for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) { + for (scan = 0; scan < nr_to_scan && !list_empty(src); ) { struct page *page; unsigned long pfn; unsigned long end_pfn; unsigned long page_pfn; + int pages; int zone_id; page = lru_to_page(src); prefetchw_prev_lru_page(page, src, flags); - + pages = compound_pages(page); VM_BUG_ON(!PageLRU(page)); switch (__isolate_lru_page(page, mode)) { case 0: list_move(&page->lru, dst); - nr_taken++; + nr_taken += pages; break; case -EBUSY: @@ -738,8 +739,8 @@ static unsigned long isolate_lru_pages(u switch (__isolate_lru_page(cursor_page, mode)) { case 0: list_move(&cursor_page->lru, dst); - nr_taken++; - scan++; + nr_taken += compound_pages(cursor_page); + scan+= compound_pages(cursor_page); break; case -EBUSY: @@ -749,6 +750,7 @@ static unsigned long isolate_lru_pages(u break; } } + scan += pages; } *scanned = scan; @@ -985,7 +987,7 @@ force_reclaim_mapped: ClearPageActive(page); list_move(&page->lru, &zone->inactive_list); - pgmoved++; + pgmoved += compound_pages(page); if (!pagevec_add(&pvec, page)) { __mod_zone_page_state(zone, NR_INACTIVE, pgmoved); spin_unlock_irq(&zone->lru_lock); @@ -1013,7 +1015,7 @@ force_reclaim_mapped: SetPageLRU(page); VM_BUG_ON(!PageActive(page)); list_move(&page->lru, &zone->active_list); - pgmoved++; + pgmoved += compound_pages(page); if (!pagevec_add(&pvec, page)) { __mod_zone_page_state(zone, NR_ACTIVE, pgmoved); pgmoved = 0; Index: linux-2.6.22-rc4-mm2/include/linux/mm_inline.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/mm_inline.h 2007-06-19 23:27:02.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/mm_inline.h 2007-06-20 00:22:16.000000000 -0700 @@ -2,46 +2,67 @@ static inline void add_page_to_active_list(struct zone *zone, struct page *page) { list_add(&page->lru, &zone->active_list); - __inc_zone_state(zone, NR_ACTIVE); + if (!PageHead(page)) + __inc_zone_state(zone, NR_ACTIVE); + else + __inc_zone_page_state(page, NR_ACTIVE); } static inline void add_page_to_inactive_list(struct zone *zone, struct page *page) { list_add(&page->lru, &zone->inactive_list); - __inc_zone_state(zone, NR_INACTIVE); + if (!PageHead(page)) + __inc_zone_state(zone, NR_INACTIVE); + else + __inc_zone_page_state(page, NR_INACTIVE); } static 
inline void add_page_to_inactive_list_tail(struct zone *zone, struct page *page) { list_add_tail(&page->lru, &zone->inactive_list); - __inc_zone_state(zone, NR_INACTIVE); + if (!PageHead(page)) + __inc_zone_state(zone, NR_INACTIVE); + else + __inc_zone_page_state(page, NR_INACTIVE); } static inline void del_page_from_active_list(struct zone *zone, struct page *page) { list_del(&page->lru); - __dec_zone_state(zone, NR_ACTIVE); + if (!PageHead(page)) + __dec_zone_state(zone, NR_ACTIVE); + else + __dec_zone_page_state(page, NR_ACTIVE); } static inline void del_page_from_inactive_list(struct zone *zone, struct page *page) { list_del(&page->lru); - __dec_zone_state(zone, NR_INACTIVE); + if (!PageHead(page)) + __dec_zone_state(zone, NR_INACTIVE); + else + __dec_zone_page_state(page, NR_INACTIVE); } static inline void del_page_from_lru(struct zone *zone, struct page *page) { + enum zone_stat_item counter = NR_ACTIVE; + list_del(&page->lru); - if (PageActive(page)) { + if (PageActive(page)) __ClearPageActive(page); - __dec_zone_state(zone, NR_ACTIVE); - } else { - __dec_zone_state(zone, NR_INACTIVE); - } + else + counter = NR_INACTIVE; + + if (!PageHead(page)) + __dec_zone_state(zone, counter); + else + __dec_zone_page_state(page, counter); } + -- From clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180002.288571934@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:57 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [30/37] Add VM_BUG_ONs to check for correct page order Content-Disposition: inline; filename=vps_safety_checks Before we start allowing different page orders we better get checkpoints in at various places in the VM. Checkpoints will help debugging whenever a wrong order page shows up in a mapping. This will be helpful for converting new filesystems to utilize larger pages. 
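As a standalone illustration of the invariant these checks enforce (the two orders are modeled as plain ints here; in the kernel they come from mapping_order(mapping) and compound_order(page)):

	#include <assert.h>

	static void check_page(int mapping_order, int page_order)
	{
		/* a wrong-order page in a mapping is a bug, not a
		 * condition to handle */
		assert(mapping_order == page_order);
	}

	int main(void)
	{
		check_page(2, 2);	/* order-2 page in an order-2 mapping: ok */
		/* check_page(2, 0) would trip, like the VM_BUG_ONs below */
		return 0;
	}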
Signed-off-by: Christoph Lameter --- fs/buffer.c | 1 + mm/filemap.c | 18 +++++++++++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/filemap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/filemap.c 2007-06-18 23:09:36.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/filemap.c 2007-06-19 19:20:29.000000000 -0700 @@ -128,6 +128,7 @@ void remove_from_page_cache(struct page struct address_space *mapping = page->mapping; BUG_ON(!PageLocked(page)); + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); write_lock_irq(&mapping->tree_lock); __remove_from_page_cache(page); @@ -269,6 +270,7 @@ int wait_on_page_writeback_range(struct if (page->index > end) continue; + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); wait_on_page_writeback(page); if (PageError(page)) ret = -EIO; @@ -441,6 +443,7 @@ int add_to_page_cache(struct page *page, { int error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM); + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); if (error == 0) { write_lock_irq(&mapping->tree_lock); error = radix_tree_insert(&mapping->page_tree, offset, page); @@ -600,8 +603,10 @@ struct page * find_get_page(struct addre read_lock_irq(&mapping->tree_lock); page = radix_tree_lookup(&mapping->page_tree, offset); - if (page) + if (page) { + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); page_cache_get(page); + } read_unlock_irq(&mapping->tree_lock); return page; } @@ -626,6 +631,7 @@ struct page *find_lock_page(struct addre repeat: page = radix_tree_lookup(&mapping->page_tree, offset); if (page) { + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); page_cache_get(page); if (TestSetPageLocked(page)) { read_unlock_irq(&mapping->tree_lock); @@ -711,8 +717,10 @@ unsigned find_get_pages(struct address_s read_lock_irq(&mapping->tree_lock); ret = radix_tree_gang_lookup(&mapping->page_tree, (void **)pages, start, nr_pages); - for (i = 0; i < ret; i++) + for (i = 0; i < ret; i++) { + VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i])); page_cache_get(pages[i]); + } read_unlock_irq(&mapping->tree_lock); return ret; } @@ -743,6 +751,7 @@ unsigned find_get_pages_contig(struct ad if (pages[i]->mapping == NULL || pages[i]->index != index) break; + VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i])); page_cache_get(pages[i]); index++; } @@ -771,8 +780,10 @@ unsigned find_get_pages_tag(struct addre read_lock_irq(&mapping->tree_lock); ret = radix_tree_gang_lookup_tag(&mapping->page_tree, (void **)pages, *index, nr_pages, tag); - for (i = 0; i < ret; i++) + for (i = 0; i < ret; i++) { + VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i])); page_cache_get(pages[i]); + } if (ret) *index = pages[ret - 1]->index + 1; read_unlock_irq(&mapping->tree_lock); @@ -2610,6 +2621,7 @@ int try_to_release_page(struct page *pag struct address_space * const mapping = page->mapping; BUG_ON(!PageLocked(page)); + VM_BUG_ON(mapping_order(mapping) != compound_order(page)); if (PageWriteback(page)) return 0; Index: linux-2.6.22-rc4-mm2/fs/buffer.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/buffer.c 2007-06-19 19:20:29.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/buffer.c 2007-06-19 19:24:12.000000000 -0700 @@ -901,6 +901,7 @@ struct buffer_head *alloc_page_buffers(s long offset; unsigned page_size = page_cache_size(page->mapping); + BUG_ON(size > page_size); try_again: head = NULL; offset = page_size; -- From 
clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180002.452906618@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:58 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [31/37] Large blocksize support: Core piece Content-Disposition: inline; filename=vps_large_page_size Provide an alternate definition for the page_cache_xxx(mapping, ...) functions that can determine the current page size from the mapping and generate the appropriate shifts, sizes and mask for the page cache operations. Change the basic functions that allocate pages for the page cache to be able to handle higher order allocations. Provide a new function mapping_setup(struct address_space *, gfp_t mask, int order) that allows the setup of a mapping of any compound page order. mapping_set_gfp_mask() is still provided but it sets mappings to order 0. Calls to mapping_set_gfp_mask() must be converted to mapping_setup() in order for the filesystem to be able to use larger pages. For some key block devices and filesystems the conversion is done here. mapping_setup() for higher order is only allowed if the mapping does not use DMA mappings or HIGHMEM since we do not support bouncing at the moment. Thus we currently BUG() on DMA mappings and clear the highmem bit of higher order mappings. Modify the set_blocksize() function so that an arbitrary blocksize can be set. Blocksizes up to PAGE_SIZE << MAX_ORDER can be set. This is typically 8MB on many platforms (order 11). Typically file systems are not only limited by the core VM but also by the structure of their internal data structures. The core VM limitations fall away with this patch. The functionality provided here can do nothing about the internal limitations of filesystems. Known internal limitations: Ext2 64k XFS 64k Reiserfs 8k Ext3 4k Ext4 4k Signed-off-by: Christoph Lameter --- block/Kconfig | 17 ++++++ drivers/block/rd.c | 6 +- fs/block_dev.c | 29 +++++++---- fs/buffer.c | 2 fs/inode.c | 7 +- fs/xfs/linux-2.6/xfs_buf.c | 3 - include/linux/buffer_head.h | 12 ++++ include/linux/fs.h | 5 + include/linux/pagemap.h | 116 +++++++++++++++++++++++++++++++++++++++++--- mm/filemap.c | 17 ++++-- 10 files changed, 186 insertions(+), 28 deletions(-) Index: linux-2.6.22-rc4-mm2/include/linux/pagemap.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/pagemap.h 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/pagemap.h 2007-06-19 23:50:55.000000000 -0700 @@ -39,10 +39,30 @@ static inline gfp_t mapping_gfp_mask(str * This is non-atomic. Only to be used before the mapping is activated. * Probably needs a barrier... */ -static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask) +static inline void mapping_setup(struct address_space *m, + gfp_t mask, int order) { m->flags = (m->flags & ~(__force unsigned long)__GFP_BITS_MASK) | (__force unsigned long)mask; + +#ifdef CONFIG_LARGE_BLOCKSIZE + m->order = order; + m->shift = order + PAGE_SHIFT; + m->offset_mask = (PAGE_SIZE << order) - 1; + if (order) { + /* + * Bouncing is not supported. Requests for DMA + * memory will not work + */ + BUG_ON(m->flags & (__GFP_DMA|__GFP_DMA32)); + /* + * Bouncing not supported.
We cannot use HIGHMEM + */ + m->flags &= ~__GFP_HIGHMEM; + m->flags |= __GFP_COMP; + raise_kswapd_order(order); + } +#endif } /* @@ -62,6 +82,78 @@ static inline void mapping_set_gfp_mask( #define PAGE_CACHE_ALIGN(addr) (((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK) /* + * The next set of functions allow to write code that is capable of dealing + * with multiple page sizes. + */ +#ifdef CONFIG_LARGE_BLOCKSIZE +/* + * Determine page order from the blkbits in the inode structure + */ +static inline int page_cache_blkbits_to_order(int shift) +{ + BUG_ON(shift < 9); + + if (shift < PAGE_SHIFT) + return 0; + + return shift - PAGE_SHIFT; +} + +/* + * Determine page order from a given blocksize + */ +static inline int page_cache_blocksize_to_order(unsigned long size) +{ + return page_cache_blkbits_to_order(ilog2(size)); +} + +static inline int mapping_order(struct address_space *a) +{ + return a->order; +} + +static inline int page_cache_shift(struct address_space *a) +{ + return a->shift; +} + +static inline unsigned int page_cache_size(struct address_space *a) +{ + return a->offset_mask + 1; +} + +static inline loff_t page_cache_mask(struct address_space *a) +{ + return ~a->offset_mask; +} + +static inline unsigned int page_cache_offset(struct address_space *a, + loff_t pos) +{ + return pos & a->offset_mask; +} +#else +/* + * Kernel configured for a fixed PAGE_SIZEd page cache + */ +static inline int page_cache_blkbits_to_order(int shift) +{ + if (shift < 9) + return -EINVAL; + if (shift > PAGE_SHIFT) + return -EINVAL; + return 0; +} + +static inline int page_cache_blocksize_to_order(unsigned long size) +{ + if (size >= 512 && size <= PAGE_SIZE) + return 0; + + return -EINVAL; +} + +/* * Functions that are currently setup for a fixed PAGE_SIZEd. The use of * these will allow a variable page size pagecache in the future. */ @@ -90,6 +182,7 @@ static inline unsigned int page_cache_of { return pos & ~PAGE_MASK; } +#endif static inline pgoff_t page_cache_index(struct address_space *a, loff_t pos) @@ -112,27 +205,38 @@ static inline loff_t page_cache_pos(stru return ((loff_t)index << page_cache_shift(a)) + offset; } +/* + * Legacy function. Only supports order 0 pages. 
+ */ +static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask) +{ + if (mapping_order(m)) + printk(KERN_ERR "mapping_setup(%p, %x, %d)\n", m, mask, mapping_order(m)); + mapping_setup(m, mask, 0); +} + #define page_cache_get(page) get_page(page) #define page_cache_release(page) put_page(page) void release_pages(struct page **pages, int nr, int cold); #ifdef CONFIG_NUMA -extern struct page *__page_cache_alloc(gfp_t gfp); +extern struct page *__page_cache_alloc(gfp_t gfp, int); #else -static inline struct page *__page_cache_alloc(gfp_t gfp) +static inline struct page *__page_cache_alloc(gfp_t gfp, int order) { - return alloc_pages(gfp, 0); + return alloc_pages(gfp, order); } #endif static inline struct page *page_cache_alloc(struct address_space *x) { - return __page_cache_alloc(mapping_gfp_mask(x)); + return __page_cache_alloc(mapping_gfp_mask(x), mapping_order(x)); } static inline struct page *page_cache_alloc_cold(struct address_space *x) { - return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD); + return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD, + mapping_order(x)); } typedef int filler_t(void *, struct page *); Index: linux-2.6.22-rc4-mm2/include/linux/fs.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/fs.h 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/fs.h 2007-06-19 23:33:45.000000000 -0700 @@ -519,6 +519,11 @@ struct address_space { spinlock_t i_mmap_lock; /* protect tree, count, list */ unsigned int truncate_count; /* Cover race condition with truncate */ unsigned long nrpages; /* number of total pages */ +#ifdef CONFIG_LARGE_BLOCKSIZE + loff_t offset_mask; /* Mask to get to offset bits */ + unsigned int order; /* Page order of the pages in here */ + unsigned int shift; /* Shift of index */ +#endif pgoff_t writeback_index;/* writeback starts here */ const struct address_space_operations *a_ops; /* methods */ unsigned long flags; /* error bits/gfp mask */ Index: linux-2.6.22-rc4-mm2/mm/filemap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/filemap.c 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/filemap.c 2007-06-20 00:45:36.000000000 -0700 @@ -472,13 +472,13 @@ int add_to_page_cache_lru(struct page *p } #ifdef CONFIG_NUMA -struct page *__page_cache_alloc(gfp_t gfp) +struct page *__page_cache_alloc(gfp_t gfp, int order) { if (cpuset_do_page_mem_spread()) { int n = cpuset_mem_spread_node(); - return alloc_pages_node(n, gfp, 0); + return alloc_pages_node(n, gfp, order); } - return alloc_pages(gfp, 0); + return alloc_pages(gfp, order); } EXPORT_SYMBOL(__page_cache_alloc); #endif @@ -677,7 +677,7 @@ struct page *find_or_create_page(struct repeat: page = find_lock_page(mapping, index); if (!page) { - page = __page_cache_alloc(gfp_mask); + page = __page_cache_alloc(gfp_mask, mapping_order(mapping)); if (!page) return NULL; err = add_to_page_cache_lru(page, mapping, index, gfp_mask); @@ -815,7 +815,8 @@ grab_cache_page_nowait(struct address_sp page_cache_release(page); return NULL; } - page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS); + page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS, + mapping_order(mapping)); if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) { page_cache_release(page); page = NULL; @@ -1536,6 +1537,12 @@ int generic_file_mmap(struct file * file { struct address_space *mapping = file->f_mapping; + /* + * Forbid mmap access to 
higher order mappings. + */ + if (mapping_order(mapping)) + return -ENOSYS; + if (!mapping->a_ops->readpage) return -ENOEXEC; file_accessed(file); Index: linux-2.6.22-rc4-mm2/block/Kconfig =================================================================== --- linux-2.6.22-rc4-mm2.orig/block/Kconfig 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/block/Kconfig 2007-06-19 23:33:45.000000000 -0700 @@ -49,6 +49,23 @@ config LSF If unsure, say Y. +# +# The functions to switch on larger pages in a filesystem will return an error +# if the gfp flags for a mapping require only DMA pages. Highmem will always +# be switched off for higher order mappings. +# +config LARGE_BLOCKSIZE + bool "Support blocksizes larger than page size" + default n + depends on EXPERIMENTAL + help + Allows the page cache to support higher orders of pages. Higher + order page cache pages may be useful to support special devices + like CDs or DVDs and flash. Also to increase I/O performance. + WARNING: This functionality may have significant memory + requirements. It is not advisable to enable this in configurations + where ZONE_NORMAL is smaller than 1 Gigabyte. + endif # BLOCK source block/Kconfig.iosched Index: linux-2.6.22-rc4-mm2/fs/block_dev.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/block_dev.c 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/block_dev.c 2007-06-19 23:50:01.000000000 -0700 @@ -65,36 +65,46 @@ static void kill_bdev(struct block_devic return; invalidate_bh_lrus(); truncate_inode_pages(bdev->bd_inode->i_mapping, 0); -} +} int set_blocksize(struct block_device *bdev, int size) { - /* Size must be a power of two, and between 512 and PAGE_SIZE */ - if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size)) + int order; + + if (size > (PAGE_SIZE << MAX_ORDER) || size < 512 || + !is_power_of_2(size)) return -EINVAL; /* Size cannot be smaller than the size supported by the device */ if (size < bdev_hardsect_size(bdev)) return -EINVAL; + order = page_cache_blocksize_to_order(size); + /* Don't change the size if it is same as current */ if (bdev->bd_block_size != size) { + int bits = blksize_bits(size); + struct address_space *mapping = + bdev->bd_inode->i_mapping; + sync_blockdev(bdev); - bdev->bd_block_size = size; - bdev->bd_inode->i_blkbits = blksize_bits(size); kill_bdev(bdev); + bdev->bd_block_size = size; + bdev->bd_inode->i_blkbits = bits; + mapping_setup(mapping, GFP_NOFS, order); } return 0; } - EXPORT_SYMBOL(set_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { if (set_blocksize(sb->s_bdev, size)) return 0; - /* If we get here, we know size is power of two * and it's value is between 512 and PAGE_SIZE */ + /* + * If we get here, we know size is power of two + * and its value is valid for the page cache + */ sb->s_blocksize = size; sb->s_blocksize_bits = blksize_bits(size); return sb->s_blocksize; @@ -588,7 +598,8 @@ struct block_device *bdget(dev_t dev) inode->i_rdev = dev; inode->i_bdev = bdev; inode->i_data.a_ops = &def_blk_aops; - mapping_set_gfp_mask(&inode->i_data, GFP_USER); + mapping_setup(&inode->i_data, GFP_USER, + page_cache_blkbits_to_order(inode->i_blkbits)); inode->i_data.backing_dev_info = &default_backing_dev_info; spin_lock(&bdev_lock); list_add(&bdev->bd_list, &all_bdevs); Index: linux-2.6.22-rc4-mm2/fs/buffer.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/buffer.c 2007-06-19 23:33:44.000000000 -0700 +++
linux-2.6.22-rc4-mm2/fs/buffer.c 2007-06-20 00:16:41.000000000 -0700 @@ -1098,7 +1098,7 @@ __getblk_slow(struct block_device *bdev, { /* Size must be multiple of hard sectorsize */ if (unlikely(size & (bdev_hardsect_size(bdev)-1) || - (size < 512 || size > PAGE_SIZE))) { + size < 512 || size > (PAGE_SIZE << MAX_ORDER))) { printk(KERN_ERR "getblk(): invalid block size %d requested\n", size); printk(KERN_ERR "hardsect size: %d\n", Index: linux-2.6.22-rc4-mm2/fs/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/inode.c 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/inode.c 2007-06-19 23:53:56.000000000 -0700 @@ -145,7 +145,8 @@ static struct inode *alloc_inode(struct mapping->a_ops = &empty_aops; mapping->host = inode; mapping->flags = 0; - mapping_set_gfp_mask(mapping, GFP_HIGHUSER_PAGECACHE); + mapping_setup(mapping, GFP_HIGHUSER_PAGECACHE, + page_cache_blkbits_to_order(inode->i_blkbits)); mapping->assoc_mapping = NULL; mapping->backing_dev_info = &default_backing_dev_info; @@ -243,7 +244,7 @@ void clear_inode(struct inode *inode) { might_sleep(); invalidate_inode_buffers(inode); - + BUG_ON(inode->i_data.nrpages); BUG_ON(!(inode->i_state & I_FREEING)); BUG_ON(inode->i_state & I_CLEAR); @@ -528,7 +529,7 @@ repeat: * for allocations related to inode->i_mapping is GFP_HIGHUSER_PAGECACHE. * If HIGHMEM pages are unsuitable or it is known that pages allocated * for the page cache are not reclaimable or migratable, - * mapping_set_gfp_mask() must be called with suitable flags on the + * mapping_setup() must be called with suitable flags and bits on the * newly created inode's mapping * */ Index: linux-2.6.22-rc4-mm2/drivers/block/rd.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/drivers/block/rd.c 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/drivers/block/rd.c 2007-06-20 00:35:55.000000000 -0700 @@ -121,7 +121,8 @@ static void make_page_uptodate(struct pa } } while ((bh = bh->b_this_page) != head); } else { - memset(page_address(page), 0, page_cache_size(page_mapping(page))); + memset(page_address(page), 0, + page_cache_size(page_mapping(page))); } flush_dcache_page(page); SetPageUptodate(page); @@ -380,7 +381,8 @@ static int rd_open(struct inode *inode, gfp_mask = mapping_gfp_mask(mapping); gfp_mask &= ~(__GFP_FS|__GFP_IO); gfp_mask |= __GFP_HIGH; - mapping_set_gfp_mask(mapping, gfp_mask); + mapping_setup(mapping, gfp_mask, + page_cache_blkbits_to_order(inode->i_blkbits)); } return 0; Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c 2007-06-19 23:33:45.000000000 -0700 @@ -1558,7 +1558,8 @@ xfs_mapping_buftarg( mapping = &inode->i_data; mapping->a_ops = &mapping_aops; mapping->backing_dev_info = bdi; - mapping_set_gfp_mask(mapping, GFP_NOFS); + mapping_setup(mapping, GFP_NOFS, + page_cache_blkbits_to_order(inode->i_blkbits)); btp->bt_mapping = mapping; return 0; } Index: linux-2.6.22-rc4-mm2/include/linux/buffer_head.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/buffer_head.h 2007-06-19 23:33:44.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/buffer_head.h 2007-06-19 23:33:45.000000000 -0700 @@ -129,7 +129,17 @@ BUFFER_FNS(Ordered, ordered) BUFFER_FNS(Eopnotsupp, 
eopnotsupp) BUFFER_FNS(Unwritten, unwritten) -#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK) +static inline unsigned long bh_offset(struct buffer_head *bh) +{ + /* + * No mapping available. Use page struct to obtain + * order. + */ + unsigned long mask = compound_size(bh->b_page) - 1; + + return (unsigned long)bh->b_data & mask; +} + #define touch_buffer(bh) mark_page_accessed(bh->b_page) /* If we *know* page->private refers to buffer_heads */ -- From clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180002.629247461@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 10:59:59 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky , Fengguang Wu Subject: [32/37] Readahead changes to support large blocksize. Content-Disposition: inline; filename=vps-readahead.patch Fix up readahead for large I/O operations. Only calculate the readahead until the 2M boundary, then fall back to one page. Signed-off-by: Fengguang Wu Signed-off-by: Christoph Lameter =================================================================== --- include/linux/mm.h | 2 +- mm/fadvise.c | 4 ++-- mm/filemap.c | 5 ++--- mm/madvise.c | 2 +- mm/readahead.c | 22 ++++++++++++++-------- 5 files changed, 20 insertions(+), 15 deletions(-) Index: linux-2.6.22-rc4-mm2/mm/fadvise.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/fadvise.c 2007-06-18 23:09:37.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/fadvise.c 2007-06-19 20:01:53.000000000 -0700 @@ -86,10 +86,10 @@ asmlinkage long sys_fadvise64_64(int fd, nrpages = end_index - start_index + 1; if (!nrpages) nrpages = ~0UL; - + ret = force_page_cache_readahead(mapping, file, start_index, - max_sane_readahead(nrpages)); + nrpages); if (ret > 0) ret = 0; break; Index: linux-2.6.22-rc4-mm2/mm/filemap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/filemap.c 2007-06-19 19:28:15.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/filemap.c 2007-06-19 20:01:53.000000000 -0700 @@ -1288,8 +1288,7 @@ do_readahead(struct address_space *mappi if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage) return -EINVAL; - force_page_cache_readahead(mapping, filp, index, - max_sane_readahead(nr)); + force_page_cache_readahead(mapping, filp, index, nr); return 0; } @@ -1427,7 +1426,7 @@ retry_find: count_vm_event(PGMAJFAULT); } did_readaround = 1; - ra_pages = max_sane_readahead(file->f_ra.ra_pages); + ra_pages = file->f_ra.ra_pages; if (ra_pages) { pgoff_t start = 0; Index: linux-2.6.22-rc4-mm2/mm/madvise.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/madvise.c 2007-06-04 17:57:25.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/madvise.c 2007-06-19 20:01:53.000000000 -0700 @@ -124,7 +124,7 @@ static long madvise_willneed(struct vm_a end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff; force_page_cache_readahead(file->f_mapping, - file, start, max_sane_readahead(end - start)); + file, start, end - start); return 0; } Index: linux-2.6.22-rc4-mm2/mm/readahead.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/readahead.c 2007-06-15 17:35:33.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/readahead.c 2007-06-19 20:01:53.000000000 -0700 @@ -44,7 +44,8 @@
EXPORT_SYMBOL_GPL(default_backing_dev_in void file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping) { - ra->ra_pages = mapping->backing_dev_info->ra_pages; + ra->ra_pages = DIV_ROUND_UP(mapping->backing_dev_info->ra_pages, + 1 << mapping_order(mapping)); ra->prev_index = -1; } EXPORT_SYMBOL_GPL(file_ra_state_init); @@ -82,7 +83,7 @@ int read_cache_pages(struct address_spac put_pages_list(pages); break; } - task_io_account_read(PAGE_CACHE_SIZE); + task_io_account_read(page_cache_size(mapping)); } return ret; } @@ -143,7 +144,7 @@ __do_page_cache_readahead(struct address if (isize == 0) goto out; - end_index = ((isize - 1) >> PAGE_CACHE_SHIFT); + end_index = page_cache_index(mapping, isize - 1); /* * Preallocate as many pages as we will need. @@ -196,10 +197,12 @@ int force_page_cache_readahead(struct ad if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages)) return -EINVAL; + nr_to_read = max_sane_readahead(nr_to_read, mapping_order(mapping)); while (nr_to_read) { int err; - unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_CACHE_SIZE; + unsigned long this_chunk = DIV_ROUND_UP(2 * 1024 * 1024, + page_cache_size(mapping)); if (this_chunk > nr_to_read) this_chunk = nr_to_read; @@ -229,17 +232,20 @@ int do_page_cache_readahead(struct addre if (bdi_read_congested(mapping->backing_dev_info)) return -1; + nr_to_read = max_sane_readahead(nr_to_read, mapping_order(mapping)); return __do_page_cache_readahead(mapping, filp, offset, nr_to_read, 0); } /* - * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a + * Given a desired number of page order readahead pages, return a * sensible upper limit. */ -unsigned long max_sane_readahead(unsigned long nr) +unsigned long max_sane_readahead(unsigned long nr, int order) { - return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE) - + node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2); + unsigned long base_pages = node_page_state(numa_node_id(), NR_INACTIVE) + + node_page_state(numa_node_id(), NR_FREE_PAGES); + + return min(nr, (base_pages / 2) >> order); } /* Index: linux-2.6.22-rc4-mm2/include/linux/mm.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/mm.h 2007-06-18 23:09:37.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/mm.h 2007-06-19 20:01:53.000000000 -0700 @@ -1167,7 +1167,7 @@ unsigned long page_cache_readahead_ondem struct page *page, pgoff_t offset, unsigned long size); -unsigned long max_sane_readahead(unsigned long nr); +unsigned long max_sane_readahead(unsigned long nr, int order); /* Do stack extension */ extern int expand_stack(struct vm_area_struct *vma, unsigned long address); -- From clameter@sgi.com Wed Jun 20 11:00:02 2007 Message-Id: <20070620180002.795638591@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 11:00:00 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [33/37] Large blocksize: Compound page zeroing and flushing Content-Disposition: inline; filename=vps_flush_compound_page We may now have to zero and flush higher order pages. Implement clear_mapping_page and flush_mapping_page to do that job. Replace the flushing and clearing at some key locations for the pagecache.
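A minimal userspace sketch of the per-base-page loop this uses (MODEL_PAGE_SIZE and clear_compound() are hypothetical stand-ins for PAGE_SIZE and the clear_mapping_page() added below):

	#include <stdlib.h>
	#include <string.h>

	#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

	/* Zero an order-N "compound page" one base page at a time. */
	static void clear_compound(char *head, int order)
	{
		int i, nr_pages = 1 << order;	/* compound_pages() */

		for (i = 0; i < nr_pages; i++)
			memset(head + i * MODEL_PAGE_SIZE, 0, MODEL_PAGE_SIZE);
	}

	int main(void)
	{
		int order = 2;	/* 4 base pages */
		char *buf = malloc((size_t)MODEL_PAGE_SIZE << order);

		if (buf) {
			clear_compound(buf, order);
			free(buf);
		}
		return 0;
	}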
Signed-off-by: Christoph Lameter --- fs/libfs.c | 4 ++-- include/linux/highmem.h | 31 +++++++++++++++++++++++++++++-- include/linux/pagemap.h | 1 + mm/filemap.c | 8 ++++---- mm/filemap_xip.c | 4 ++-- 5 files changed, 38 insertions(+), 10 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/libfs.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/libfs.c 2007-06-19 20:10:05.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/libfs.c 2007-06-19 20:10:45.000000000 -0700 @@ -330,8 +330,8 @@ int simple_rename(struct inode *old_dir, int simple_readpage(struct file *file, struct page *page) { - clear_highpage(page); - flush_dcache_page(page); + clear_mapping_page(page); + flush_mapping_page(page); SetPageUptodate(page); unlock_page(page); return 0; Index: linux-2.6.22-rc4-mm2/include/linux/highmem.h =================================================================== --- linux-2.6.22-rc4-mm2.orig/include/linux/highmem.h 2007-06-19 20:06:06.000000000 -0700 +++ linux-2.6.22-rc4-mm2/include/linux/highmem.h 2007-06-19 20:30:06.000000000 -0700 @@ -124,14 +124,41 @@ static inline void clear_highpage(struct kunmap_atomic(kaddr, KM_USER0); } +/* + * Clear a higher order page + */ +static inline void clear_mapping_page(struct page *page) +{ + int nr_pages = compound_pages(page); + int i; + + for (i = 0; i < nr_pages; i++) + clear_highpage(page + i); +} + +/* + * Primitive support for flushing higher order pages. + * + * A bit stupid: On many platforms flushing the first page + * will flush any TLB starting there + */ +static inline void flush_mapping_page(struct page *page) +{ + int nr_pages = compound_pages(page); + int i; + + for (i = 0; i < nr_pages; i++) + flush_dcache_page(page + i); +} + static inline void zero_user_segments(struct page *page, unsigned start1, unsigned end1, unsigned start2, unsigned end2) { void *kaddr = kmap_atomic(page, KM_USER0); - BUG_ON(end1 > PAGE_SIZE || - end2 > PAGE_SIZE); + BUG_ON(end1 > compound_size(page) || + end2 > compound_size(page)); if (end1 > start1) memset(kaddr + start1, 0, end1 - start1); Index: linux-2.6.22-rc4-mm2/mm/filemap.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/filemap.c 2007-06-19 20:10:52.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/filemap.c 2007-06-19 20:11:44.000000000 -0700 @@ -946,7 +946,7 @@ page_ok: * before reading the page on the kernel side. 
*/ if (mapping_writably_mapped(mapping)) - flush_dcache_page(page); + flush_mapping_page(page); /* * When a sequential read accesses a page several times, @@ -2004,7 +2004,7 @@ int pagecache_write_end(struct file *fil unsigned offset = page_cache_offset(mapping, pos); struct inode *inode = mapping->host; - flush_dcache_page(page); + flush_mapping_page(page); ret = aops->commit_write(file, page, offset, offset+len); unlock_page(page); page_cache_release(page); @@ -2216,7 +2216,7 @@ static ssize_t generic_perform_write_2co kunmap_atomic(src, KM_USER0); copied = bytes; } - flush_dcache_page(page); + flush_mapping_page(page); status = a_ops->commit_write(file, page, offset, offset+bytes); if (unlikely(status < 0 || status == AOP_TRUNCATED_PAGE)) @@ -2314,7 +2314,7 @@ again: pagefault_disable(); copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes); pagefault_enable(); - flush_dcache_page(page); + flush_mapping_page(page); status = a_ops->write_end(file, mapping, pos, bytes, copied, page, fsdata); Index: linux-2.6.22-rc4-mm2/mm/filemap_xip.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/mm/filemap_xip.c 2007-06-19 20:12:10.000000000 -0700 +++ linux-2.6.22-rc4-mm2/mm/filemap_xip.c 2007-06-19 20:12:46.000000000 -0700 @@ -103,7 +103,7 @@ do_xip_mapping_read(struct address_space * before reading the page on the kernel side. */ if (mapping_writably_mapped(mapping)) - flush_dcache_page(page); + flush_mapping_page(page); /* * Ok, we have the page, so now we can copy it to user space... @@ -347,7 +347,7 @@ __xip_file_write(struct file *filp, cons copied = bytes - __copy_from_user_inatomic_nocache(kaddr, buf, bytes); kunmap_atomic(kaddr, KM_USER0); - flush_dcache_page(page); + flush_mapping_page(page); if (likely(copied > 0)) { status = copied; -- From clameter@sgi.com Wed Jun 20 11:00:03 2007 Message-Id: <20070620180003.002719480@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 11:00:01 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [34/37] Large blocksize support in ramfs Content-Disposition: inline; filename=vps_filesystem_ramfs The simplest file system to use for large blocksize support is ramfs. Add a mount parameter that specifies the page order of the pages that ramfs should use. Note that ramfs does not use the lower layers (buffer I/O etc), so this is useful for initial testing of changes to large buffer size support if one just wants to exercise the higher layers. After applying this patch you can, for example, try this: mount -tramfs -o10 none /media Mounts a ramfs filesystem with order 10 pages (4 MB) cp linux-2.6.21-rc7.tar.gz /media Populates the ramfs. Note that we allocate 14 pages of 4M each instead of 13508 order-0 pages (the arithmetic is sketched below). umount /media Gets rid of the large pages again
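The page counts quoted above follow directly from the mount order. A back-of-the-envelope check (plain userspace C for illustration only, not kernel code; the 13508 figure is the one quoted in the message):

#include <stdio.h>

int main(void)
{
	unsigned long base_pages = 13508;	/* order-0 (4k) pages needed */
	int order = 10;				/* from mount -o10 */
	/* Round up, like the kernel's DIV_ROUND_UP() */
	unsigned long compound_pages =
		(base_pages + (1UL << order) - 1) >> order;

	/* Prints: 14 pages of 4194304 bytes */
	printf("%lu pages of %lu bytes\n", compound_pages, 4096UL << order);
	return 0;
}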
Signed-off-by: Christoph Lameter --- fs/ramfs/inode.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/ramfs/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ramfs/inode.c 2007-06-19 19:34:10.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ramfs/inode.c 2007-06-19 20:01:04.000000000 -0700 @@ -60,7 +60,8 @@ struct inode *ramfs_get_inode(struct sup inode->i_blocks = 0; inode->i_mapping->a_ops = &ramfs_aops; inode->i_mapping->backing_dev_info = &ramfs_backing_dev_info; - mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER); + mapping_setup(inode->i_mapping, GFP_HIGHUSER, + sb->s_blocksize_bits - PAGE_SHIFT); inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; switch (mode & S_IFMT) { default: @@ -164,10 +165,15 @@ static int ramfs_fill_super(struct super { struct inode * inode; struct dentry * root; + int order = 0; + char *options = data; + + if (options && *options) + order = simple_strtoul(options, NULL, 10); sb->s_maxbytes = MAX_LFS_FILESIZE; - sb->s_blocksize = PAGE_CACHE_SIZE; - sb->s_blocksize_bits = PAGE_CACHE_SHIFT; + sb->s_blocksize = PAGE_CACHE_SIZE << order; + sb->s_blocksize_bits = order + PAGE_CACHE_SHIFT; sb->s_magic = RAMFS_MAGIC; sb->s_op = &ramfs_ops; sb->s_time_gran = 1; -- From clameter@sgi.com Wed Jun 20 11:00:03 2007 Message-Id: <20070620180003.133536762@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 11:00:02 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [35/37] Large blocksize support in XFS Content-Disposition: inline; filename=vps_filesystem_xfs From: David Chinner The only thing that needs to change to enable Large Block I/O is to remove the check for a too-large blocksize ;-) Signed-off-by: Dave Chinner Signed-off-by: Christoph Lameter --- fs/xfs/xfs_mount.c | 13 ------------- 1 file changed, 13 deletions(-) Index: linux-2.6.22-rc4-mm2/fs/xfs/xfs_mount.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/xfs/xfs_mount.c 2007-06-18 19:05:21.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/xfs/xfs_mount.c 2007-06-19 19:45:33.000000000 -0700 @@ -326,19 +326,6 @@ xfs_mount_validate_sb( return XFS_ERROR(ENOSYS); } - /* - * Until this is fixed only page-sized or smaller data blocks work. - */ - if (unlikely(sbp->sb_blocksize > PAGE_SIZE)) { - xfs_fs_mount_cmn_err(flags, - "file system with blocksize %d bytes", - sbp->sb_blocksize); - xfs_fs_mount_cmn_err(flags, - "only pagesize (%ld) or less will currently work.", - PAGE_SIZE); - return XFS_ERROR(ENOSYS); - } - return 0; } -- From clameter@sgi.com Wed Jun 20 11:00:03 2007 Message-Id: <20070620180003.299201331@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 11:00:03 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [36/37] Large blocksize support for ext2 Content-Disposition: inline; filename=vps_filesystem_ext2 This adds support for a block size of up to 64k on any platform. It enables mounting of filesystems that have a larger blocksize than the page size. For example, the following is possible on x86_64 and i386, which have only a 4k page size: mke2fs -b 16384 /dev/hdd2 mount /dev/hdd2 /media ls -l /media .... You can then work with a volume that uses a 16k page cache size on a platform with a 4k page size.
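To see what a 16k blocksize means for the page cache arithmetic on a 4k platform, here is a sketch in plain userspace C of the shift/mask logic that the page_cache_index()/page_cache_offset() helpers from patch [01/37] perform (the file position is an arbitrary example):

#include <stdio.h>

int main(void)
{
	int order = 2;				/* 16k blocks on 4k base pages */
	int shift = 12 + order;			/* page_cache_shift(mapping) == 14 */
	unsigned long long pos = 100000;	/* arbitrary file position */

	/* Prints: index=6 offset=1696 */
	printf("index=%llu offset=%llu\n",
		pos >> shift, pos & ((1ULL << shift) - 1));
	return 0;
}

One 16k page cache page thus covers what would otherwise be four separate 4k pages.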
There is actually nothing additional to be done after the earlier cleanup of the macros, so this patch just updates the copyright notice. Signed-off-by: Christoph Lameter --- fs/ext2/inode.c | 3 +++ 1 file changed, 3 insertions(+) Index: linux-2.6.22-rc4-mm2/fs/ext2/inode.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/ext2/inode.c 2007-06-19 19:40:56.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/ext2/inode.c 2007-06-19 19:41:56.000000000 -0700 @@ -20,6 +20,9 @@ * (jj@sunsite.ms.mff.cuni.cz) * * Assorted race fixes, rewrite of ext2_get_block() by Al Viro, 2000 + * + * (C) 2007 SGI. + * Large blocksize support by Christoph Lameter */ #include -- From clameter@sgi.com Wed Jun 20 11:00:03 2007 Message-Id: <20070620180003.459316407@sgi.com> References: <20070620175927.667715964@sgi.com> User-Agent: quilt/0.46-1 Date: Wed, 20 Jun 2007 11:00:04 -0700 From: clameter@sgi.com To: linux-filesystems@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Mel Gorman , William Lee Irwin III , David Chinner , Jens Axboe , Badari Pulavarty , Maxim Levitsky Subject: [37/37] Reiserfs: Fix up for mapping_set_gfp_mask Content-Disposition: inline; filename=vps_filesystem_reiserfs mapping_set_gfp_mask only works for order-0 page cache operations, but reiserfs can use 8k pages (order 1). Replace mapping_set_gfp_mask with mapping_setup to make this work properly. Signed-off-by: Christoph Lameter --- fs/reiserfs/xattr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Index: linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c =================================================================== --- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/xattr.c 2007-06-19 23:54:38.000000000 -0700 +++ linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c 2007-06-19 23:56:40.000000000 -0700 @@ -405,9 +405,10 @@ static struct page *reiserfs_get_page(st { struct address_space *mapping = dir->i_mapping; struct page *page; + /* We can deadlock if we try to free dentries, and an unlink/rmdir has just occured - GFP_NOFS avoids this */ - mapping_set_gfp_mask(mapping, GFP_NOFS); + mapping_setup(mapping, GFP_NOFS, page_cache_shift(mapping)); page = read_mapping_page(mapping, n, NULL); if (!IS_ERR(page)) { kmap(page); --
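A closing note on mapping_setup: it is defined earlier in this series rather than in the patches shown here. Judging from the call sites above, it records both the gfp mask and the page size information on the address_space, roughly along these lines (a sketch reconstructed from usage, not the actual definition; set_mapping_order is a hypothetical helper name):

static inline void mapping_setup_sketch(struct address_space *mapping,
					gfp_t mask, int order)
{
	mapping_set_gfp_mask(mapping, mask);	/* allocation flags */
	set_mapping_order(mapping, order);	/* hypothetical: record page order */
}

mapping_set_gfp_mask alone cannot express the order, which is why the order-0-only call in reiserfs_get_page had to be replaced.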