GIT 2b8c0e13026c30bd154dc521ffc235360830c712 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq.git commit Author: Rafał Bilski Date: Wed Feb 14 22:00:37 2007 +0100 [CPUFREQ] Longhaul - Redo Longhaul ver. 2 Start using v2 version of Longhaul when available. It provides voltage scaling and can use ACPI C3 state. That's curious. CPU will not change frequency on ACPI C3 when v1 is in use, but it will when v2 is used. Driver will return max frequency all the time if this isn't true for all processors. There is strange thing with mobile voltage. Looks like only Nehemiah (C3-M) supports it. Earlier processors have different mobile VRM (in docs), but I can't find any which is using it. Looks like all are using VRM 8.5. So fail for non Nehemiah with mobile VRM. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit b6f45a4b071d77777d70e097d429273aeedff717 Author: Rafał Bilski Date: Mon Feb 12 22:19:12 2007 +0100 [CPUFREQ] EPS - Correct 2nd brand test Solution for small, but nasty bug: access beyond end of f_table for C7 brand. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit 348f31ed2bd18391fe5903aa0ad7bfcda6d8ca0b Author: Rafał Bilski Date: Thu Feb 8 18:56:04 2007 +0100 [CPUFREQ] Longhaul - Separate frequency and voltage transition This change should make Longhaul more compatible with both ver. 2 and Powersaver processors. Voltage transitions will be done before or after frequency transition. That depends on direction of change. I don't know how to force conservative governor when voltage scaling is enabled, so there is only a warning for user. Minimal voltage is calculated in different way now because in this way more power is saved at lower multipliers. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit e57501c15f48d6b7a8fe2b023be8f4779484482d Author: Rafał Bilski Date: Thu Feb 8 23:12:02 2007 +0100 [CPUFREQ] Longhaul - Models of Nehemiah Borowed from VIA driver. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit c18a1483f478adbeb4cc7148db22c4a9c10aaee3 Author: Dave Jones Date: Sat Feb 10 20:03:51 2007 -0500 [CPUFREQ] Whitespace fixup Signed-off-by: Dave Jones commit 9addf3b6388459f315adc728d27d34603a00d427 Author: Rafał Bilski Date: Wed Feb 7 22:53:29 2007 +0100 [CPUFREQ] Longhaul - Simplier minmult Simple cleanup in code which is setting minmult. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit f0ec313a89a7377f440c815f82b0370bd67f62c6 Author: Adrian Bunk Date: Mon Feb 5 16:12:45 2007 -0800 [CPUFREQ] CPU_FREQ_TABLE shouldn't be a def_tristate CPU_FREQ_TABLE enables helper code and gets select'ed when it's required. Building it as a module when it's not required doesn't seem to make much sense. Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Dave Jones commit 56463b78cdca8e9ff8cc1759bca0c0777a061d6b Author: Venkatesh Pallipadi Date: Mon Feb 5 16:12:45 2007 -0800 [CPUFREQ] ondemand governor use new cpufreq rwsem locking in work callback Eliminate flush_workqueue in cpufreq_governor(STOP) callpath. Using flush there has a deadlock potential as in http://uwsg.iu.edu/hypermail/linux/kernel/0611.3/1223.html Also, cleanup the locking issues with do_dbs_timer delayed_work callback. As it changes the CPU frequency using __cpufreq_target, it needs to have policy_rwsem in write mode, which also protects it from hot plug. Signed-off-by: Venkatesh Pallipadi Cc: Gautham R Shenoy Signed-off-by: Andrew Morton Signed-off-by: Dave Jones commit 529af7a14f04f92213bac371931a2b2b060c63fa Author: Venkatesh Pallipadi Date: Mon Feb 5 16:12:44 2007 -0800 [CPUFREQ] ondemand governor restructure the work callback Restructure the delayed_work callback in ondemand. This eliminates the need for smp_processor_id in the callback function and also helps in proper locking and avoiding flush_workqueue when stopping the governor (done in subsequent patch). Signed-off-by: Venkatesh Pallipadi Cc: Gautham R Shenoy Signed-off-by: Andrew Morton Signed-off-by: Dave Jones commit 5a01f2e8f3ac134e24144d74bb48a60236f7024d Author: Venkatesh Pallipadi Date: Mon Feb 5 16:12:44 2007 -0800 [CPUFREQ] Rewrite lock in cpufreq to eliminate cpufreq/hotplug related issues Yet another attempt to resolve cpufreq and hotplug locking issues. Patchset has 3 patches: * Rewrite the lock infrastructure of cpufreq using a per cpu rwsem. * Minor restructuring of work callback in ondemand driver. * Use the new cpufreq rwsem infrastructure in ondemand work. This patch: Convert policy->lock to rwsem and move it to per_cpu area. This rwsem will protect against both changing/accessing policy related parameters and CPU hot plug/unplug. [malattia@linux.it: fix oops in kref_put()] Cc: Gautham R Shenoy Signed-off-by: Venkatesh Pallipadi Cc: Gautham R Shenoy Signed-off-by: Mattia Dongili Signed-off-by: Andrew Morton Signed-off-by: Dave Jones commit c120069779e3e35917c15393cf2847fa79811eb6 Author: Dave Jones Date: Mon Feb 5 16:12:43 2007 -0800 [CPUFREQ] Remove hotplug cpu crap The hotplug CPU locking in cpufreq is horrendous. No-one seems to care enough to fix it, so just remove it so that the 99.9% of the real world users of this code can use cpufreq without being bothered by warnings. Signed-off-by: Andrew Morton Signed-off-by: Dave Jones commit 86acd49aa128bd7a1d4362c256c21fbdc2d5b1a0 Author: Rafał Bilski Date: Mon Feb 5 19:57:25 2007 +0100 [CPUFREQ] Enhanced PowerSaver driver This is driver for Enhanced Powersaver which is present in VIA C7 processors. Beta tested by Jorgen (jorgen (at) greven dot dk). Thanks! Based on documentation provided by Dave Jones (Thanks!) and C7 Eden datasheet available from www.via.com.tw. Looks like all these C7 Eden CPU's don't have P-states in BIOS. I know that 2 p-states is low, but Jorgen finds it usefull anyway because board is passive cooled. There are 3 different types of C7 processors (called brands): 0. C7-M - these processors can set any maultiplier between min and max, any voltage between min and max. 1. C7 - only min and max states are supported. Voltage is different for min and max states. 2. Eden - only min and max states are supported. Looks like this brand can only change multiplier. Voltage seems to be the same for min and max frequency. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit 786f46b262cb7a491f4b144e42f076d5a1ef8eef Author: Rafał Bilski Date: Sun Feb 4 18:43:12 2007 +0100 [CPUFREQ] Longhaul - Add VT8235 support I don't know why it is working and how, but it is working. On my Epia transition time is by default set to 100us. I'm changing it to 200us. After that I can change frequency from min (x4.0) to max (x7.5) without lockup. Many times. There is a paranoid check at a beginning of a patch. Probably dead code, but I don't have better ideas for CL10000 case at the moment. Only way to to detect broken chip seems to be looking in log for spurious interrupts. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit 46ef955f5c9de0507859a3f9a92989b7425b73cc Author: Rafał Bilski Date: Sun Feb 4 15:58:46 2007 +0100 [CPUFREQ] Longhaul - Fix guess_fsb function This is bug reported by John-Marc Chandonia: > Detected 1002.292 MHz processor. > longhaul: VIA C3 'Nehemiah B' [C5N] CPU detected. Powersaver supported. > longhaul: Using throttling support. > longhaul: Invalid (reserved) FSB! FSB is correcly guessed for 999.554 MHz CPU. To fix this error: - ROUNDING should be range, not mask - at it's current value it is +7 -8, - more precise calculations inside guess_fsb - 7.5x133MHz is 1000MHz now. Signed-off-by: Rafal Bilski Signed-off-by: Dave Jones commit 0d44b2ba287ea98547097ad2b8b0cc5f0589b8d2 Author: Rafał Bilski Date: Wed Jan 31 23:50:49 2007 +0100 [CPUFREQ] Longhaul - Remove duplicate tables Now there is no need to depend on -1 in Nehemiah tables. After previous change code is eliminating multipliers lower then 5.0 by minmult for Nehemiah A and B. Signed-off-by: Rafał Bilski Signed-off-by: Dave Jones commit 980342a7eb6b4ebcc5feffe6287ad5cda5a68a4b Author: Rafał Bilski Date: Wed Jan 31 23:42:47 2007 +0100 [CPUFREQ] Longhaul - Introduce Nehemiah C Looks like some time ago I introduced a bug to Longhaul. I had report that 9x133Mhz CPU is seen as 5x133MHz. So I changed multipliers table. That was a mistake. According to documentation table was correct. So only way to avoid 5 or 9 dilema is not use MaxMHzBR for PowerSaver 1.0. One code that works on all processors. To do it I need also separate flag for Nehemiah C (min = x4.0) and Nehemiah (min = x5.0). Signed-off-by: Rafał Bilski Signed-off-by: Dave Jones commit 58389a86df48ff927846df9537ea34d9961b5c44 Author: Joachim Deguara Date: Tue Jan 30 16:53:54 2007 +0100 [CPUFREQ] fix cpuinfo_cur_freq for CPU_HW_PSTATE This fixes the cpuinfo_cur_freq value by using the correct find_khz_freq_from_fiddid() when the CPU uses hardware p-states. Signed-off-by: Joachim Deguara Acked-by: Mark Langsdorf Signed-off-by: Dave Jones commit 14796722839ee50ed2a2c7a6a135e7d0888aaada Author: Rafał Bilski Date: Fri Jan 19 22:28:22 2007 +0100 [CPUFREQ] Longhaul - Remove "ignore_latency" option There is no need to have this option in Longhaul anymore. It was for laptop with CLE266 chipset in times, when only ACPI C3 was used to switch frequency. Now we have native support not only for CLE266, but CN400 too. Would be good to have support for PN266, but I can't find datasheet for it. Looks like BIOS for CPU's faster then 1GHz don't support ACPI C2 nor C3. Signed-off-by: Rafał Bilski Signed-off-by: Dave Jones arch/i386/kernel/cpu/cpufreq/Kconfig | 9 + arch/i386/kernel/cpu/cpufreq/Makefile | 1 arch/i386/kernel/cpu/cpufreq/e_powersaver.c | 334 +++++++++++++++++++++++++ arch/i386/kernel/cpu/cpufreq/longhaul.c | 359 +++++++++++++++++---------- arch/i386/kernel/cpu/cpufreq/longhaul.h | 153 ------------ arch/i386/kernel/cpu/cpufreq/powernow-k8.c | 6 drivers/cpufreq/Kconfig | 2 drivers/cpufreq/cpufreq.c | 258 ++++++++++++++----- drivers/cpufreq/cpufreq_conservative.c | 2 drivers/cpufreq/cpufreq_ondemand.c | 64 ++--- drivers/cpufreq/cpufreq_stats.c | 2 drivers/cpufreq/cpufreq_userspace.c | 2 include/linux/cpufreq.h | 10 - 13 files changed, 795 insertions(+), 407 deletions(-) diff --git a/arch/i386/kernel/cpu/cpufreq/Kconfig b/arch/i386/kernel/cpu/cpufreq/Kconfig index 5299c5b..6c52182 100644 --- a/arch/i386/kernel/cpu/cpufreq/Kconfig +++ b/arch/i386/kernel/cpu/cpufreq/Kconfig @@ -217,6 +217,15 @@ config X86_LONGHAUL If in doubt, say N. +config X86_E_POWERSAVER + tristate "VIA C7 Enhanced PowerSaver (EXPERIMENTAL)" + select CPU_FREQ_TABLE + depends on EXPERIMENTAL + help + This adds the CPUFreq driver for VIA C7 processors. + + If in doubt, say N. + comment "shared options" config X86_ACPI_CPUFREQ_PROC_INTF diff --git a/arch/i386/kernel/cpu/cpufreq/Makefile b/arch/i386/kernel/cpu/cpufreq/Makefile index 8de3abe..560f776 100644 --- a/arch/i386/kernel/cpu/cpufreq/Makefile +++ b/arch/i386/kernel/cpu/cpufreq/Makefile @@ -2,6 +2,7 @@ obj-$(CONFIG_X86_POWERNOW_K6) += powern obj-$(CONFIG_X86_POWERNOW_K7) += powernow-k7.o obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o obj-$(CONFIG_X86_LONGHAUL) += longhaul.o +obj-$(CONFIG_X86_E_POWERSAVER) += e_powersaver.o obj-$(CONFIG_ELAN_CPUFREQ) += elanfreq.o obj-$(CONFIG_SC520_CPUFREQ) += sc520_freq.o obj-$(CONFIG_X86_LONGRUN) += longrun.o diff --git a/arch/i386/kernel/cpu/cpufreq/e_powersaver.c b/arch/i386/kernel/cpu/cpufreq/e_powersaver.c new file mode 100644 index 0000000..f43d98e --- /dev/null +++ b/arch/i386/kernel/cpu/cpufreq/e_powersaver.c @@ -0,0 +1,334 @@ +/* + * Based on documentation provided by Dave Jones. Thanks! + * + * Licensed under the terms of the GNU GPL License version 2. + * + * BIG FAT DISCLAIMER: Work in progress code. Possibly *dangerous* + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#define EPS_BRAND_C7M 0 +#define EPS_BRAND_C7 1 +#define EPS_BRAND_EDEN 2 +#define EPS_BRAND_C3 3 + +struct eps_cpu_data { + u32 fsb; + struct cpufreq_frequency_table freq_table[]; +}; + +static struct eps_cpu_data *eps_cpu[NR_CPUS]; + + +static unsigned int eps_get(unsigned int cpu) +{ + struct eps_cpu_data *centaur; + u32 lo, hi; + + if (cpu) + return 0; + centaur = eps_cpu[cpu]; + if (centaur == NULL) + return 0; + + /* Return current frequency */ + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + return centaur->fsb * ((lo >> 8) & 0xff); +} + +static int eps_set_state(struct eps_cpu_data *centaur, + unsigned int cpu, + u32 dest_state) +{ + struct cpufreq_freqs freqs; + u32 lo, hi; + int err = 0; + int i; + + freqs.old = eps_get(cpu); + freqs.new = centaur->fsb * ((dest_state >> 8) & 0xff); + freqs.cpu = cpu; + cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); + + /* Wait while CPU is busy */ + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + i = 0; + while (lo & ((1 << 16) | (1 << 17))) { + udelay(16); + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + i++; + if (unlikely(i > 64)) { + err = -ENODEV; + goto postchange; + } + } + /* Set new multiplier and voltage */ + wrmsr(MSR_IA32_PERF_CTL, dest_state & 0xffff, 0); + /* Wait until transition end */ + i = 0; + do { + udelay(16); + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + i++; + if (unlikely(i > 64)) { + err = -ENODEV; + goto postchange; + } + } while (lo & ((1 << 16) | (1 << 17))); + + /* Return current frequency */ +postchange: + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + freqs.new = centaur->fsb * ((lo >> 8) & 0xff); + + cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE); + return err; +} + +static int eps_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation) +{ + struct eps_cpu_data *centaur; + unsigned int newstate = 0; + unsigned int cpu = policy->cpu; + unsigned int dest_state; + int ret; + + if (unlikely(eps_cpu[cpu] == NULL)) + return -ENODEV; + centaur = eps_cpu[cpu]; + + if (unlikely(cpufreq_frequency_table_target(policy, + &eps_cpu[cpu]->freq_table[0], + target_freq, + relation, + &newstate))) { + return -EINVAL; + } + + /* Make frequency transition */ + dest_state = centaur->freq_table[newstate].index & 0xffff; + ret = eps_set_state(centaur, cpu, dest_state); + if (ret) + printk(KERN_ERR "eps: Timeout!\n"); + return ret; +} + +static int eps_verify(struct cpufreq_policy *policy) +{ + return cpufreq_frequency_table_verify(policy, + &eps_cpu[policy->cpu]->freq_table[0]); +} + +static int eps_cpu_init(struct cpufreq_policy *policy) +{ + unsigned int i; + u32 lo, hi; + u64 val; + u8 current_multiplier, current_voltage; + u8 max_multiplier, max_voltage; + u8 min_multiplier, min_voltage; + u8 brand; + u32 fsb; + struct eps_cpu_data *centaur; + struct cpufreq_frequency_table *f_table; + int k, step, voltage; + int ret; + int states; + + if (policy->cpu != 0) + return -ENODEV; + + /* Check brand */ + printk("eps: Detected VIA "); + rdmsr(0x1153, lo, hi); + brand = (((lo >> 2) ^ lo) >> 18) & 3; + switch(brand) { + case EPS_BRAND_C7M: + printk("C7-M\n"); + break; + case EPS_BRAND_C7: + printk("C7\n"); + break; + case EPS_BRAND_EDEN: + printk("Eden\n"); + break; + case EPS_BRAND_C3: + printk("C3\n"); + return -ENODEV; + break; + } + /* Enable Enhanced PowerSaver */ + rdmsrl(MSR_IA32_MISC_ENABLE, val); + if (!(val & 1 << 16)) { + val |= 1 << 16; + wrmsrl(MSR_IA32_MISC_ENABLE, val); + /* Can be locked at 0 */ + rdmsrl(MSR_IA32_MISC_ENABLE, val); + if (!(val & 1 << 16)) { + printk("eps: Can't enable Enhanced PowerSaver\n"); + return -ENODEV; + } + } + + /* Print voltage and multiplier */ + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + current_voltage = lo & 0xff; + printk("eps: Current voltage = %dmV\n", current_voltage * 16 + 700); + current_multiplier = (lo >> 8) & 0xff; + printk("eps: Current multiplier = %d\n", current_multiplier); + + /* Print limits */ + max_voltage = hi & 0xff; + printk("eps: Highest voltage = %dmV\n", max_voltage * 16 + 700); + max_multiplier = (hi >> 8) & 0xff; + printk("eps: Highest multiplier = %d\n", max_multiplier); + min_voltage = (hi >> 16) & 0xff; + printk("eps: Lowest voltage = %dmV\n", min_voltage * 16 + 700); + min_multiplier = (hi >> 24) & 0xff; + printk("eps: Lowest multiplier = %d\n", min_multiplier); + + /* Sanity checks */ + if (current_multiplier == 0 || max_multiplier == 0 + || min_multiplier == 0) + return -EINVAL; + if (current_multiplier > max_multiplier + || max_multiplier <= min_multiplier) + return -EINVAL; + if (current_voltage > 0x1c || max_voltage > 0x1c) + return -EINVAL; + if (max_voltage < min_voltage) + return -EINVAL; + + /* Calc FSB speed */ + fsb = cpu_khz / current_multiplier; + /* Calc number of p-states supported */ + if (brand == EPS_BRAND_C7M) + states = max_multiplier - min_multiplier + 1; + else + states = 2; + + /* Allocate private data and frequency table for current cpu */ + centaur = kzalloc(sizeof(struct eps_cpu_data) + + (states + 1) * sizeof(struct cpufreq_frequency_table), + GFP_KERNEL); + if (!centaur) + return -ENOMEM; + eps_cpu[0] = centaur; + + /* Copy basic values */ + centaur->fsb = fsb; + + /* Fill frequency and MSR value table */ + f_table = ¢aur->freq_table[0]; + if (brand != EPS_BRAND_C7M) { + f_table[0].frequency = fsb * min_multiplier; + f_table[0].index = (min_multiplier << 8) | min_voltage; + f_table[1].frequency = fsb * max_multiplier; + f_table[1].index = (max_multiplier << 8) | max_voltage; + f_table[2].frequency = CPUFREQ_TABLE_END; + } else { + k = 0; + step = ((max_voltage - min_voltage) * 256) + / (max_multiplier - min_multiplier); + for (i = min_multiplier; i <= max_multiplier; i++) { + voltage = (k * step) / 256 + min_voltage; + f_table[k].frequency = fsb * i; + f_table[k].index = (i << 8) | voltage; + k++; + } + f_table[k].frequency = CPUFREQ_TABLE_END; + } + + policy->governor = CPUFREQ_DEFAULT_GOVERNOR; + policy->cpuinfo.transition_latency = 140000; /* 844mV -> 700mV in ns */ + policy->cur = fsb * current_multiplier; + + ret = cpufreq_frequency_table_cpuinfo(policy, ¢aur->freq_table[0]); + if (ret) { + kfree(centaur); + return ret; + } + + cpufreq_frequency_table_get_attr(¢aur->freq_table[0], policy->cpu); + return 0; +} + +static int eps_cpu_exit(struct cpufreq_policy *policy) +{ + unsigned int cpu = policy->cpu; + struct eps_cpu_data *centaur; + u32 lo, hi; + + if (eps_cpu[cpu] == NULL) + return -ENODEV; + centaur = eps_cpu[cpu]; + + /* Get max frequency */ + rdmsr(MSR_IA32_PERF_STATUS, lo, hi); + /* Set max frequency */ + eps_set_state(centaur, cpu, hi & 0xffff); + /* Bye */ + cpufreq_frequency_table_put_attr(policy->cpu); + kfree(eps_cpu[cpu]); + eps_cpu[cpu] = NULL; + return 0; +} + +static struct freq_attr* eps_attr[] = { + &cpufreq_freq_attr_scaling_available_freqs, + NULL, +}; + +static struct cpufreq_driver eps_driver = { + .verify = eps_verify, + .target = eps_target, + .init = eps_cpu_init, + .exit = eps_cpu_exit, + .get = eps_get, + .name = "e_powersaver", + .owner = THIS_MODULE, + .attr = eps_attr, +}; + +static int __init eps_init(void) +{ + struct cpuinfo_x86 *c = cpu_data; + + /* This driver will work only on Centaur C7 processors with + * Enhanced SpeedStep/PowerSaver registers */ + if (c->x86_vendor != X86_VENDOR_CENTAUR + || c->x86 != 6 || c->x86_model != 10) + return -ENODEV; + if (!cpu_has(c, X86_FEATURE_EST)) + return -ENODEV; + + if (cpufreq_register_driver(&eps_driver)) + return -EINVAL; + return 0; +} + +static void __exit eps_exit(void) +{ + cpufreq_unregister_driver(&eps_driver); +} + +MODULE_AUTHOR("Rafał Bilski "); +MODULE_DESCRIPTION("Enhanced PowerSaver driver for VIA C7 CPU's."); +MODULE_LICENSE("GPL"); + +module_init(eps_init); +module_exit(eps_exit); diff --git a/arch/i386/kernel/cpu/cpufreq/longhaul.c b/arch/i386/kernel/cpu/cpufreq/longhaul.c index a3db933..b59878a 100644 --- a/arch/i386/kernel/cpu/cpufreq/longhaul.c +++ b/arch/i386/kernel/cpu/cpufreq/longhaul.c @@ -8,12 +8,11 @@ * VIA have currently 3 different versions of Longhaul. * Version 1 (Longhaul) uses the BCR2 MSR at 0x1147. * It is present only in Samuel 1 (C5A), Samuel 2 (C5B) stepping 0. - * Version 2 of longhaul is the same as v1, but adds voltage scaling. - * Present in Samuel 2 (steppings 1-7 only) (C5B), and Ezra (C5C) - * voltage scaling support has currently been disabled in this driver - * until we have code that gets it right. + * Version 2 of longhaul is backward compatible with v1, but adds + * LONGHAUL MSR for purpose of both frequency and voltage scaling. + * Present in Samuel 2 (steppings 1-7 only) (C5B), and Ezra (C5C). * Version 3 of longhaul got renamed to Powersaver and redesigned - * to use the POWERSAVER MSR at 0x110a. + * to use only the POWERSAVER MSR at 0x110a. * It is present in Ezra-T (C5M), Nehemiah (C5X) and above. * It's pretty much the same feature wise to longhaul v2, though * there is provision for scaling FSB too, but this doesn't work @@ -51,10 +50,12 @@ #define CPU_SAMUEL2 2 #define CPU_EZRA 3 #define CPU_EZRA_T 4 #define CPU_NEHEMIAH 5 +#define CPU_NEHEMIAH_C 6 /* Flags */ #define USE_ACPI_C3 (1 << 1) #define USE_NORTHBRIDGE (1 << 2) +#define USE_VT8235 (1 << 3) static int cpu_model; static unsigned int numscales=16; @@ -63,7 +64,8 @@ static unsigned int fsb; static struct mV_pos *vrm_mV_table; static unsigned char *mV_vrm_table; struct f_msr { - unsigned char vrm; + u8 vrm; + u8 pos; }; static struct f_msr f_msr_table[32]; @@ -73,10 +75,10 @@ static int can_scale_voltage; static struct acpi_processor *pr = NULL; static struct acpi_processor_cx *cx = NULL; static u8 longhaul_flags; +static u8 longhaul_pos; /* Module parameters */ static int scale_voltage; -static int ignore_latency; #define dprintk(msg...) cpufreq_debug_printk(CPUFREQ_DEBUG_DRIVER, "longhaul", msg) @@ -164,26 +166,47 @@ static void do_longhaul1(unsigned int cl static void do_powersaver(int cx_address, unsigned int clock_ratio_index) { union msr_longhaul longhaul; + u8 dest_pos; u32 t; + dest_pos = f_msr_table[clock_ratio_index].pos; + rdmsrl(MSR_VIA_LONGHAUL, longhaul.val); + /* Setup new frequency */ longhaul.bits.RevisionKey = longhaul.bits.RevisionID; longhaul.bits.SoftBusRatio = clock_ratio_index & 0xf; longhaul.bits.SoftBusRatio4 = (clock_ratio_index & 0x10) >> 4; - longhaul.bits.EnableSoftBusRatio = 1; - - if (can_scale_voltage) { + /* Setup new voltage */ + if (can_scale_voltage) longhaul.bits.SoftVID = f_msr_table[clock_ratio_index].vrm; + /* Sync to timer tick */ + safe_halt(); + /* Raise voltage if necessary */ + if (can_scale_voltage && longhaul_pos < dest_pos) { longhaul.bits.EnableSoftVID = 1; + wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); + /* Change voltage */ + if (!cx_address) { + ACPI_FLUSH_CPU_CACHE(); + halt(); + } else { + ACPI_FLUSH_CPU_CACHE(); + /* Invoke C3 */ + inb(cx_address); + /* Dummy op - must do something useless after P_LVL3 + * read */ + t = inl(acpi_gbl_FADT.xpm_timer_block.address); + } + longhaul.bits.EnableSoftVID = 0; + wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); + longhaul_pos = dest_pos; } - /* Sync to timer tick */ - safe_halt(); /* Change frequency on next halt or sleep */ + longhaul.bits.EnableSoftBusRatio = 1; wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); if (!cx_address) { ACPI_FLUSH_CPU_CACHE(); - /* Invoke C1 */ halt(); } else { ACPI_FLUSH_CPU_CACHE(); @@ -193,12 +216,29 @@ static void do_powersaver(int cx_address t = inl(acpi_gbl_FADT.xpm_timer_block.address); } /* Disable bus ratio bit */ - local_irq_disable(); - longhaul.bits.RevisionKey = longhaul.bits.RevisionID; longhaul.bits.EnableSoftBusRatio = 0; - longhaul.bits.EnableSoftBSEL = 0; - longhaul.bits.EnableSoftVID = 0; wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); + + /* Reduce voltage if necessary */ + if (can_scale_voltage && longhaul_pos > dest_pos) { + longhaul.bits.EnableSoftVID = 1; + wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); + /* Change voltage */ + if (!cx_address) { + ACPI_FLUSH_CPU_CACHE(); + halt(); + } else { + ACPI_FLUSH_CPU_CACHE(); + /* Invoke C3 */ + inb(cx_address); + /* Dummy op - must do something useless after P_LVL3 + * read */ + t = inl(acpi_gbl_FADT.xpm_timer_block.address); + } + longhaul.bits.EnableSoftVID = 0; + wrmsrl(MSR_VIA_LONGHAUL, longhaul.val); + longhaul_pos = dest_pos; + } } /** @@ -257,26 +297,19 @@ static void longhaul_setstate(unsigned i /* * Longhaul v1. (Samuel[C5A] and Samuel2 stepping 0[C5B]) * Software controlled multipliers only. - * - * *NB* Until we get voltage scaling working v1 & v2 are the same code. - * Longhaul v2 appears in Samuel2 Steppings 1->7 [C5b] and Ezra [C5C] */ case TYPE_LONGHAUL_V1: - case TYPE_LONGHAUL_V2: do_longhaul1(clock_ratio_index); break; /* + * Longhaul v2 appears in Samuel2 Steppings 1->7 [C5B] and Ezra [C5C] + * * Longhaul v3 (aka Powersaver). (Ezra-T [C5M] & Nehemiah [C5N]) - * We can scale voltage with this too, but that's currently - * disabled until we come up with a decent 'match freq to voltage' - * algorithm. - * When we add voltage scaling, we will also need to do the - * voltage/freq setting in order depending on the direction - * of scaling (like we do in powernow-k7.c) * Nehemiah can do FSB scaling too, but this has never been proven * to work in practice. */ + case TYPE_LONGHAUL_V2: case TYPE_POWERSAVER: if (longhaul_flags & USE_ACPI_C3) { /* Don't allow wakeup */ @@ -301,6 +334,7 @@ static void longhaul_setstate(unsigned i local_irq_restore(flags); preempt_enable(); + freqs.new = calc_speed(longhaul_get_cpu_mult()); cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE); } @@ -315,31 +349,19 @@ static void longhaul_setstate(unsigned i #define ROUNDING 0xf -static int _guess(int guess, int mult) -{ - int target; - - target = ((mult/10)*guess); - if (mult%10 != 0) - target += (guess/2); - target += ROUNDING/2; - target &= ~ROUNDING; - return target; -} - - static int guess_fsb(int mult) { - int speed = (cpu_khz/1000); + int speed = cpu_khz / 1000; int i; - int speeds[] = { 66, 100, 133, 200 }; - - speed += ROUNDING/2; - speed &= ~ROUNDING; - - for (i=0; i<4; i++) { - if (_guess(speeds[i], mult) == speed) - return speeds[i]; + int speeds[] = { 666, 1000, 1333, 2000 }; + int f_max, f_min; + + for (i = 0; i < 4; i++) { + f_max = ((speeds[i] * mult) + 50) / 100; + f_max += (ROUNDING / 2); + f_min = f_max - ROUNDING; + if ((speed <= f_max) && (speed >= f_min)) + return speeds[i] / 10; } return 0; } @@ -347,67 +369,40 @@ static int guess_fsb(int mult) static int __init longhaul_get_ranges(void) { - unsigned long invalue; - unsigned int ezra_t_multipliers[32]= { - 90, 30, 40, 100, 55, 35, 45, 95, - 50, 70, 80, 60, 120, 75, 85, 65, - -1, 110, 120, -1, 135, 115, 125, 105, - 130, 150, 160, 140, -1, 155, -1, 145 }; unsigned int j, k = 0; - union msr_longhaul longhaul; - int mult = 0; + int mult; - switch (longhaul_version) { - case TYPE_LONGHAUL_V1: - case TYPE_LONGHAUL_V2: - /* Ugh, Longhaul v1 didn't have the min/max MSRs. - Assume min=3.0x & max = whatever we booted at. */ + /* Get current frequency */ + mult = longhaul_get_cpu_mult(); + if (mult == -1) { + printk(KERN_INFO PFX "Invalid (reserved) multiplier!\n"); + return -EINVAL; + } + fsb = guess_fsb(mult); + if (fsb == 0) { + printk(KERN_INFO PFX "Invalid (reserved) FSB!\n"); + return -EINVAL; + } + /* Get max multiplier - as we always did. + * Longhaul MSR is usefull only when voltage scaling is enabled. + * C3 is booting at max anyway. */ + maxmult = mult; + /* Get min multiplier */ + switch (cpu_model) { + case CPU_NEHEMIAH: + minmult = 50; + break; + case CPU_NEHEMIAH_C: + minmult = 40; + break; + default: minmult = 30; - maxmult = mult = longhaul_get_cpu_mult(); break; - - case TYPE_POWERSAVER: - /* Ezra-T */ - if (cpu_model==CPU_EZRA_T) { - minmult = 30; - rdmsrl (MSR_VIA_LONGHAUL, longhaul.val); - invalue = longhaul.bits.MaxMHzBR; - if (longhaul.bits.MaxMHzBR4) - invalue += 16; - maxmult = mult = ezra_t_multipliers[invalue]; - break; - } - - /* Nehemiah */ - if (cpu_model==CPU_NEHEMIAH) { - rdmsrl (MSR_VIA_LONGHAUL, longhaul.val); - - /* - * TODO: This code works, but raises a lot of questions. - * - Some Nehemiah's seem to have broken Min/MaxMHzBR's. - * We get around this by using a hardcoded multiplier of 4.0x - * for the minimimum speed, and the speed we booted up at for the max. - * This is done in longhaul_get_cpu_mult() by reading the EBLCR register. - * - According to some VIA documentation EBLCR is only - * in pre-Nehemiah C3s. How this still works is a mystery. - * We're possibly using something undocumented and unsupported, - * But it works, so we don't grumble. - */ - minmult=40; - maxmult = mult = longhaul_get_cpu_mult(); - break; - } } - fsb = guess_fsb(mult); dprintk ("MinMult:%d.%dx MaxMult:%d.%dx\n", minmult/10, minmult%10, maxmult/10, maxmult%10); - if (fsb == 0) { - printk (KERN_INFO PFX "Invalid (reserved) FSB!\n"); - return -EINVAL; - } - highest_speed = calc_speed(maxmult); lowest_speed = calc_speed(minmult); dprintk ("FSB:%dMHz Lowest speed: %s Highest speed:%s\n", fsb, @@ -455,6 +450,7 @@ static void __init longhaul_setup_voltag union msr_longhaul longhaul; struct mV_pos minvid, maxvid; unsigned int j, speed, pos, kHz_step, numvscales; + int min_vid_speed; rdmsrl(MSR_VIA_LONGHAUL, longhaul.val); if (!(longhaul.bits.RevisionID & 1)) { @@ -468,14 +464,14 @@ static void __init longhaul_setup_voltag mV_vrm_table = &mV_vrm85[0]; } else { printk (KERN_INFO PFX "Mobile VRM\n"); + if (cpu_model < CPU_NEHEMIAH) + return; vrm_mV_table = &mobilevrm_mV[0]; mV_vrm_table = &mV_mobilevrm[0]; } minvid = vrm_mV_table[longhaul.bits.MinimumVID]; maxvid = vrm_mV_table[longhaul.bits.MaximumVID]; - numvscales = maxvid.pos - minvid.pos + 1; - kHz_step = (highest_speed - lowest_speed) / numvscales; if (minvid.mV == 0 || maxvid.mV == 0 || minvid.mV > maxvid.mV) { printk (KERN_INFO PFX "Bogus values Min:%d.%03d Max:%d.%03d. " @@ -491,20 +487,59 @@ static void __init longhaul_setup_voltag return; } - printk(KERN_INFO PFX "Max VID=%d.%03d Min VID=%d.%03d, %d possible voltage scales\n", + /* How many voltage steps */ + numvscales = maxvid.pos - minvid.pos + 1; + printk(KERN_INFO PFX + "Max VID=%d.%03d " + "Min VID=%d.%03d, " + "%d possible voltage scales\n", maxvid.mV/1000, maxvid.mV%1000, minvid.mV/1000, minvid.mV%1000, numvscales); + /* Calculate max frequency at min voltage */ + j = longhaul.bits.MinMHzBR; + if (longhaul.bits.MinMHzBR4) + j += 16; + min_vid_speed = eblcr_table[j]; + if (min_vid_speed == -1) + return; + switch (longhaul.bits.MinMHzFSB) { + case 0: + min_vid_speed *= 13333; + break; + case 1: + min_vid_speed *= 10000; + break; + case 3: + min_vid_speed *= 6666; + break; + default: + return; + break; + } + if (min_vid_speed >= highest_speed) + return; + /* Calculate kHz for one voltage step */ + kHz_step = (highest_speed - min_vid_speed) / numvscales; + + j = 0; while (longhaul_table[j].frequency != CPUFREQ_TABLE_END) { speed = longhaul_table[j].frequency; - pos = (speed - lowest_speed) / kHz_step + minvid.pos; + if (speed > min_vid_speed) + pos = (speed - min_vid_speed) / kHz_step + minvid.pos; + else + pos = minvid.pos; f_msr_table[longhaul_table[j].index].vrm = mV_vrm_table[pos]; + f_msr_table[longhaul_table[j].index].pos = pos; j++; } + longhaul_pos = maxvid.pos; can_scale_voltage = 1; + printk(KERN_INFO PFX "Voltage scaling enabled. " + "Use of \"conservative\" governor is highly recommended.\n"); } @@ -573,20 +608,51 @@ static int enable_arbiter_disable(void) if (dev != NULL) { /* Enable access to port 0x22 */ pci_read_config_byte(dev, reg, &pci_cmd); - if ( !(pci_cmd & 1<<7) ) { + if (!(pci_cmd & 1<<7)) { pci_cmd |= 1<<7; pci_write_config_byte(dev, reg, pci_cmd); + pci_read_config_byte(dev, reg, &pci_cmd); + if (!(pci_cmd & 1<<7)) { + printk(KERN_ERR PFX + "Can't enable access to port 0x22.\n"); + return 0; + } } return 1; } return 0; } +static int longhaul_setup_vt8235(void) +{ + struct pci_dev *dev; + u8 pci_cmd; + + /* Find VT8235 southbridge */ + dev = pci_find_device(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8235, NULL); + if (dev != NULL) { + /* Set transition time to max */ + pci_read_config_byte(dev, 0xec, &pci_cmd); + pci_cmd &= ~(1 << 2); + pci_write_config_byte(dev, 0xec, pci_cmd); + pci_read_config_byte(dev, 0xe4, &pci_cmd); + pci_cmd &= ~(1 << 7); + pci_write_config_byte(dev, 0xe4, pci_cmd); + pci_read_config_byte(dev, 0xe5, &pci_cmd); + pci_cmd |= 1 << 7; + pci_write_config_byte(dev, 0xe5, pci_cmd); + return 1; + } + return 0; +} + static int __init longhaul_cpu_init(struct cpufreq_policy *policy) { struct cpuinfo_x86 *c = cpu_data; char *cpuname=NULL; int ret; + u32 lo, hi; + int vt8235_present; /* Check what we have on this motherboard */ switch (c->x86_model) { @@ -599,16 +665,20 @@ static int __init longhaul_cpu_init(stru break; case 7: - longhaul_version = TYPE_LONGHAUL_V1; switch (c->x86_mask) { case 0: + longhaul_version = TYPE_LONGHAUL_V1; cpu_model = CPU_SAMUEL2; cpuname = "C3 'Samuel 2' [C5B]"; - /* Note, this is not a typo, early Samuel2's had Samuel1 ratios. */ - memcpy (clock_ratio, samuel1_clock_ratio, sizeof(samuel1_clock_ratio)); - memcpy (eblcr_table, samuel2_eblcr, sizeof(samuel2_eblcr)); + /* Note, this is not a typo, early Samuel2's had + * Samuel1 ratios. */ + memcpy(clock_ratio, samuel1_clock_ratio, + sizeof(samuel1_clock_ratio)); + memcpy(eblcr_table, samuel2_eblcr, + sizeof(samuel2_eblcr)); break; case 1 ... 15: + longhaul_version = TYPE_LONGHAUL_V2; if (c->x86_mask < 8) { cpu_model = CPU_SAMUEL2; cpuname = "C3 'Samuel 2' [C5B]"; @@ -616,8 +686,10 @@ static int __init longhaul_cpu_init(stru cpu_model = CPU_EZRA; cpuname = "C3 'Ezra' [C5C]"; } - memcpy (clock_ratio, ezra_clock_ratio, sizeof(ezra_clock_ratio)); - memcpy (eblcr_table, ezra_eblcr, sizeof(ezra_eblcr)); + memcpy(clock_ratio, ezra_clock_ratio, + sizeof(ezra_clock_ratio)); + memcpy(eblcr_table, ezra_eblcr, + sizeof(ezra_eblcr)); break; } break; @@ -632,24 +704,24 @@ static int __init longhaul_cpu_init(stru break; case 9: - cpu_model = CPU_NEHEMIAH; longhaul_version = TYPE_POWERSAVER; - numscales=32; + numscales = 32; + memcpy(clock_ratio, + nehemiah_clock_ratio, + sizeof(nehemiah_clock_ratio)); + memcpy(eblcr_table, nehemiah_eblcr, sizeof(nehemiah_eblcr)); switch (c->x86_mask) { case 0 ... 1: - cpuname = "C3 'Nehemiah A' [C5N]"; - memcpy (clock_ratio, nehemiah_a_clock_ratio, sizeof(nehemiah_a_clock_ratio)); - memcpy (eblcr_table, nehemiah_a_eblcr, sizeof(nehemiah_a_eblcr)); + cpu_model = CPU_NEHEMIAH; + cpuname = "C3 'Nehemiah A' [C5XLOE]"; break; case 2 ... 4: - cpuname = "C3 'Nehemiah B' [C5N]"; - memcpy (clock_ratio, nehemiah_b_clock_ratio, sizeof(nehemiah_b_clock_ratio)); - memcpy (eblcr_table, nehemiah_b_eblcr, sizeof(nehemiah_b_eblcr)); + cpu_model = CPU_NEHEMIAH; + cpuname = "C3 'Nehemiah B' [C5XLOH]"; break; case 5 ... 15: - cpuname = "C3 'Nehemiah C' [C5N]"; - memcpy (clock_ratio, nehemiah_c_clock_ratio, sizeof(nehemiah_c_clock_ratio)); - memcpy (eblcr_table, nehemiah_c_eblcr, sizeof(nehemiah_c_eblcr)); + cpu_model = CPU_NEHEMIAH_C; + cpuname = "C3 'Nehemiah C' [C5P]"; break; } break; @@ -658,6 +730,13 @@ static int __init longhaul_cpu_init(stru cpuname = "Unknown"; break; } + /* Check Longhaul ver. 2 */ + if (longhaul_version == TYPE_LONGHAUL_V2) { + rdmsr(MSR_VIA_LONGHAUL, lo, hi); + if (lo == 0 && hi == 0) + /* Looks like MSR isn't present */ + longhaul_version = TYPE_LONGHAUL_V1; + } printk (KERN_INFO PFX "VIA %s CPU detected. ", cpuname); switch (longhaul_version) { @@ -670,15 +749,18 @@ static int __init longhaul_cpu_init(stru break; }; + /* Doesn't hurt */ + vt8235_present = longhaul_setup_vt8235(); + /* Find ACPI data for processor */ - acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT, ACPI_UINT32_MAX, - &longhaul_walk_callback, NULL, (void *)&pr); + acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT, + ACPI_UINT32_MAX, &longhaul_walk_callback, + NULL, (void *)&pr); /* Check ACPI support for C3 state */ - if ((pr != NULL) && (longhaul_version == TYPE_POWERSAVER)) { + if (pr != NULL && longhaul_version != TYPE_LONGHAUL_V1) { cx = &pr->power.states[ACPI_STATE_C3]; - if (cx->address > 0 && - (cx->latency <= 1000 || ignore_latency != 0) ) { + if (cx->address > 0 && cx->latency <= 1000) { longhaul_flags |= USE_ACPI_C3; goto print_support_type; } @@ -688,8 +770,11 @@ static int __init longhaul_cpu_init(stru longhaul_flags |= USE_NORTHBRIDGE; goto print_support_type; } - - /* No ACPI C3 or we can't use it */ + /* Use VT8235 southbridge if present */ + if (longhaul_version == TYPE_POWERSAVER && vt8235_present) { + longhaul_flags |= USE_VT8235; + goto print_support_type; + } /* Check ACPI support for bus master arbiter disable */ if ((pr == NULL) || !(pr->flags.bm_control)) { printk(KERN_ERR PFX @@ -698,18 +783,18 @@ static int __init longhaul_cpu_init(stru } print_support_type: - if (!(longhaul_flags & USE_NORTHBRIDGE)) { - printk (KERN_INFO PFX "Using ACPI support.\n"); - } else { + if (longhaul_flags & USE_NORTHBRIDGE) printk (KERN_INFO PFX "Using northbridge support.\n"); - } + else if (longhaul_flags & USE_VT8235) + printk (KERN_INFO PFX "Using VT8235 support.\n"); + else + printk (KERN_INFO PFX "Using ACPI support.\n"); ret = longhaul_get_ranges(); if (ret != 0) return ret; - if ((longhaul_version==TYPE_LONGHAUL_V2 || longhaul_version==TYPE_POWERSAVER) && - (scale_voltage != 0)) + if ((longhaul_version != TYPE_LONGHAUL_V1) && (scale_voltage != 0)) longhaul_setup_voltagescaling(); policy->governor = CPUFREQ_DEFAULT_GOVERNOR; @@ -797,8 +882,6 @@ static void __exit longhaul_exit(void) module_param (scale_voltage, int, 0644); MODULE_PARM_DESC(scale_voltage, "Scale voltage of processor"); -module_param(ignore_latency, int, 0644); -MODULE_PARM_DESC(ignore_latency, "Skip ACPI C3 latency test"); MODULE_AUTHOR ("Dave Jones "); MODULE_DESCRIPTION ("Longhaul driver for VIA Cyrix processors."); diff --git a/arch/i386/kernel/cpu/cpufreq/longhaul.h b/arch/i386/kernel/cpu/cpufreq/longhaul.h index bc4682a..bb0a04b 100644 --- a/arch/i386/kernel/cpu/cpufreq/longhaul.h +++ b/arch/i386/kernel/cpu/cpufreq/longhaul.h @@ -235,84 +235,14 @@ static int __initdata ezrat_eblcr[32] = /* * VIA C3 Nehemiah */ -static int __initdata nehemiah_a_clock_ratio[32] = { +static int __initdata nehemiah_clock_ratio[32] = { 100, /* 0000 -> 10.0x */ 160, /* 0001 -> 16.0x */ - -1, /* 0010 -> RESERVED */ - 90, /* 0011 -> 9.0x */ - 95, /* 0100 -> 9.5x */ - -1, /* 0101 -> RESERVED */ - -1, /* 0110 -> RESERVED */ - 55, /* 0111 -> 5.5x */ - 60, /* 1000 -> 6.0x */ - 70, /* 1001 -> 7.0x */ - 80, /* 1010 -> 8.0x */ - 50, /* 1011 -> 5.0x */ - 65, /* 1100 -> 6.5x */ - 75, /* 1101 -> 7.5x */ - 85, /* 1110 -> 8.5x */ - 120, /* 1111 -> 12.0x */ - 100, /* 0000 -> 10.0x */ - -1, /* 0001 -> RESERVED */ - 120, /* 0010 -> 12.0x */ - 90, /* 0011 -> 9.0x */ - 105, /* 0100 -> 10.5x */ - 115, /* 0101 -> 11.5x */ - 125, /* 0110 -> 12.5x */ - 135, /* 0111 -> 13.5x */ - 140, /* 1000 -> 14.0x */ - 150, /* 1001 -> 15.0x */ - 160, /* 1010 -> 16.0x */ - 130, /* 1011 -> 13.0x */ - 145, /* 1100 -> 14.5x */ - 155, /* 1101 -> 15.5x */ - -1, /* 1110 -> RESERVED (13.0x) */ - 120, /* 1111 -> 12.0x */ -}; - -static int __initdata nehemiah_b_clock_ratio[32] = { - 100, /* 0000 -> 10.0x */ - 160, /* 0001 -> 16.0x */ - -1, /* 0010 -> RESERVED */ - 90, /* 0011 -> 9.0x */ - 95, /* 0100 -> 9.5x */ - -1, /* 0101 -> RESERVED */ - -1, /* 0110 -> RESERVED */ - 55, /* 0111 -> 5.5x */ - 60, /* 1000 -> 6.0x */ - 70, /* 1001 -> 7.0x */ - 80, /* 1010 -> 8.0x */ - 50, /* 1011 -> 5.0x */ - 65, /* 1100 -> 6.5x */ - 75, /* 1101 -> 7.5x */ - 85, /* 1110 -> 8.5x */ - 120, /* 1111 -> 12.0x */ - 100, /* 0000 -> 10.0x */ - 110, /* 0001 -> 11.0x */ - 120, /* 0010 -> 12.0x */ - 90, /* 0011 -> 9.0x */ - 105, /* 0100 -> 10.5x */ - 115, /* 0101 -> 11.5x */ - 125, /* 0110 -> 12.5x */ - 135, /* 0111 -> 13.5x */ - 140, /* 1000 -> 14.0x */ - 150, /* 1001 -> 15.0x */ - 160, /* 1010 -> 16.0x */ - 130, /* 1011 -> 13.0x */ - 145, /* 1100 -> 14.5x */ - 155, /* 1101 -> 15.5x */ - -1, /* 1110 -> RESERVED (13.0x) */ - 120, /* 1111 -> 12.0x */ -}; - -static int __initdata nehemiah_c_clock_ratio[32] = { - 100, /* 0000 -> 10.0x */ - 160, /* 0001 -> 16.0x */ - 40, /* 0010 -> RESERVED */ + 40, /* 0010 -> 4.0x */ 90, /* 0011 -> 9.0x */ 95, /* 0100 -> 9.5x */ -1, /* 0101 -> RESERVED */ - 45, /* 0110 -> RESERVED */ + 45, /* 0110 -> 4.5x */ 55, /* 0111 -> 5.5x */ 60, /* 1000 -> 6.0x */ 70, /* 1001 -> 7.0x */ @@ -340,84 +270,14 @@ static int __initdata nehemiah_c_clock_ 120, /* 1111 -> 12.0x */ }; -static int __initdata nehemiah_a_eblcr[32] = { - 50, /* 0000 -> 5.0x */ - 160, /* 0001 -> 16.0x */ - -1, /* 0010 -> RESERVED */ - 100, /* 0011 -> 10.0x */ - 55, /* 0100 -> 5.5x */ - -1, /* 0101 -> RESERVED */ - -1, /* 0110 -> RESERVED */ - 95, /* 0111 -> 9.5x */ - 90, /* 1000 -> 9.0x */ - 70, /* 1001 -> 7.0x */ - 80, /* 1010 -> 8.0x */ - 60, /* 1011 -> 6.0x */ - 120, /* 1100 -> 12.0x */ - 75, /* 1101 -> 7.5x */ - 85, /* 1110 -> 8.5x */ - 65, /* 1111 -> 6.5x */ - 90, /* 0000 -> 9.0x */ - -1, /* 0001 -> RESERVED */ - 120, /* 0010 -> 12.0x */ - 100, /* 0011 -> 10.0x */ - 135, /* 0100 -> 13.5x */ - 115, /* 0101 -> 11.5x */ - 125, /* 0110 -> 12.5x */ - 105, /* 0111 -> 10.5x */ - 130, /* 1000 -> 13.0x */ - 150, /* 1001 -> 15.0x */ - 160, /* 1010 -> 16.0x */ - 140, /* 1011 -> 14.0x */ - 120, /* 1100 -> 12.0x */ - 155, /* 1101 -> 15.5x */ - -1, /* 1110 -> RESERVED (13.0x) */ - 145 /* 1111 -> 14.5x */ - /* end of table */ -}; -static int __initdata nehemiah_b_eblcr[32] = { - 50, /* 0000 -> 5.0x */ - 160, /* 0001 -> 16.0x */ - -1, /* 0010 -> RESERVED */ - 100, /* 0011 -> 10.0x */ - 55, /* 0100 -> 5.5x */ - -1, /* 0101 -> RESERVED */ - -1, /* 0110 -> RESERVED */ - 95, /* 0111 -> 9.5x */ - 90, /* 1000 -> 9.0x */ - 70, /* 1001 -> 7.0x */ - 80, /* 1010 -> 8.0x */ - 60, /* 1011 -> 6.0x */ - 120, /* 1100 -> 12.0x */ - 75, /* 1101 -> 7.5x */ - 85, /* 1110 -> 8.5x */ - 65, /* 1111 -> 6.5x */ - 90, /* 0000 -> 9.0x */ - 110, /* 0001 -> 11.0x */ - 120, /* 0010 -> 12.0x */ - 100, /* 0011 -> 10.0x */ - 135, /* 0100 -> 13.5x */ - 115, /* 0101 -> 11.5x */ - 125, /* 0110 -> 12.5x */ - 105, /* 0111 -> 10.5x */ - 130, /* 1000 -> 13.0x */ - 150, /* 1001 -> 15.0x */ - 160, /* 1010 -> 16.0x */ - 140, /* 1011 -> 14.0x */ - 120, /* 1100 -> 12.0x */ - 155, /* 1101 -> 15.5x */ - -1, /* 1110 -> RESERVED (13.0x) */ - 145 /* 1111 -> 14.5x */ - /* end of table */ -}; -static int __initdata nehemiah_c_eblcr[32] = { +static int __initdata nehemiah_eblcr[32] = { 50, /* 0000 -> 5.0x */ 160, /* 0001 -> 16.0x */ - 40, /* 0010 -> RESERVED */ + 40, /* 0010 -> 4.0x */ 100, /* 0011 -> 10.0x */ 55, /* 0100 -> 5.5x */ -1, /* 0101 -> RESERVED */ - 45, /* 0110 -> RESERVED */ + 45, /* 0110 -> 4.5x */ 95, /* 0111 -> 9.5x */ 90, /* 1000 -> 9.0x */ 70, /* 1001 -> 7.0x */ @@ -443,7 +303,6 @@ static int __initdata nehemiah_c_eblcr[3 155, /* 1101 -> 15.5x */ -1, /* 1110 -> RESERVED (13.0x) */ 145 /* 1111 -> 14.5x */ - /* end of table */ }; /* diff --git a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c b/arch/i386/kernel/cpu/cpufreq/powernow-k8.c index 2d64916..fe3b670 100644 --- a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c +++ b/arch/i386/kernel/cpu/cpufreq/powernow-k8.c @@ -1289,7 +1289,11 @@ static unsigned int powernowk8_get (unsi if (query_current_values_with_pending_wait(data)) goto out; - khz = find_khz_freq_from_fid(data->currfid); + if (cpu_family == CPU_HW_PSTATE) + khz = find_khz_freq_from_fiddid(data->currfid, data->currdid); + else + khz = find_khz_freq_from_fid(data->currfid); + out: set_cpus_allowed(current, oldmask); diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig index 491779a..d155e81 100644 --- a/drivers/cpufreq/Kconfig +++ b/drivers/cpufreq/Kconfig @@ -16,7 +16,7 @@ config CPU_FREQ if CPU_FREQ config CPU_FREQ_TABLE - def_tristate m + tristate config CPU_FREQ_DEBUG bool "Enable CPUfreq debugging" diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index a45cc89..f52facc 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -41,8 +41,67 @@ static struct cpufreq_driver *cpufreq_dr static struct cpufreq_policy *cpufreq_cpu_data[NR_CPUS]; static DEFINE_SPINLOCK(cpufreq_driver_lock); +/* + * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure + * all cpufreq/hotplug/workqueue/etc related lock issues. + * + * The rules for this semaphore: + * - Any routine that wants to read from the policy structure will + * do a down_read on this semaphore. + * - Any routine that will write to the policy structure and/or may take away + * the policy altogether (eg. CPU hotplug), will hold this lock in write + * mode before doing so. + * + * Additional rules: + * - All holders of the lock should check to make sure that the CPU they + * are concerned with are online after they get the lock. + * - Governor routines that can be called in cpufreq hotplug path should not + * take this sem as top level hotplug notifier handler takes this. + */ +static DEFINE_PER_CPU(int, policy_cpu); +static DEFINE_PER_CPU(struct rw_semaphore, cpu_policy_rwsem); + +#define lock_policy_rwsem(mode, cpu) \ +int lock_policy_rwsem_##mode \ +(int cpu) \ +{ \ + int policy_cpu = per_cpu(policy_cpu, cpu); \ + BUG_ON(policy_cpu == -1); \ + down_##mode(&per_cpu(cpu_policy_rwsem, policy_cpu)); \ + if (unlikely(!cpu_online(cpu))) { \ + up_##mode(&per_cpu(cpu_policy_rwsem, policy_cpu)); \ + return -1; \ + } \ + \ + return 0; \ +} + +lock_policy_rwsem(read, cpu); +EXPORT_SYMBOL_GPL(lock_policy_rwsem_read); + +lock_policy_rwsem(write, cpu); +EXPORT_SYMBOL_GPL(lock_policy_rwsem_write); + +void unlock_policy_rwsem_read(int cpu) +{ + int policy_cpu = per_cpu(policy_cpu, cpu); + BUG_ON(policy_cpu == -1); + up_read(&per_cpu(cpu_policy_rwsem, policy_cpu)); +} +EXPORT_SYMBOL_GPL(unlock_policy_rwsem_read); + +void unlock_policy_rwsem_write(int cpu) +{ + int policy_cpu = per_cpu(policy_cpu, cpu); + BUG_ON(policy_cpu == -1); + up_write(&per_cpu(cpu_policy_rwsem, policy_cpu)); +} +EXPORT_SYMBOL_GPL(unlock_policy_rwsem_write); + + /* internal prototypes */ static int __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event); +static unsigned int __cpufreq_get(unsigned int cpu); static void handle_update(struct work_struct *work); /** @@ -415,12 +474,8 @@ (struct cpufreq_policy * policy, const c if (ret != 1) \ return -EINVAL; \ \ - lock_cpu_hotplug(); \ - mutex_lock(&policy->lock); \ ret = __cpufreq_set_policy(policy, &new_policy); \ policy->user_policy.object = policy->object; \ - mutex_unlock(&policy->lock); \ - unlock_cpu_hotplug(); \ \ return ret ? ret : count; \ } @@ -434,7 +489,7 @@ store_one(scaling_max_freq,max); static ssize_t show_cpuinfo_cur_freq (struct cpufreq_policy * policy, char *buf) { - unsigned int cur_freq = cpufreq_get(policy->cpu); + unsigned int cur_freq = __cpufreq_get(policy->cpu); if (!cur_freq) return sprintf(buf, ""); return sprintf(buf, "%u\n", cur_freq); @@ -479,18 +534,12 @@ static ssize_t store_scaling_governor (s &new_policy.governor)) return -EINVAL; - lock_cpu_hotplug(); - /* Do not use cpufreq_set_policy here or the user_policy.max will be wrongly overridden */ - mutex_lock(&policy->lock); ret = __cpufreq_set_policy(policy, &new_policy); policy->user_policy.policy = policy->policy; policy->user_policy.governor = policy->governor; - mutex_unlock(&policy->lock); - - unlock_cpu_hotplug(); if (ret) return ret; @@ -595,11 +644,17 @@ static ssize_t show(struct kobject * kob policy = cpufreq_cpu_get(policy->cpu); if (!policy) return -EINVAL; + + if (lock_policy_rwsem_read(policy->cpu) < 0) + return -EINVAL; + if (fattr->show) ret = fattr->show(policy, buf); else ret = -EIO; + unlock_policy_rwsem_read(policy->cpu); + cpufreq_cpu_put(policy); return ret; } @@ -613,11 +668,17 @@ static ssize_t store(struct kobject * ko policy = cpufreq_cpu_get(policy->cpu); if (!policy) return -EINVAL; + + if (lock_policy_rwsem_write(policy->cpu) < 0) + return -EINVAL; + if (fattr->store) ret = fattr->store(policy, buf, count); else ret = -EIO; + unlock_policy_rwsem_write(policy->cpu); + cpufreq_cpu_put(policy); return ret; } @@ -691,8 +752,10 @@ #endif policy->cpu = cpu; policy->cpus = cpumask_of_cpu(cpu); - mutex_init(&policy->lock); - mutex_lock(&policy->lock); + /* Initially set CPU itself as the policy_cpu */ + per_cpu(policy_cpu, cpu) = cpu; + lock_policy_rwsem_write(cpu); + init_completion(&policy->kobj_unregister); INIT_WORK(&policy->update, handle_update); @@ -702,7 +765,7 @@ #endif ret = cpufreq_driver->init(policy); if (ret) { dprintk("initialization failed\n"); - mutex_unlock(&policy->lock); + unlock_policy_rwsem_write(cpu); goto err_out; } @@ -716,6 +779,14 @@ #ifdef CONFIG_SMP */ managed_policy = cpufreq_cpu_get(j); if (unlikely(managed_policy)) { + + /* Set proper policy_cpu */ + unlock_policy_rwsem_write(cpu); + per_cpu(policy_cpu, cpu) = managed_policy->cpu; + + if (lock_policy_rwsem_write(cpu) < 0) + goto err_out_driver_exit; + spin_lock_irqsave(&cpufreq_driver_lock, flags); managed_policy->cpus = policy->cpus; cpufreq_cpu_data[cpu] = managed_policy; @@ -726,13 +797,13 @@ #ifdef CONFIG_SMP &managed_policy->kobj, "cpufreq"); if (ret) { - mutex_unlock(&policy->lock); + unlock_policy_rwsem_write(cpu); goto err_out_driver_exit; } cpufreq_debug_enable_ratelimit(); - mutex_unlock(&policy->lock); ret = 0; + unlock_policy_rwsem_write(cpu); goto err_out_driver_exit; /* call driver->exit() */ } } @@ -746,7 +817,7 @@ #endif ret = kobject_register(&policy->kobj); if (ret) { - mutex_unlock(&policy->lock); + unlock_policy_rwsem_write(cpu); goto err_out_driver_exit; } /* set up files for this cpu device */ @@ -761,8 +832,10 @@ #endif sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr); spin_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_cpu_mask(j, policy->cpus) + for_each_cpu_mask(j, policy->cpus) { cpufreq_cpu_data[j] = policy; + per_cpu(policy_cpu, j) = policy->cpu; + } spin_unlock_irqrestore(&cpufreq_driver_lock, flags); /* symlink affected CPUs */ @@ -778,14 +851,14 @@ #endif ret = sysfs_create_link(&cpu_sys_dev->kobj, &policy->kobj, "cpufreq"); if (ret) { - mutex_unlock(&policy->lock); + unlock_policy_rwsem_write(cpu); goto err_out_unregister; } } policy->governor = NULL; /* to assure that the starting sequence is * run in cpufreq_set_policy */ - mutex_unlock(&policy->lock); + unlock_policy_rwsem_write(cpu); /* set default policy */ ret = cpufreq_set_policy(&new_policy); @@ -826,11 +899,13 @@ module_out: /** - * cpufreq_remove_dev - remove a CPU device + * __cpufreq_remove_dev - remove a CPU device * * Removes the cpufreq interface for a CPU device. + * Caller should already have policy_rwsem in write mode for this CPU. + * This routine frees the rwsem before returning. */ -static int cpufreq_remove_dev (struct sys_device * sys_dev) +static int __cpufreq_remove_dev (struct sys_device * sys_dev) { unsigned int cpu = sys_dev->id; unsigned long flags; @@ -849,6 +924,7 @@ #endif if (!data) { spin_unlock_irqrestore(&cpufreq_driver_lock, flags); cpufreq_debug_enable_ratelimit(); + unlock_policy_rwsem_write(cpu); return -EINVAL; } cpufreq_cpu_data[cpu] = NULL; @@ -865,6 +941,7 @@ #ifdef CONFIG_SMP sysfs_remove_link(&sys_dev->kobj, "cpufreq"); cpufreq_cpu_put(data); cpufreq_debug_enable_ratelimit(); + unlock_policy_rwsem_write(cpu); return 0; } #endif @@ -873,6 +950,7 @@ #endif if (!kobject_get(&data->kobj)) { spin_unlock_irqrestore(&cpufreq_driver_lock, flags); cpufreq_debug_enable_ratelimit(); + unlock_policy_rwsem_write(cpu); return -EFAULT; } @@ -906,10 +984,10 @@ #else spin_unlock_irqrestore(&cpufreq_driver_lock, flags); #endif - mutex_lock(&data->lock); if (cpufreq_driver->target) __cpufreq_governor(data, CPUFREQ_GOV_STOP); - mutex_unlock(&data->lock); + + unlock_policy_rwsem_write(cpu); kobject_unregister(&data->kobj); @@ -933,6 +1011,18 @@ #endif } +static int cpufreq_remove_dev (struct sys_device * sys_dev) +{ + unsigned int cpu = sys_dev->id; + int retval; + if (unlikely(lock_policy_rwsem_write(cpu))) + BUG(); + + retval = __cpufreq_remove_dev(sys_dev); + return retval; +} + + static void handle_update(struct work_struct *work) { struct cpufreq_policy *policy = @@ -980,9 +1070,12 @@ unsigned int cpufreq_quick_get(unsigned unsigned int ret_freq = 0; if (policy) { - mutex_lock(&policy->lock); + if (unlikely(lock_policy_rwsem_read(cpu))) + return ret_freq; + ret_freq = policy->cur; - mutex_unlock(&policy->lock); + + unlock_policy_rwsem_read(cpu); cpufreq_cpu_put(policy); } @@ -991,24 +1084,13 @@ unsigned int cpufreq_quick_get(unsigned EXPORT_SYMBOL(cpufreq_quick_get); -/** - * cpufreq_get - get the current CPU frequency (in kHz) - * @cpu: CPU number - * - * Get the CPU current (static) CPU frequency - */ -unsigned int cpufreq_get(unsigned int cpu) +static unsigned int __cpufreq_get(unsigned int cpu) { - struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + struct cpufreq_policy *policy = cpufreq_cpu_data[cpu]; unsigned int ret_freq = 0; - if (!policy) - return 0; - if (!cpufreq_driver->get) - goto out; - - mutex_lock(&policy->lock); + return (ret_freq); ret_freq = cpufreq_driver->get(cpu); @@ -1022,11 +1104,33 @@ unsigned int cpufreq_get(unsigned int cp } } - mutex_unlock(&policy->lock); + return (ret_freq); +} -out: - cpufreq_cpu_put(policy); +/** + * cpufreq_get - get the current CPU frequency (in kHz) + * @cpu: CPU number + * + * Get the CPU current (static) CPU frequency + */ +unsigned int cpufreq_get(unsigned int cpu) +{ + unsigned int ret_freq = 0; + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + + if (!policy) + goto out; + + if (unlikely(lock_policy_rwsem_read(cpu))) + goto out_policy; + + ret_freq = __cpufreq_get(cpu); + unlock_policy_rwsem_read(cpu); + +out_policy: + cpufreq_cpu_put(policy); +out: return (ret_freq); } EXPORT_SYMBOL(cpufreq_get); @@ -1278,7 +1382,6 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie *********************************************************************/ -/* Must be called with lock_cpu_hotplug held */ int __cpufreq_driver_target(struct cpufreq_policy *policy, unsigned int target_freq, unsigned int relation) @@ -1304,20 +1407,19 @@ int cpufreq_driver_target(struct cpufreq if (!policy) return -EINVAL; - lock_cpu_hotplug(); - mutex_lock(&policy->lock); + if (unlikely(lock_policy_rwsem_write(policy->cpu))) + return -EINVAL; ret = __cpufreq_driver_target(policy, target_freq, relation); - mutex_unlock(&policy->lock); - unlock_cpu_hotplug(); + unlock_policy_rwsem_write(policy->cpu); cpufreq_cpu_put(policy); return ret; } EXPORT_SYMBOL_GPL(cpufreq_driver_target); -int cpufreq_driver_getavg(struct cpufreq_policy *policy) +int __cpufreq_driver_getavg(struct cpufreq_policy *policy) { int ret = 0; @@ -1325,20 +1427,15 @@ int cpufreq_driver_getavg(struct cpufreq if (!policy) return -EINVAL; - mutex_lock(&policy->lock); - if (cpu_online(policy->cpu) && cpufreq_driver->getavg) ret = cpufreq_driver->getavg(policy->cpu); - mutex_unlock(&policy->lock); - cpufreq_cpu_put(policy); return ret; } -EXPORT_SYMBOL_GPL(cpufreq_driver_getavg); +EXPORT_SYMBOL_GPL(__cpufreq_driver_getavg); /* - * Locking: Must be called with the lock_cpu_hotplug() lock held * when "event" is CPUFREQ_GOV_LIMITS */ @@ -1420,9 +1517,7 @@ int cpufreq_get_policy(struct cpufreq_po if (!cpu_policy) return -EINVAL; - mutex_lock(&cpu_policy->lock); memcpy(policy, cpu_policy, sizeof(struct cpufreq_policy)); - mutex_unlock(&cpu_policy->lock); cpufreq_cpu_put(cpu_policy); return 0; @@ -1433,7 +1528,6 @@ EXPORT_SYMBOL(cpufreq_get_policy); /* * data : current policy. * policy : policy to be set. - * Locking: Must be called with the lock_cpu_hotplug() lock held */ static int __cpufreq_set_policy(struct cpufreq_policy *data, struct cpufreq_policy *policy) @@ -1539,10 +1633,9 @@ int cpufreq_set_policy(struct cpufreq_po if (!data) return -EINVAL; - lock_cpu_hotplug(); + if (unlikely(lock_policy_rwsem_write(policy->cpu))) + return -EINVAL; - /* lock this CPU */ - mutex_lock(&data->lock); ret = __cpufreq_set_policy(data, policy); data->user_policy.min = data->min; @@ -1550,9 +1643,8 @@ int cpufreq_set_policy(struct cpufreq_po data->user_policy.policy = data->policy; data->user_policy.governor = data->governor; - mutex_unlock(&data->lock); + unlock_policy_rwsem_write(policy->cpu); - unlock_cpu_hotplug(); cpufreq_cpu_put(data); return ret; @@ -1576,8 +1668,8 @@ int cpufreq_update_policy(unsigned int c if (!data) return -ENODEV; - lock_cpu_hotplug(); - mutex_lock(&data->lock); + if (unlikely(lock_policy_rwsem_write(cpu))) + return -EINVAL; dprintk("updating policy for CPU %u\n", cpu); memcpy(&policy, data, sizeof(struct cpufreq_policy)); @@ -1602,8 +1694,8 @@ int cpufreq_update_policy(unsigned int c ret = __cpufreq_set_policy(data, &policy); - mutex_unlock(&data->lock); - unlock_cpu_hotplug(); + unlock_policy_rwsem_write(cpu); + cpufreq_cpu_put(data); return ret; } @@ -1613,31 +1705,28 @@ static int cpufreq_cpu_callback(struct n unsigned long action, void *hcpu) { unsigned int cpu = (unsigned long)hcpu; - struct cpufreq_policy *policy; struct sys_device *sys_dev; + struct cpufreq_policy *policy; sys_dev = get_cpu_sysdev(cpu); - if (sys_dev) { switch (action) { case CPU_ONLINE: cpufreq_add_dev(sys_dev); break; case CPU_DOWN_PREPARE: - /* - * We attempt to put this cpu in lowest frequency - * possible before going down. This will permit - * hardware-managed P-State to switch other related - * threads to min or higher speeds if possible. - */ + if (unlikely(lock_policy_rwsem_write(cpu))) + BUG(); + policy = cpufreq_cpu_data[cpu]; if (policy) { - cpufreq_driver_target(policy, policy->min, + __cpufreq_driver_target(policy, policy->min, CPUFREQ_RELATION_H); } + __cpufreq_remove_dev(sys_dev); break; - case CPU_DEAD: - cpufreq_remove_dev(sys_dev); + case CPU_DOWN_FAILED: + cpufreq_add_dev(sys_dev); break; } } @@ -1751,3 +1840,16 @@ int cpufreq_unregister_driver(struct cpu return 0; } EXPORT_SYMBOL_GPL(cpufreq_unregister_driver); + +static int __init cpufreq_core_init(void) +{ + int cpu; + + for_each_possible_cpu(cpu) { + per_cpu(policy_cpu, cpu) = -1; + init_rwsem(&per_cpu(cpu_policy_rwsem, cpu)); + } + return 0; +} + +core_initcall(cpufreq_core_init); diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index 05d6c22..26f440c 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -429,14 +429,12 @@ static void dbs_check_cpu(int cpu) static void do_dbs_timer(struct work_struct *work) { int i; - lock_cpu_hotplug(); mutex_lock(&dbs_mutex); for_each_online_cpu(i) dbs_check_cpu(i); schedule_delayed_work(&dbs_work, usecs_to_jiffies(dbs_tuners_ins.sampling_rate)); mutex_unlock(&dbs_mutex); - unlock_cpu_hotplug(); } static inline void dbs_timer_init(void) diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c index f697449..d60bcb9 100644 --- a/drivers/cpufreq/cpufreq_ondemand.c +++ b/drivers/cpufreq/cpufreq_ondemand.c @@ -52,19 +52,20 @@ #define TRANSITION_LATENCY_LIMIT (10 * static void do_dbs_timer(struct work_struct *work); /* Sampling types */ -enum dbs_sample {DBS_NORMAL_SAMPLE, DBS_SUB_SAMPLE}; +enum {DBS_NORMAL_SAMPLE, DBS_SUB_SAMPLE}; struct cpu_dbs_info_s { cputime64_t prev_cpu_idle; cputime64_t prev_cpu_wall; struct cpufreq_policy *cur_policy; struct delayed_work work; - enum dbs_sample sample_type; - unsigned int enable; struct cpufreq_frequency_table *freq_table; unsigned int freq_lo; unsigned int freq_lo_jiffies; unsigned int freq_hi_jiffies; + int cpu; + unsigned int enable:1, + sample_type:1; }; static DEFINE_PER_CPU(struct cpu_dbs_info_s, cpu_dbs_info); @@ -402,7 +403,7 @@ static void dbs_check_cpu(struct cpu_dbs if (load < (dbs_tuners_ins.up_threshold - 10)) { unsigned int freq_next, freq_cur; - freq_cur = cpufreq_driver_getavg(policy); + freq_cur = __cpufreq_driver_getavg(policy); if (!freq_cur) freq_cur = policy->cur; @@ -423,9 +424,11 @@ static void dbs_check_cpu(struct cpu_dbs static void do_dbs_timer(struct work_struct *work) { - unsigned int cpu = smp_processor_id(); - struct cpu_dbs_info_s *dbs_info = &per_cpu(cpu_dbs_info, cpu); - enum dbs_sample sample_type = dbs_info->sample_type; + struct cpu_dbs_info_s *dbs_info = + container_of(work, struct cpu_dbs_info_s, work.work); + unsigned int cpu = dbs_info->cpu; + int sample_type = dbs_info->sample_type; + /* We want all CPUs to do sampling nearly on same jiffy */ int delay = usecs_to_jiffies(dbs_tuners_ins.sampling_rate); @@ -434,15 +437,19 @@ static void do_dbs_timer(struct work_str delay -= jiffies % delay; - if (!dbs_info->enable) + if (lock_policy_rwsem_write(cpu) < 0) + return; + + if (!dbs_info->enable) { + unlock_policy_rwsem_write(cpu); return; + } + /* Common NORMAL_SAMPLE setup */ dbs_info->sample_type = DBS_NORMAL_SAMPLE; if (!dbs_tuners_ins.powersave_bias || sample_type == DBS_NORMAL_SAMPLE) { - lock_cpu_hotplug(); dbs_check_cpu(dbs_info); - unlock_cpu_hotplug(); if (dbs_info->freq_lo) { /* Setup timer for SUB_SAMPLE */ dbs_info->sample_type = DBS_SUB_SAMPLE; @@ -454,26 +461,27 @@ static void do_dbs_timer(struct work_str CPUFREQ_RELATION_H); } queue_delayed_work_on(cpu, kondemand_wq, &dbs_info->work, delay); + unlock_policy_rwsem_write(cpu); } -static inline void dbs_timer_init(unsigned int cpu) +static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info) { - struct cpu_dbs_info_s *dbs_info = &per_cpu(cpu_dbs_info, cpu); /* We want all CPUs to do sampling nearly on same jiffy */ int delay = usecs_to_jiffies(dbs_tuners_ins.sampling_rate); delay -= jiffies % delay; + dbs_info->enable = 1; ondemand_powersave_bias_init(); - INIT_DELAYED_WORK_NAR(&dbs_info->work, do_dbs_timer); dbs_info->sample_type = DBS_NORMAL_SAMPLE; - queue_delayed_work_on(cpu, kondemand_wq, &dbs_info->work, delay); + INIT_DELAYED_WORK_NAR(&dbs_info->work, do_dbs_timer); + queue_delayed_work_on(dbs_info->cpu, kondemand_wq, &dbs_info->work, + delay); } static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info) { dbs_info->enable = 0; cancel_delayed_work(&dbs_info->work); - flush_workqueue(kondemand_wq); } static int cpufreq_governor_dbs(struct cpufreq_policy *policy, @@ -502,21 +510,9 @@ static int cpufreq_governor_dbs(struct c mutex_lock(&dbs_mutex); dbs_enable++; - if (dbs_enable == 1) { - kondemand_wq = create_workqueue("kondemand"); - if (!kondemand_wq) { - printk(KERN_ERR - "Creation of kondemand failed\n"); - dbs_enable--; - mutex_unlock(&dbs_mutex); - return -ENOSPC; - } - } rc = sysfs_create_group(&policy->kobj, &dbs_attr_group); if (rc) { - if (dbs_enable == 1) - destroy_workqueue(kondemand_wq); dbs_enable--; mutex_unlock(&dbs_mutex); return rc; @@ -530,7 +526,7 @@ static int cpufreq_governor_dbs(struct c j_dbs_info->prev_cpu_idle = get_cpu_idle_time(j); j_dbs_info->prev_cpu_wall = get_jiffies_64(); } - this_dbs_info->enable = 1; + this_dbs_info->cpu = cpu; /* * Start the timerschedule work, when this governor * is used for first time @@ -550,7 +546,7 @@ static int cpufreq_governor_dbs(struct c dbs_tuners_ins.sampling_rate = def_sampling_rate; } - dbs_timer_init(policy->cpu); + dbs_timer_init(this_dbs_info); mutex_unlock(&dbs_mutex); break; @@ -560,9 +556,6 @@ static int cpufreq_governor_dbs(struct c dbs_timer_exit(this_dbs_info); sysfs_remove_group(&policy->kobj, &dbs_attr_group); dbs_enable--; - if (dbs_enable == 0) - destroy_workqueue(kondemand_wq); - mutex_unlock(&dbs_mutex); break; @@ -591,12 +584,18 @@ static struct cpufreq_governor cpufreq_g static int __init cpufreq_gov_dbs_init(void) { + kondemand_wq = create_workqueue("kondemand"); + if (!kondemand_wq) { + printk(KERN_ERR "Creation of kondemand failed\n"); + return -EFAULT; + } return cpufreq_register_governor(&cpufreq_gov_dbs); } static void __exit cpufreq_gov_dbs_exit(void) { cpufreq_unregister_governor(&cpufreq_gov_dbs); + destroy_workqueue(kondemand_wq); } @@ -608,3 +607,4 @@ MODULE_LICENSE("GPL"); module_init(cpufreq_gov_dbs_init); module_exit(cpufreq_gov_dbs_exit); + diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c index 91ad342..d1c7cac 100644 --- a/drivers/cpufreq/cpufreq_stats.c +++ b/drivers/cpufreq/cpufreq_stats.c @@ -370,12 +370,10 @@ __exit cpufreq_stats_exit(void) cpufreq_unregister_notifier(¬ifier_trans_block, CPUFREQ_TRANSITION_NOTIFIER); unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - lock_cpu_hotplug(); for_each_online_cpu(cpu) { cpufreq_stat_cpu_callback(&cpufreq_stat_cpu_notifier, CPU_DEAD, (void *)(long)cpu); } - unlock_cpu_hotplug(); } MODULE_AUTHOR ("Zou Nan hai "); diff --git a/drivers/cpufreq/cpufreq_userspace.c b/drivers/cpufreq/cpufreq_userspace.c index 2a4eb0b..860345c 100644 --- a/drivers/cpufreq/cpufreq_userspace.c +++ b/drivers/cpufreq/cpufreq_userspace.c @@ -71,7 +71,6 @@ static int cpufreq_set(unsigned int freq dprintk("cpufreq_set for cpu %u, freq %u kHz\n", policy->cpu, freq); - lock_cpu_hotplug(); mutex_lock(&userspace_mutex); if (!cpu_is_managed[policy->cpu]) goto err; @@ -94,7 +93,6 @@ static int cpufreq_set(unsigned int freq err: mutex_unlock(&userspace_mutex); - unlock_cpu_hotplug(); return ret; } diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 7f008f6..0899e2c 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -84,9 +84,6 @@ struct cpufreq_policy { unsigned int policy; /* see above */ struct cpufreq_governor *governor; /* see below */ - struct mutex lock; /* CPU ->setpolicy or ->target may - only be called once a time */ - struct work_struct update; /* if update_policy() needs to be * called, but you're in IRQ context */ @@ -172,11 +169,16 @@ extern int __cpufreq_driver_target(struc unsigned int relation); -extern int cpufreq_driver_getavg(struct cpufreq_policy *policy); +extern int __cpufreq_driver_getavg(struct cpufreq_policy *policy); int cpufreq_register_governor(struct cpufreq_governor *governor); void cpufreq_unregister_governor(struct cpufreq_governor *governor); +int lock_policy_rwsem_read(int cpu); +int lock_policy_rwsem_write(int cpu); +void unlock_policy_rwsem_read(int cpu); +void unlock_policy_rwsem_write(int cpu); + /********************************************************************* * CPUFREQ DRIVER INTERFACE *