Subject: Perfmon2: Fix for the occasional system hang when using hardware-sampling

From: Kevin Corry

When using hardware-sampling, when the kernel sampling-buffer fills up,
we have to mask monitoring and let user-space know that the buffer is
full so it can read it into user-space memory. The overall approach is
for user-space to poll on the Perfmon2 context's file-descriptor. Then
when the sampling-buffer fills up, we post a "buffer-full" message in
the Perfmon2 context and wake up any processes polling on the context.

However, the hardware-sampling handler can be called from a few
different code paths. In addition to being called from the PMU interrupt
handler, it is called during a process context-switch when the monitored
process is switched off the CPU. If the sampling-buffer is full during
this code path, then we will, as usual, try to notify user-space that
the buffer is full. Part of the wake_up() call is to lock the runqueue
where the polling process is waiting. However, since we are in the
middle of a process context-switch, the scheduler may have that runqueue
locked already. Therefore we hit a spinlock lockup, which quickly hangs
the system.

To fix this, we simply won't make the call to wake up the polling
user-space process, so we won't ever recurse on the runqueue spinlock.
Unfortunately, this means that user-space cannot use the poll()
mechanism, and must loop while reading from the context and sleeping.

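For illustration only, the read-and-sleep loop that user-space now needs
might look roughly like the sketch below. It is not part of this patch.
The helper name read_msg_with_sleep() and its parameters are hypothetical,
and the sketch assumes that a non-blocking read() on the context's
file-descriptor fails with EAGAIN while no message is queued.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a Perfmon2 API): fetch one message from a
 * descriptor by retrying a non-blocking read() and sleeping between
 * attempts, instead of waiting in poll().
 *
 * Returns the number of bytes read, or -1 with errno set. errno is
 * ETIMEDOUT when max_tries attempts all found nothing to read.
 */
static ssize_t read_msg_with_sleep(int fd, void *buf, size_t len,
				   long sleep_ns, int max_tries)
{
	struct timespec delay;
	int flags;
	int i;

	delay.tv_sec = sleep_ns / 1000000000L;
	delay.tv_nsec = sleep_ns % 1000000000L;

	/* Never block in read(): nothing will wake us up. */
	flags = fcntl(fd, F_GETFL);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;

	for (i = 0; i < max_tries; i++) {
		ssize_t n = read(fd, buf, len);

		if (n >= 0)
			return n;

		/* Assumption: "no message yet" is reported as EAGAIN. */
		if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
			return -1;

		nanosleep(&delay, NULL);
	}

	errno = ETIMEDOUT;
	return -1;
}
```

A caller would presumably pass the Perfmon2 context's file-descriptor and
a buffer large enough for one union pfarg_msg, process the samples once a
PFM_MSG_CELL_HW_SMPL_BUF_FULL message arrives, and then restart
monitoring. The sleep interval trades wake-up latency against CPU usage.
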
Signed-off-by: Kevin Corry
Acked-by: Carl Love
Signed-off-by: Arnd Bergmann

Index: linux-2.6/arch/powerpc/perfmon/perfmon_cell_hw_smpl.c
===================================================================
--- linux-2.6.orig/arch/powerpc/perfmon/perfmon_cell_hw_smpl.c
+++ linux-2.6/arch/powerpc/perfmon/perfmon_cell_hw_smpl.c
@@ -95,17 +95,17 @@ static int pfm_cell_hw_smpl_init(struct
 }
 
 /**
- * pfm_cell_hw_smpl_notify_user
+ * pfm_cell_hw_smpl_add_msg
  *
- * Add a "buffer full" message to the context and wake up any user-space
- * process that is polling on the context's file descriptor. That process
- * can then read() from the file-descriptor to get a copy of the message.
+ * Add a "buffer full" message to the context. A user-space process can
+ * then read() from the file-descriptor to get a copy of the message.
  **/
-static int pfm_cell_hw_smpl_notify_user(struct pfm_context *ctx)
+static int pfm_cell_hw_smpl_add_msg(struct pfm_context *ctx)
 {
 	union pfarg_msg *msg;
 
 	if (ctx->flags.no_msg) {
+		/* User-space won't be reading any messages. */
 		return 0;
 	}
 
@@ -120,7 +120,7 @@ static int pfm_cell_hw_smpl_notify_user(
 
 	msg->type = PFM_MSG_CELL_HW_SMPL_BUF_FULL;
 
-	return pfm_notify_user(ctx);
+	return 0;
 }
 
 /**
@@ -250,11 +250,8 @@ static int handle_full_buffer(struct pfm
 	 */
 	cbe_write_pm(cpu, pm_interval, set->pmcs[CELL_PMC_PM_INTERVAL]);
 
-	/* Add a message to the context's message queue and wake up any
-	 * user-space program's that are polling on the context's file
-	 * descriptor.
-	 */
-	pfm_cell_hw_smpl_notify_user(ctx);
+	/* Add a message to the context's message queue. */
+	pfm_cell_hw_smpl_add_msg(ctx);
 	/* Mask monitoring until a pfm_restart() occurs.
 	 */
 	pfm_mask_monitoring(ctx, set);
Index: linux-2.6/include/linux/perfmon.h
===================================================================
--- linux-2.6.orig/include/linux/perfmon.h
+++ linux-2.6/include/linux/perfmon.h
@@ -584,7 +584,6 @@ void pfm_switch_sets(struct pfm_context
 
 union pfarg_msg *pfm_get_new_msg(struct pfm_context *ctx);
 void pfm_save_pmds(struct pfm_context *ctx, struct pfm_event_set *set);
-int pfm_notify_user(struct pfm_context *ctx);
 int pfm_ovfl_notify_user(struct pfm_context *ctx,
 			 struct pfm_event_set *set,
 			 unsigned long ip);
Index: linux-2.6/perfmon/perfmon.c
===================================================================
--- linux-2.6.orig/perfmon/perfmon.c
+++ linux-2.6/perfmon/perfmon.c
@@ -477,7 +477,7 @@ static void pfm_flush_pmds(struct task_s
 			set->pmds[i].value += 1 + ovfl_mask;
 			num_ovfls--;
 			PFM_DBG("pmd%u overflowed", i);
-		} 
+		}
 		PFM_DBG("pmd%u set=%u val=0x%llx",
 			i,
 			set->id,
@@ -640,7 +640,7 @@ do_zombie:
 	pfm_release_session(0, 0);
 }
 
-int pfm_notify_user(struct pfm_context *ctx)
+static int pfm_notify_user(struct pfm_context *ctx)
 {
 	if (ctx->state == PFM_CTX_ZOMBIE) {
 		PFM_DBG("ignoring overflow notification, owner is zombie");
@@ -662,7 +662,6 @@ int pfm_notify_user(struct pfm_context *
 
 	return 0;
 }
-EXPORT_SYMBOL(pfm_notify_user);
 
 /*
  * send a counter overflow notification message to