The maximum rebalance interval allowed by the multiprocessor balancing
backoff is often not large enough to handle corner cases where there are
lots of tasks pinned on a CPU. Suresh reported:
I see system livelock's if for example I have 7000 processes
pinned onto one cpu (this is on the fastest 8-way system I
have access to).
After this patch, the machine is reported to go well above this number.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
+/*
+ * Max backoff if we encounter pinned tasks. Pretty arbitrary value, but
+ * so long as it is large enough.
+ */
+#define MAX_PINNED_INTERVAL 512
+
/*
* Check this_cpu to ensure it is balanced within domain. Attempt to move
* tasks if there is an imbalance.
/*
* Check this_cpu to ensure it is balanced within domain. Attempt to move
* tasks if there is an imbalance.
struct sched_group *group;
runqueue_t *busiest;
unsigned long imbalance;
struct sched_group *group;
runqueue_t *busiest;
unsigned long imbalance;
- int nr_moved, all_pinned;
+ int nr_moved, all_pinned = 0;
int active_balance = 0;
spin_lock(&this_rq->lock);
int active_balance = 0;
spin_lock(&this_rq->lock);
sd->nr_balance_failed = 0;
/* tune up the balancing interval */
sd->nr_balance_failed = 0;
/* tune up the balancing interval */
- if (sd->balance_interval < sd->max_interval)
+ if ((all_pinned && sd->balance_interval < MAX_PINNED_INTERVAL) ||
+ (sd->balance_interval < sd->max_interval))
sd->balance_interval *= 2;
return 0;
sd->balance_interval *= 2;
return 0;