The resource counter, declared in include/linux/res_counter.h,
is meant to simplify resource management in controllers by
providing common accounting infrastructure.

This infrastructure consists of the res_counter structure and the
routines that operate on it.

1. Crucial parts of the res_counter structure

a. unsigned long long usage

The usage value shows the amount of a resource that is consumed
by a group at a given time. The unit of measurement is determined
by the controller that uses this counter, e.g. it can be bytes,
items or any other unit the controller operates on.

b. unsigned long long max_usage

The maximum value that usage has reached over time.

This value is useful when gathering statistical information about
a particular group, as it shows the actual resource requirements
of the group, not just some usage snapshot.

c. unsigned long long limit

The maximum amount of the resource the group is allowed to consume.
If the group requests more, so that the usage value would exceed the
limit, the resource allocation is rejected (see

d. unsigned long long failcnt

failcnt stands for "failures counter". This is the number of
resource allocation attempts that failed.

Protects changes of the above values.

2. Basic accounting routines

a. void res_counter_init(struct res_counter *rc,
			 struct res_counter *rc_parent)

Initializes the resource counter. As usual, this should be the first
routine called for a new counter.

The rc_parent argument can be used to define a hierarchical
child -> parent relationship directly in the res_counter structure;
pass NULL to define no relationship.

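The initialization and the child -> parent link can be pictured with a
small user-space model. This is only a sketch: struct rc_model and the
rc_model_* helpers are hypothetical names invented for illustration,
the lock is left out, and starting with an "unlimited" limit is an
assumption rather than something this document states.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for struct res_counter (no lock). */
struct rc_model {
	unsigned long long usage;      /* amount currently consumed */
	unsigned long long max_usage;  /* highest usage seen so far */
	unsigned long long limit;      /* usage may not exceed this */
	unsigned long long failcnt;    /* number of rejected charges */
	struct rc_model *parent;       /* NULL means no relationship */
};

/* Plays the role of res_counter_init(): clear the statistics and
 * record the parent link. */
static void rc_model_init(struct rc_model *rc, struct rc_model *parent)
{
	rc->usage = 0;
	rc->max_usage = 0;
	rc->limit = ~0ULL;	/* assumption: unlimited until configured */
	rc->failcnt = 0;
	rc->parent = parent;
}

/* Number of counters on the path from rc up to the root, rc included. */
static int rc_model_depth(const struct rc_model *rc)
{
	int depth = 0;

	for (; rc != NULL; rc = rc->parent)
		depth++;
	return depth;
}
```

A counter initialized with a NULL parent has depth 1 in this model; a
counter whose parent is such a root has depth 2.
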
c. int res_counter_charge(struct res_counter *rc, unsigned long val,
			  struct res_counter **limit_fail_at)

When a resource is about to be allocated, it has to be accounted
with the appropriate resource counter (the controller should determine
which one to use on its own). This operation is called "charging".

It is not very important which operation, resource allocation
or charging, is performed first, but
  * if the allocation is performed first, this may create a
    temporary resource over-usage by the time the resource counter is
    charged;
  * if the charging is performed first, then it should be uncharged
    on the error path (if there is one).

If the charging fails and a hierarchical dependency exists, the
limit_fail_at parameter is set to the particular res_counter element
where the charging failed.

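The hierarchical behaviour of charging can be sketched in user space as
follows. This is a simplified model, not the kernel code: struct
rc_model and rc_model_charge() are hypothetical names, locking is
omitted, and the choice of -ENOMEM as the error value and the rollback
of already charged levels are assumptions of the sketch.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical user-space stand-in for struct res_counter (no lock). */
struct rc_model {
	unsigned long long usage;
	unsigned long long max_usage;
	unsigned long long limit;
	unsigned long long failcnt;
	struct rc_model *parent;
};

/* Charge val to rc and to every ancestor. If some level would exceed
 * its limit, bump that level's failcnt, report it via limit_fail_at,
 * undo the levels charged so far and return -ENOMEM. */
static int rc_model_charge(struct rc_model *rc, unsigned long val,
			   struct rc_model **limit_fail_at)
{
	struct rc_model *c, *u;

	*limit_fail_at = NULL;
	for (c = rc; c != NULL; c = c->parent) {
		if (c->usage + val > c->limit) {
			c->failcnt++;
			*limit_fail_at = c;
			/* roll back the levels below the failing one */
			for (u = rc; u != c; u = u->parent)
				u->usage -= val;
			return -ENOMEM;
		}
		c->usage += val;
		if (c->usage > c->max_usage)
			c->max_usage = c->usage;
	}
	return 0;
}
```

With a child whose limit is generous and a parent whose limit is tight,
a failing charge leaves both usage values unchanged and points
limit_fail_at at the parent, which tells the caller which level of the
hierarchy is the bottleneck.
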
d. u64 res_counter_uncharge(struct res_counter *rc, unsigned long val)

When a resource is released (freed), it should be de-accounted
from the resource counter it was accounted to. This is called
"uncharging". The return value of this function indicates the amount
of charges still present in the counter.

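A user-space sketch of uncharging, under stated assumptions: struct
rc_model and rc_model_uncharge() are hypothetical names, locking is
omitted, the uncharge is assumed to propagate to every ancestor, and
an uncharge larger than the current usage is assumed to be clamped
instead of wrapping around.

```c
#include <stddef.h>

/* Hypothetical stand-in holding only the fields used below. */
struct rc_model {
	unsigned long long usage;
	struct rc_model *parent;
};

/* Remove val from rc and from every ancestor. Returns the usage that
 * is still charged to rc itself after the operation. */
static unsigned long long rc_model_uncharge(struct rc_model *rc,
					    unsigned long val)
{
	unsigned long long remaining = 0;
	struct rc_model *c;

	for (c = rc; c != NULL; c = c->parent) {
		unsigned long long dec = val;

		/* never let usage drop below zero */
		if (dec > c->usage)
			dec = c->usage;
		c->usage -= dec;
		if (c == rc)
			remaining = c->usage;
	}
	return remaining;
}
```
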
The _locked routines expect the caller to hold res_counter->lock.

e. u64 res_counter_uncharge_until
		(struct res_counter *rc, struct res_counter *top,

Almost the same as res_counter_uncharge(), but propagation of the
uncharge stops when rc == top. This is useful when killing a
res_counter in

2.1 Other accounting routines

There are more routines that may help you with common needs, like
checking whether the limit is reached or resetting the max_usage
value. They are all declared in include/linux/res_counter.h.

3. Analyzing the resource counter registrations

a. If the failcnt value constantly grows, the counter's limit is too
   tight. Either the group is misbehaving and consumes too many
   resources, or the configuration is not suitable for the group
   and the limit should be increased.

b. The max_usage value can be used to quickly tune the group. One may
   set the limits to maximal values and either load the container with
   a common pattern or leave it running for a while. After this, the
   max_usage value shows the amount of memory the container would
   require during

   Setting the limit a bit above this value gives a pretty good
   configuration that works in most cases.

c. If the max_usage is much less than the limit, but the failcnt value
   is growing, then the group tries to allocate a big chunk of resource

d. If the max_usage is much less than the limit, but the failcnt value
   is 0, then the group has been given a higher limit than it
   requires. It is better to lower the limit a bit leaving more resource

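The rules of thumb in this section can be written down as a small
classifier. This is only an illustrative sketch: the enum, the function
name and the two failcnt samples are hypothetical, and "much less than
the limit" is arbitrarily taken to mean "below half of the limit".

```c
/* Hypothetical outcome codes for the cases described in this section. */
enum rc_hint {
	RC_HINT_OK,		/* nothing stands out */
	RC_HINT_LIMIT_TOO_TIGHT,	/* failcnt grows, usage near limit */
	RC_HINT_BIG_CHUNKS,	/* failcnt grows, usage far below limit */
	RC_HINT_LIMIT_TOO_HIGH,	/* no failures, usage far below limit */
};

/* Classify a counter from its max_usage, its limit and two failcnt
 * samples taken some time apart. */
static enum rc_hint rc_model_diagnose(unsigned long long max_usage,
				      unsigned long long limit,
				      unsigned long long failcnt_before,
				      unsigned long long failcnt_now)
{
	int growing = failcnt_now > failcnt_before;
	int far_below = max_usage < limit / 2;	/* assumed threshold */

	if (growing)
		return far_below ? RC_HINT_BIG_CHUNKS
				 : RC_HINT_LIMIT_TOO_TIGHT;
	if (far_below && failcnt_now == 0)
		return RC_HINT_LIMIT_TOO_HIGH;
	return RC_HINT_OK;
}
```

The threshold is the part most likely to need adjusting for a real
controller; the ordering of the checks simply follows cases a, c and d.
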
4. Communication with the control groups subsystem (cgroups)

All resource controllers that use cgroups and resource counters
should provide files (in the cgroup filesystem) to work with the resource
counter fields. They are recommended to adhere to the following rules:

	Field		File name
	---------------------------------------------------
	usage		usage_in_<unit_of_measurement>
	max_usage	max_usage_in_<unit_of_measurement>
	limit		limit_in_<unit_of_measurement>

b. Reading from file should show the corresponding field value in the

	Field		Expected behavior
	----------------------------------
	max_usage	reset to usage
	failcnt		reset to zero

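A controller could derive the file names and implement the write
behaviour listed above roughly as follows. This is a user-space sketch
with hypothetical names (struct rc_model and the rc_model_* helpers);
it does not use the cgroup file interface, and it covers only the rows
shown in the tables.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the fields used below. */
struct rc_model {
	unsigned long long usage;
	unsigned long long max_usage;
	unsigned long long failcnt;
};

/* Build "<field>_in_<unit>", e.g. "usage_in_bytes". Returns the
 * snprintf() result, i.e. the length the full name needs. */
static int rc_model_file_name(char *buf, size_t len,
			      const char *field, const char *unit)
{
	return snprintf(buf, len, "%s_in_%s", field, unit);
}

/* A write to the max_usage file resets the peak to the current usage. */
static void rc_model_write_max_usage(struct rc_model *rc)
{
	rc->max_usage = rc->usage;
}

/* A write to the failcnt file resets the failure counter to zero. */
static void rc_model_write_failcnt(struct rc_model *rc)
{
	rc->failcnt = 0;
}
```
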
a. Declare a task group (take a look at the cgroups subsystem for this)
   and fold a res_counter into it

	struct res_counter res;

b. Put hooks in resource allocation/release paths

	int alloc_something(...)
	{
		if (res_counter_charge(res_counter_ptr, amount, &limit_fail_at) < 0)
			<fail the allocation>
		<allocate the resource and return to the caller>
	}

	void release_something(...)
	{
		res_counter_uncharge(res_counter_ptr, amount);
		<release the resource>
	}

In order to keep the usage value self-consistent, both the
"res_counter_ptr" and the "amount" in release_something() should be
the same as they were in alloc_something() when the resource being
released was allocated.

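One way to guarantee that the release path sees the same counter and
amount is to store both next to the resource. The following user-space
sketch shows the idea; struct rc_flat, struct tracked_buf and the
helper functions are hypothetical, the counter is flat (no hierarchy,
no lock), and malloc() stands in for whatever resource the controller
manages.

```c
#include <stdlib.h>
#include <errno.h>

/* Hypothetical flat counter: just usage and limit. */
struct rc_flat {
	unsigned long long usage;
	unsigned long long limit;
};

/* The resource together with the charge that was made for it. */
struct tracked_buf {
	struct rc_flat *rc;	/* counter that was charged */
	unsigned long amount;	/* amount that was charged */
	void *mem;
};

static int rc_flat_charge(struct rc_flat *rc, unsigned long val)
{
	if (rc->usage + val > rc->limit)
		return -ENOMEM;
	rc->usage += val;
	return 0;
}

static void rc_flat_uncharge(struct rc_flat *rc, unsigned long val)
{
	rc->usage -= (val > rc->usage) ? rc->usage : val;
}

/* Charge first, then allocate; uncharge again on the error path. */
static int alloc_something(struct rc_flat *rc, unsigned long amount,
			   struct tracked_buf *out)
{
	if (rc_flat_charge(rc, amount) < 0)
		return -ENOMEM;
	out->mem = malloc(amount);
	if (out->mem == NULL) {
		rc_flat_uncharge(rc, amount);
		return -ENOMEM;
	}
	out->rc = rc;
	out->amount = amount;
	return 0;
}

/* Uncharge exactly what alloc_something() charged, then free. */
static void release_something(struct tracked_buf *buf)
{
	rc_flat_uncharge(buf->rc, buf->amount);
	free(buf->mem);
	buf->mem = NULL;
}
```

Because the counter pointer and the amount travel with the buffer, the
caller cannot accidentally uncharge a different counter or a different
amount than was charged.
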
c. Provide a way to read res_counter values and set them (cgroups
   can still help with this).

d. Compile and run :)