The AVX2 implementation might waste up to a page of stack memory because
of a wrong alignment calculation. This will, in the worst case, increase
the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way to big
for a kernel function. Even worse, it might also allocate *less* bytes
than needed if the stack pointer is already aligned bacause in that case
the 'sub %rbx, %rsp' is effectively moving the stack pointer upwards,
not downwards.
Fix those issues by changing and simplifying the alignment calculation
to use a 32 byte alignment, the alignment really needed.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
/* Align stack */
mov %rsp, %rbx
- and $(0x1000-1), %rbx
- sub $(8+32), %rbx
- sub %rbx, %rsp
+ and $~(0x20-1), %rsp
push %rbx
sub $RESERVE_STACK, %rsp
avx2_zeroupper
add $RESERVE_STACK, %rsp
- pop %rbx
- add %rbx, %rsp
+ pop %rsp
pop %r15
pop %r14