summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)Author
2020-04-24KVM: s390: vsie: Fix possible race when shadowing region 3 tablesDavid Hildenbrand
[ Upstream commit 1493e0f944f3c319d11e067c185c904d01c17ae5 ] We have to properly retry again by returning -EINVAL immediately in case somebody else instantiated the table concurrently. We missed to add the goto in this function only. The code now matches the other, similar shadowing functions. We are overwriting an existing region 2 table entry. All allocated pages are added to the crst_list to be freed later, so they are not lost forever. However, when unshadowing the region 2 table, we wouldn't trigger unshadowing of the original shadowed region 3 table that we replaced. It would get unshadowed when the original region 3 table is modified. As it's not connected to the page table hierarchy anymore, it's not going to get used anymore. However, for a limited time, this page table will stick around, so it's in some sense a temporary memory leak. Identified by manual code inspection. I don't think this classifies as stable material. Fixes: 998f637cc4b9 ("s390/mm: avoid races on region/segment/page table shadowing") Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-4-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24s390/cpuinfo: fix wrong output when CPU0 is offlineAlexander Gordeev
[ Upstream commit 872f27103874a73783aeff2aac2b41a489f67d7c ] /proc/cpuinfo should not print information about CPU 0 when it is offline. Fixes: 281eaa8cb67c ("s390/cpuinfo: simplify locking and skip offline cpus early") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> [heiko.carstens@de.ibm.com: shortened commit message] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24s390/diag: fix display of diagnose call statisticsMichael Mueller
commit 6c7c851f1b666a8a455678a0b480b9162de86052 upstream. Show the full diag statistic table and not just parts of it. The issue surfaced in a KVM guest with a number of vcpus defined smaller than NR_DIAG_STAT. Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls") Cc: stable@vger.kernel.org Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24KVM: s390: vsie: Fix delivery of addressing exceptionsDavid Hildenbrand
commit 4d4cee96fb7a3cc53702a9be8299bf525be4ee98 upstream. Whenever we get an -EFAULT, we failed to read in guest 2 physical address space. Such addressing exceptions are reported via a program intercept to the nested hypervisor. We faked the intercept, we have to return to guest 2. Instead, right now we would be returning -EFAULT from the intercept handler, eventually crashing the VM. the correct thing to do is to return 1 as rc == 1 is the internal representation of "we have to go back into g2". Addressing exceptions can only happen if the g2->g3 page tables reference invalid g2 addresses (say, either a table or the final page is not accessible - so something that basically never happens in sane environments. Identified by manual code inspection. Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-3-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [borntraeger@de.ibm.com: fix patch description] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checksDavid Hildenbrand
commit a1d032a49522cb5368e5dfb945a85899b4c74f65 upstream. In case we have a region 1 the following calculation (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11) results in 64. As shifts beyond the size are undefined the compiler is free to use instructions like sllg. sllg will only use 6 bits of the shift value (here 64) resulting in no shift at all. That means that ALL addresses will be rejected. The can result in endless loops, e.g. when prefix cannot get mapped. Fixes: 4be130a08420 ("s390/mm: add shadow gmap support") Tested-by: Janosch Frank <frankja@linux.ibm.com> Reported-by: Janosch Frank <frankja@linux.ibm.com> Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-2-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [borntraeger@de.ibm.com: fix patch description, remove WARN_ON_ONCE] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in ↵Nathan Chancellor
storage_key_init_range commit 380324734956c64cd060e1db4304f3117ac15809 upstream. Clang warns: In file included from ../arch/s390/purgatory/purgatory.c:10: In file included from ../include/linux/kexec.h:18: In file included from ../include/linux/crash_core.h:6: In file included from ../include/linux/elfcore.h:5: In file included from ../include/linux/user.h:1: In file included from ../arch/s390/include/asm/user.h:11: ../arch/s390/include/asm/page.h:45:6: warning: converting the result of '<<' to a boolean always evaluates to false [-Wtautological-constant-compare] if (PAGE_DEFAULT_KEY) ^ ../arch/s390/include/asm/page.h:23:44: note: expanded from macro 'PAGE_DEFAULT_KEY' #define PAGE_DEFAULT_KEY (PAGE_DEFAULT_ACC << 4) ^ 1 warning generated. Explicitly compare this against zero to silence the warning as it is intended to be used in a boolean context. Fixes: de3fa841e429 ("s390/mm: fix compile for PAGE_DEFAULT_KEY != 0") Link: https://github.com/ClangBuiltLinux/linux/issues/860 Link: https://lkml.kernel.org/r/20200214064207.10381-1-natechancellor@gmail.com Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28s390/ftrace: generate traced function stack frameVasily Gorbik
[ Upstream commit 45f7a0da600d3c409b5ad8d5ddddacd98ddc8840 ] Currently backtrace from ftraced function does not contain ftraced function itself. e.g. for "path_openat": arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Ftraced function is expected in the backtrace by ftrace kselftests, which are now failing. It would also be nice to have it for clarity reasons. "ftrace_caller" itself is called without stack frame allocated for it and does not store its caller (ftraced function). Instead it simply allocates a stack frame for "ftrace_trace_function" and sets backchain to point to ftraced function stack frame (which contains ftraced function caller in saved r14). To fix this issue make "ftrace_caller" allocate a stack frame for itself just to store ftraced function for the stack unwinder. As a result backtrace looks like the following: arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code path_openat+0x6/0xd60 <-- ftraced function do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Reported-by: Sven Schnelle <sven.schnelle@ibm.com> Tested-by: Sven Schnelle <sven.schnelle@ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28s390/time: Fix clk type in get_tod_clockNathan Chancellor
commit 0f8a206df7c920150d2aa45574fba0ab7ff6be4f upstream. Clang warns: In file included from ../arch/s390/boot/startup.c:3: In file included from ../include/linux/elf.h:5: In file included from ../arch/s390/include/asm/elf.h:132: In file included from ../include/linux/compat.h:10: In file included from ../include/linux/time.h:74: In file included from ../include/linux/time32.h:13: In file included from ../include/linux/timex.h:65: ../arch/s390/include/asm/timex.h:160:20: warning: passing 'unsigned char [16]' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign] get_tod_clock_ext(clk); ^~~ ../arch/s390/include/asm/timex.h:149:44: note: passing argument to parameter 'clk' here static inline void get_tod_clock_ext(char *clk) ^ Change clk's type to just be char so that it matches what happens in get_tod_clock_ext. Fixes: 57b28f66316d ("[S390] s390_hypfs: Add new attributes") Link: https://github.com/ClangBuiltLinux/linux/issues/861 Link: http://lkml.kernel.org/r/20200208140858.47970-1-natechancellor@gmail.com Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-12s390/smp: fix physical to logical CPU map for SMTHeiko Carstens
[ Upstream commit 72a81ad9d6d62dcb79f7e8ad66ffd1c768b72026 ] If an SMT capable system is not IPL'ed from the first CPU the setup of the physical to logical CPU mapping is broken: the IPL core gets CPU number 0, but then the next core gets CPU number 1. Correct would be that all SMT threads of CPU 0 get the subsequent logical CPU numbers. This is important since a lot of code (like e.g. the CPU topology code) assumes that CPU maps are setup like this. If the mapping is broken the system will not IPL due to broken topology masks: [ 1.716341] BUG: arch topology broken [ 1.716342] the SMT domain not a subset of the MC domain [ 1.716343] BUG: arch topology broken [ 1.716344] the MC domain not a subset of the BOOK domain This scenario can usually not happen since LPARs are always IPL'ed from CPU 0 and also re-IPL is intiated from CPU 0. However older kernels did initiate re-IPL on an arbitrary CPU. If therefore a re-IPL from an old kernel into a new kernel is initiated this may lead to crash. Fix this by setting up the physical to logical CPU mapping correctly. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12s390/cpum_sf: Avoid SBD overflow condition in irq handlerThomas Richter
[ Upstream commit 0539ad0b22877225095d8adef0c376f52cc23834 ] The s390 CPU Measurement sampling facility has an overflow condition which fires when all entries in a SBD are used. The measurement alert interrupt is triggered and reads out all samples in this SDB. It then tests the successor SDB, if this SBD is not full, the interrupt handler does not read any samples at all from this SDB The design waits for the hardware to fill this SBD and then trigger another meassurement alert interrupt. This scheme works nicely until an perf_event_overflow() function call discards the sample due to a too high sampling rate. The interrupt handler has logic to read out a partially filled SDB when the perf event overflow condition in linux common code is met. This causes the CPUM sampling measurement hardware and the PMU device driver to operate on the same SBD's trailer entry. This should not happen. This can be seen here using this trace: cpumsf_pmu_add: tear:0xb5286000 hw_perf_event_update: sdbt 0xb5286000 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 1. interrupt hw_perf_event_update: sdbt 0xb5286008 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 2. interrupt ... this goes on fine until... hw_perf_event_update: sdbt 0xb5286068 full 1 over 0 flush_all:0 perf_push_sample1: overflow one or more samples read from the IRQ handler are rejected by perf_event_overflow() and the IRQ handler advances to the next SDB and modifies the trailer entry of a partially filled SDB. hw_perf_event_update: sdbt 0xb5286070 full 0 over 0 flush_all:1 timestamp: 14:32:52.519953 Next time the IRQ handler is called for this SDB the trailer entry shows an overflow count of 19 missed entries. hw_perf_event_update: sdbt 0xb5286070 full 1 over 19 flush_all:1 timestamp: 14:32:52.970058 Remove access to a follow on SDB when event overflow happened. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12s390/cpum_sf: Adjust sampling interval to avoid hitting sample limitsThomas Richter
[ Upstream commit 39d4a501a9ef55c57b51e3ef07fc2aeed7f30b3b ] Function perf_event_ever_overflow() and perf_event_account_interrupt() are called every time samples are processed by the interrupt handler. However function perf_event_account_interrupt() has checks to avoid being flooded with interrupts (more then 1000 samples are received per task_tick). Samples are then dropped and a PERF_RECORD_THROTTLED is added to the perf data. The perf subsystem limit calculation is: maximum sample frequency := 100000 --> 1 samples per 10 us task_tick = 10ms = 10000us --> 1000 samples per task_tick The work flow is measurement_alert() uses SDBT head and each SBDT points to 511 SDB pages, each with 126 sample entries. After processing 8 SBDs and for each valid sample calling: perf_event_overflow() perf_event_account_interrupts() there is a considerable amount of samples being dropped, especially when the sample frequency is very high and near the 100000 limit. To avoid the high amount of samples being dropped near the end of a task_tick time frame, increment the sampling interval in case of dropped events. The CPU Measurement sampling facility on the s390 supports only intervals, specifiing how many CPU cycles have to be executed before a sample is generated. Increase the interval when the samples being generated hit the task_tick limit. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04s390/cpum_sf: Check for SDBT and SDB consistencyThomas Richter
[ Upstream commit 247f265fa502e7b17a0cb0cc330e055a36aafce4 ] Each SBDT is located at a 4KB page and contains 512 entries. Each entry of a SDBT points to a SDB, a 4KB page containing sampled data. The last entry is a link to another SDBT page. When an event is created the function sequence executed is: __hw_perf_event_init() +--> allocate_buffers() +--> realloc_sampling_buffers() +---> alloc_sample_data_block() Both functions realloc_sampling_buffers() and alloc_sample_data_block() allocate pages and the allocation can fail. This is handled correctly and all allocated pages are freed and error -ENOMEM is returned to the top calling function. Finally the event is not created. Once the event has been created, the amount of initially allocated SDBT and SDB can be too low. This is detected during measurement interrupt handling, where the amount of lost samples is calculated. If the number of lost samples is too high considering sampling frequency and already allocated SBDs, the number of SDBs is enlarged during the next execution of cpumsf_pmu_enable(). If more SBDs need to be allocated, functions realloc_sampling_buffers() +---> alloc-sample_data_block() are called to allocate more pages. Page allocation may fail and the returned error is ignored. A SDBT and SDB setup already exists. However the modified SDBTs and SDBs might end up in a situation where the first entry of an SDBT does not point to an SDB, but another SDBT, basicly an SBDT without payload. This can not be handled by the interrupt handler, where an SDBT must have at least one entry pointing to an SBD. Add a check to avoid SDBTs with out payload (SDBs) when enlarging the buffer setup. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04s390/disassembler: don't hide instruction addressesIlya Leoshkevich
[ Upstream commit 544f1d62e3e6c6e6d17a5e56f6139208acb5ff46 ] Due to kptr_restrict, JITted BPF code is now displayed like this: 000000000b6ed1b2: ebdff0800024 stmg %r13,%r15,128(%r15) 000000004cde2ba0: 41d0f040 la %r13,64(%r15) 00000000fbad41b0: a7fbffa0 aghi %r15,-96 Leaking kernel addresses to dmesg is not a concern in this case, because this happens only when JIT debugging is explicitly activated, which only root can do. Use %px in this particular instance, and also to print an instruction address in show_code and PCREL (e.g. brasl) arguments in print_insn. While at present functionally equivalent to %016lx, %px is recommended by Documentation/core-api/printk-formats.rst for such cases. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05mm, gup: add missing refcount overflow checks on x86 and s390Vlastimil Babka
The mainline commit 8fde12ca79af ("mm: prevent get_user_pages() from overflowing page refcount") was backported to 4.9.y stable as commit 2ed768cfd895. The backport however missed that in 4.9, there are several arch-specific gup.c versions with fast gup implementations, so these do not prevent refcount overflow. This is partially fixed for x86 in stable-only commit d73af79742e7 ("x86, mm, gup: prevent get_page() race with munmap in paravirt guest"). This stable-only commit adds missing parts to x86 version, as well as s390 version, both taken from the SUSE SLES/openSUSE 4.12-based kernels. The remaining architectures with own gup.c are sparc, mips, sh. It's unlikely the known overflow scenario based on FUSE, which needs 140GB of RAM, is a problem for those architectures, and I don't feel confident enough to patch them. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-05KVM: s390: unregister debug feature on failing arch initMichael Mueller
[ Upstream commit 308c3e6673b012beecb96ef04cc65f4a0e7cdd99 ] Make sure the debug feature and its allocated resources get released upon unsuccessful architecture initialization. A related indication of the issue will be reported as kernel message. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20181130143215.69496-2-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28s390/perf: Return error when debug_register failsThomas Richter
[ Upstream commit ec0c0bb489727de0d4dca6a00be6970ab8a3b30a ] Return an error when the function debug_register() fails allocating the debug handle. Also remove the registered debug handle when the initialization fails later on. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25s390/kasan: avoid vdso instrumentationVasily Gorbik
[ Upstream commit 348498458505e202df41b6b9a78da448d39298b7 ] vdso is mapped into user space processes, which won't have kasan shodow mapped. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-16kvm: Convert kvm_lock to a mutexJunaid Shahid
commit 0d9ce162cf46c99628cc5da9510b959c7976735b upstream. It doesn't seem as if there is any particular need for kvm_lock to be a spinlock, so convert the lock to a mutex so that sleepable functions (in particular cond_resched()) can be called while holding it. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [bwh: Backported to 4.9: - Drop changes in kvm_hyperv_tsc_notifier(), vm_stat_clear(), vcpu_stat_clear(), kvm_uevent_notify_change() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06s390/cmm: fix information leak in cmm_timeout_handler()Yihui ZENG
commit b8e51a6a9db94bc1fb18ae831b3dab106b5a4b5f upstream. The problem is that we were putting the NUL terminator too far: buf[sizeof(buf) - 1] = '\0'; If the user input isn't NUL terminated and they haven't initialized the whole buffer then it leads to an info leak. The NUL terminator should be: buf[len - 1] = '\0'; Signed-off-by: Yihui Zeng <yzeng56@asu.edu> Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06s390/uaccess: avoid (false positive) compiler warningsChristian Borntraeger
[ Upstream commit 062795fcdcb2d22822fb42644b1d76a8ad8439b3 ] Depending on inlining decisions by the compiler, __get/put_user_fn might become out of line. Then the compiler is no longer able to tell that size can only be 1,2,4 or 8 due to the check in __get/put_user resulting in false positives like ./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’: ./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 113 | return rc; | ^~ ./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’: ./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 143 | return rc; | ^~ These functions are supposed to be always inlined. Mark it as such. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-17s390/topology: avoid firing events before kobjs are createdVasily Gorbik
commit f3122a79a1b0a113d3aea748e0ec26f2cb2889de upstream. arch_update_cpu_topology is first called from: kernel_init_freeable->sched_init_smp->sched_init_domains even before cpus has been registered in: kernel_init_freeable->do_one_initcall->s390_smp_init Do not trigger kobject_uevent change events until cpu devices are actually created. Fixes the following kasan findings: BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb40/0xee0 Read of size 8 at addr 0000000000000020 by task swapper/0/1 BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb36/0xee0 Read of size 8 at addr 0000000000000018 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: ([<0000000143c6db7e>] show_stack+0x14e/0x1a8) [<0000000145956498>] dump_stack+0x1d0/0x218 [<000000014429fb4c>] print_address_description+0x64/0x380 [<000000014429f630>] __kasan_report+0x138/0x168 [<0000000145960b96>] kobject_uevent_env+0xb36/0xee0 [<0000000143c7c47c>] arch_update_cpu_topology+0x104/0x108 [<0000000143df9e22>] sched_init_domains+0x62/0xe8 [<000000014644c94a>] sched_init_smp+0x3a/0xc0 [<0000000146433a20>] kernel_init_freeable+0x558/0x958 [<000000014599002a>] kernel_init+0x22/0x160 [<00000001459a71d4>] ret_from_fork+0x28/0x30 [<00000001459a71dc>] kernel_thread_starter+0x0/0x10 Cc: stable@vger.kernel.org Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17KVM: s390: Test for bad access register and size at the start of S390_MEM_OPThomas Huth
commit a13b03bbb4575b350b46090af4dfd30e735aaed1 upstream. If the KVM_S390_MEM_OP ioctl is called with an access register >= 16, then there is certainly a bug in the calling userspace application. We check for wrong access registers, but only if the vCPU was already in the access register mode before (i.e. the SIE block has recorded it). The check is also buried somewhere deep in the calling chain (in the function ar_translation()), so this is somewhat hard to find. It's better to always report an error to the userspace in case this field is set wrong, and it's safer in the KVM code if we block wrong values here early instead of relying on a check somewhere deep down the calling chain, so let's add another check to kvm_s390_guest_mem_op() directly. We also should check that the "size" is non-zero here (thanks to Janosch Frank for the hint!). If we do not check the size, we could call vmalloc() with this 0 value, and this will cause a kernel warning. Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lkml.kernel.org/r/20190829122517.31042-1-thuth@redhat.com Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-07hypfs: Fix error number left in struct pointer memberDavid Howells
[ Upstream commit b54c64f7adeb241423cd46598f458b5486b0375e ] In hypfs_fill_super(), if hypfs_create_update_file() fails, sbi->update_file is left holding an error number. This is passed to hypfs_kill_super() which doesn't check for this. Fix this by not setting sbi->update_value until after we've checked for error. Fixes: 24bbb1faf3f0 ("[PATCH] s390_hypfs filesystem") Signed-off-by: David Howells <dhowells@redhat.com> cc: Martin Schwidefsky <schwidefsky@de.ibm.com> cc: Heiko Carstens <heiko.carstens@de.ibm.com> cc: linux-s390@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05s390/crypto: xts-aes-s390 fix extra run-time crypto self tests findingHarald Freudenberger
[ Upstream commit 9e323d45ba94262620a073a3f9945ca927c07c71 ] With 'extra run-time crypto self tests' enabled, the selftest for s390-xts fails with alg: skcipher: xts-aes-s390 encryption unexpectedly succeeded on test vector "random: len=0 klen=64"; expected_error=-22, cfg="random: inplace use_digest nosimd src_divs=[2.61%@+4006, 84.44%@+21, 1.55%@+13, 4.50%@+344, 4.26%@+21, 2.64%@+27]" This special case with nbytes=0 is not handled correctly and this fix now makes sure that -EINVAL is returned when there is en/decrypt called with 0 bytes to en/decrypt. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21s390/bpf: use 32-bit index for tail callsIlya Leoshkevich
[ Upstream commit 91b4db5313a2c793aabc2143efb8ed0cf0fdd097 ] "p runtime/jit: pass > 32bit index to tail_call" fails when bpf_jit_enable=1, because the tail call is not executed. This in turn is because the generated code assumes index is 64-bit, while it must be 32-bit, and as a result prog array bounds check fails, while it should pass. Even if bounds check would have passed, the code that follows uses 64-bit index to compute prog array offset. Fix by using clrj instead of clgrj for comparing index with array size, and also by using llgfr for truncating index to 32 bits before using it to compute prog array offset. Fixes: 6651ee070b31 ("s390/bpf: implement bpf_tail_call() helper") Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21s390/bpf: fix lcgr instruction encodingIlya Leoshkevich
[ Upstream commit bb2d267c448f4bc3a3389d97c56391cb779178ae ] "masking, test in bounds 3" fails on s390, because BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0) ignores the top 32 bits of BPF_REG_2. The reason is that JIT emits lcgfr instead of lcgr. The associated comment indicates that the code was intended to emit lcgr in the first place, it's just that the wrong opcode was used. Fix by using the correct opcode. Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctlThomas Huth
commit 53936b5bf35e140ae27e4bbf0447a61063f400da upstream. When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject an interrupt, we convert them from the legacy struct kvm_s390_interrupt to the new struct kvm_s390_irq via the s390int_to_s390irq() function. However, this function does not take care of all types of interrupts that we can inject into the guest later (see do_inject_vcpu()). Since we do not clear out the s390irq values before calling s390int_to_s390irq(), there is a chance that we copy random data from the kernel stack which could be leaked to the userspace later. Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT interrupt: s390int_to_s390irq() does not handle it, and the function __inject_pfault_init() later copies irq->u.ext which contains the random kernel stack data. This data can then be leaked either to the guest memory in __deliver_pfault_init(), or the userspace might retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl. Fix it by handling that interrupt type in s390int_to_s390irq(), too, and by making sure that the s390irq struct is properly pre-initialized. And while we're at it, make sure that s390int_to_s390irq() now directly returns -EINVAL for unknown interrupt types, so that we immediately get a proper error code in case we add more interrupt types to do_inject_vcpu() without updating s390int_to_s390irq() sometime in the future. Cc: stable@vger.kernel.org Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25bpf: get rid of pure_initcall dependency to enable jitsDaniel Borkmann
commit fa9dd599b4dae841924b022768354cfde9affecb upstream. Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in future where JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow for setting unspecified garbage values on them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> [bwh: Backported to 4.9 as dependency of commit 2e4a30983b0f "bpf: restrict access to core bpf sysctls": - Drop change in arch/mips/net/ebpf_jit.c - Drop change to bpf_jit_kallsyms - Adjust filenames, context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-21s390: fix stfle zero paddingHeiko Carstens
commit 4f18d869ffd056c7858f3d617c71345cf19be008 upstream. The stfle inline assembly returns the number of double words written (condition code 0) or the double words it would have written (condition code 3), if the memory array it got as parameter would have been large enough. The current stfle implementation assumes that the array is always large enough and clears those parts of the array that have not been written to with a subsequent memset call. If however the array is not large enough memset will get a negative length parameter, which means that memset clears memory until it gets an exception and the kernel crashes. To fix this simply limit the maximum length. Move also the inline assembly to an extra function to avoid clobbering of register 0, which might happen because of the added min_t invocation together with code instrumentation. The bug was introduced with commit 14375bc4eb8d ("[S390] cleanup facility list handling") but was rather harmless, since it would only write to a rather large array. It became a potential problem with commit 3ab121ab1866 ("[S390] kernel: Add z/VM LGR detection"). Since then it writes to an array with only four double words, while some machines already deliver three double words. As soon as machines have a facility bit within the fifth double a crash on IPL would happen. Fixes: 14375bc4eb8d ("[S390] cleanup facility list handling") Cc: <stable@vger.kernel.org> # v2.6.37+ Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-22KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGIONChristian Borntraeger
[ Upstream commit 19ec166c3f39fe1d3789888a74cc95544ac266d4 ] kselftests exposed a problem in the s390 handling for memory slots. Right now we only do proper memory slot handling for creation of new memory slots. Neither MOVE, nor DELETION are handled properly. Let us implement those. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-04s390: limit brk randomization to 32MBMartin Schwidefsky
[ Upstream commit cd479eccd2e057116d504852814402a1e68ead80 ] For a 64-bit process the randomization of the program break is quite large with 1GB. That is as big as the randomization of the anonymous mapping base, for a test case started with '/lib/ld64.so.1 <exec>' it can happen that the heap is placed after the stack. To avoid this limit the program break randomization to 32MB for 64-bit and keep 8MB for 31-bit. Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
2019-01-31s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPUDavid Hildenbrand
commit 60f1bf29c0b2519989927cae640cd1f50f59dc7f upstream. When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read from pcpu_devices->lowcore. However, due to prefixing, that will result in reading from absolute address 0 on that CPU. We have to go via the actual lowcore instead. This means that right now, we will read lc->nodat_stack == 0 and therfore work on a very wrong stack. This BUG essentially broke rebooting under QEMU TCG (which will report a low address protection exception). And checking under KVM, it is also broken under KVM. With 1 VCPU it can be easily triggered. :/# echo 1 > /proc/sys/kernel/sysrq :/# echo b > /proc/sysrq-trigger [ 28.476745] sysrq: SysRq : Resetting [ 28.476793] Kernel stack overflow. [ 28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13 [ 28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux) [ 28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140) [ 28.476861] R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000 [ 28.476864] 0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0 [ 28.476864] 000000000010dff8 0000000000000000 0000000000000000 0000000000000000 [ 28.476865] 000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000 [ 28.476887] Krnl Code: 0000000000115bfe: 4170f000 la %r7,0(%r15) [ 28.476887] 0000000000115c02: 41f0a000 la %r15,0(%r10) [ 28.476887] #0000000000115c06: e370f0980024 stg %r7,152(%r15) [ 28.476887] >0000000000115c0c: c0e5fffff86e brasl %r14,114ce8 [ 28.476887] 0000000000115c12: 41f07000 la %r15,0(%r7) [ 28.476887] 0000000000115c16: a7f4ffa8 brc 15,115b66 [ 28.476887] 0000000000115c1a: 0707 bcr 0,%r7 [ 28.476887] 0000000000115c1c: 0707 bcr 0,%r7 [ 28.476901] Call Trace: [ 28.476902] Last Breaking-Event-Address: [ 28.476920] [<0000000000a01c4a>] arch_call_rest_init+0x22/0x80 [ 28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue. [ 28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13 [ 28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux) [ 28.476932] Call Trace: Fixes: 2f859d0dad81 ("s390/smp: reduce size of struct pcpu") Cc: stable@vger.kernel.org # 4.0+ Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-31s390/smp: fix CPU hotplug deadlock with CPU rescanGerald Schaefer
commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream. smp_rescan_cpus() is called without the device_hotplug_lock, which can lead to a dedlock when a new CPU is found and immediately set online by a udev rule. This was observed on an older kernel version, where the cpu_hotplug_begin() loop was still present, and it resulted in hanging chcpu and systemd-udev processes. This specific deadlock will not show on current kernels. However, there may be other possible deadlocks, and since smp_rescan_cpus() can still trigger a CPU hotplug operation, the device_hotplug_lock should be held. For reference, this was the deadlock with the old cpu_hotplug_begin() loop: chcpu (rescan) systemd-udevd echo 1 > /sys/../rescan -> smp_rescan_cpus() -> (*) get_online_cpus() (increases refcount) -> smp_add_present_cpu() (new CPU found) -> register_cpu() -> device_add() -> udev "add" event triggered -----------> udev rule sets CPU online -> echo 1 > /sys/.../online -> lock_device_hotplug_sysfs() (this is missing in rescan path) -> device_online() -> (**) device_lock(new CPU dev) -> cpu_up() -> cpu_hotplug_begin() (loops until refcount == 0) -> deadlock with (*) -> bus_probe_device() -> device_attach() -> device_lock(new CPU dev) -> deadlock with (**) Fix this by taking the device_hotplug_lock in the CPU rescan path. Cc: <stable@vger.kernel.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-31s390/early: improve machine detectionChristian Borntraeger
commit 03aa047ef2db4985e444af6ee1c1dd084ad9fb4c upstream. Right now the early machine detection code check stsi 3.2.2 for "KVM" and set MACHINE_IS_VM if this is different. As the console detection uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux early for any non z/VM system that sets a different value than KVM. So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR, MACHINE_IS_VM, or MACHINE_IS_KVM. CC: stable@vger.kernel.org Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-17s390/cpum_cf: Reject request for sampling in event initializationThomas Richter
[ Upstream commit 613a41b0d16e617f46776a93b975a1eeea96417c ] On s390 command perf top fails [root@s35lp76 perf] # ./perf top -F100000 --stdio Error: cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' [root@s35lp76 perf] # Using event -e rb0000 works as designed. Event rb0000 is the event number of the sampling facility for basic sampling. During system start up the following PMUs are installed in the kernel's PMU list (from head to tail): cpum_cf --> s390 PMU counter facility device driver cpum_sf --> s390 PMU sampling facility device driver uprobe kprobe tracepoint task_clock cpu_clock Perf top executes following functions and calls perf_event_open(2) system call with different parameters many times: cmd_top --> __cmd_top --> perf_evlist__add_default --> __perf_evlist__add_default --> perf_evlist__new_cycles (creates event type:0 (HW) config 0 (CPU_CYCLES) --> perf_event_attr__set_max_precise_ip Uses perf_event_open(2) to detect correct precise_ip level. Fails 3 times on s390 which is ok. Then functions cmd_top --> __cmd_top --> perf_top__start_counters -->perf_evlist__config --> perf_can_comm_exec --> perf_probe_api This functions test support for the following events: "cycles:u", "instructions:u", "cpu-clock:u" using --> perf_do_probe_api --> perf_event_open_cloexec Test the close on exec flag support with perf_event_open(2). perf_do_probe_api returns true if the event is supported. The function returns true because event cpu-clock is supported by the PMU cpu_clock. This is achieved by many calls to perf_event_open(2). Function perf_top__start_counters now calls perf_evsel__open() for every event, which is the default event cpu_cycles (config:0) and type HARDWARE (type:0) which a predfined frequence of 4000. Given the above order of the PMU list, the PMU cpum_cf gets called first and returns 0, which indicates support for this sampling. The event is fully allocated in the function perf_event_open (file kernel/event/core.c near line 10521 and the following check fails: event = perf_event_alloc(&attr, cpu, task, group_leader, NULL, NULL, NULL, cgroup_fd); if (IS_ERR(event)) { err = PTR_ERR(event); goto err_cred; } if (is_sampling_event(event)) { if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { err = -EOPNOTSUPP; goto err_alloc; } } The check for the interrupt capabilities fails and the system call perf_event_open() returns -EOPNOTSUPP (-95). Add a check to return -ENODEV when sampling is requested in PMU cpum_cf. This allows common kernel code in the perf_event_open() system call to test the next PMU in above list. Fixes: 97b1198fece0 (" "s390, perf: Use common PMU interrupt disabled code") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-01s390/mm: Check for valid vma before zapping in gmap_discardJanosch Frank
commit 1843abd03250115af6cec0892683e70cf2297c25 upstream. Userspace could have munmapped the area before doing unmapping from the gmap. This would leave us with a valid vmaddr, but an invalid vma from which we would try to zap memory. Let's check before using the vma. Fixes: 1e133ab296f3 ("s390/mm: split arch/s390/mm/pgtable.c") Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20180816082432.78828-1-frankja@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-27s390/mm: Fix ERROR: "__node_distance" undefined!Justin M. Forbes
[ Upstream commit a541f0ebcc08ed8bc0cc492eec9a86cb280a9f24 ] Fixes: ERROR: "__node_distance" [drivers/nvme/host/nvme-core.ko] undefined! make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1 make: *** [Makefile:1275: modules] Error 2 + exit 1 Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-27s390/vdso: add missing FORCE to build targetsVasily Gorbik
[ Upstream commit b44b136a3773d8a9c7853f8df716bd1483613cbb ] According to Documentation/kbuild/makefiles.txt all build targets using if_changed should use FORCE as well. Add missing FORCE to make sure vdso targets are rebuild properly when not just immediate prerequisites have changed but also when build command differs. Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-13s390/sthyi: Fix machine name validity indicationJanosch Frank
[ Upstream commit b5130dc2224d1881f24224c0590c6d97f2168d6a ] When running as a level 3 guest with no host provided sthyi support sclp_ocf_cpc_name_copy() will only return zeroes. Zeroes are not a valid group name, so let's not indicate that the group name field is valid. Also the group name is not dependent on stsi, let's not return based on stsi before setting it. Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation") Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-20sched/cputime: Convert kcpustat to nsecsFrederic Weisbecker
commit 7fb1327ee9b92fca27662f9b9d60c7c3376d6c69 upstream. Kernel CPU stats are stored in cputime_t which is an architecture defined type, and hence a bit opaque and requiring accessors and mutators for any operation. Converting them to nsecs simplifies the code and is one step toward the removal of cputime_t in the core code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> [colona: minor conflict as 527b0a76f41d ("sched/cpuacct: Avoid %lld seq_printf warning") is missing from v4.9] Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-03s390/extmem: fix gcc 8 stringop-overflow warningVasily Gorbik
[ Upstream commit 6b2ddf33baec23dace85bd647e3fc4ac070963e8 ] arch/s390/mm/extmem.c: In function '__segment_load': arch/s390/mm/extmem.c:436:2: warning: 'strncat' specified bound 7 equals source length [-Wstringop-overflow=] strncat(seg->res_name, " (DCSS)", 7); What gcc complains about here is the misuse of strncat function, which in this case does not limit a number of bytes taken from "src", so it is in the end the same as strcat(seg->res_name, " (DCSS)"); Keeping in mind that a res_name is 15 bytes, strncat in this case would overflow the buffer and write 0 into alignment byte between the fields in the struct. To avoid that increasing res_name size to 16, and reusing strlcat. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-03s390/mm: correct allocate_pgste proc_handler callbackVasily Gorbik
[ Upstream commit 5bedf8aa03c28cb8dc98bdd32a41b66d8f7d3eaa ] Since proc_dointvec does not perform value range control, proc_dointvec_minmax should be used to limit value range, which is clearly intended here, as the internal representation of the value: unsigned int alloc_pgste:1; In fact it currently works, since we have mm->context.alloc_pgste = page_table_allocate_pgste || ... ... since commit 23fefe119ceb5 ("s390/kvm: avoid global config of vm.alloc_pgste=1") Before that it was mm->context.alloc_pgste = page_table_allocate_pgste; which was broken. That was introduced with commit 0b46e0a3ec0d7 ("s390/kvm: remove delayed reallocation of page tables for KVM"). Fixes: 0b46e0a3ec0d7 ("s390/kvm: remove delayed reallocation of page tables for KVM") Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-19KVM: s390: vsie: copy wrapping keys to right placePierre Morel
commit 204c97245612b6c255edf4e21e24d417c4a0c008 upstream. Copy the key mask to the right offset inside the shadow CRYCB Fixes: bbeaa58b3 ("KVM: s390: vsie: support aes dea wrapping keys") Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Cc: stable@vger.kernel.org # v4.8+ Message-Id: <1535019956-23539-2-git-send-email-pmorel@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15s390/lib: use expoline for all bcr instructionsMartin Schwidefsky
commit 5eda25b10297684c1f46a14199ec00210f3c346e upstream. The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15s390/kdump: Fix memleak in nt_vmcoreinfoPhilipp Rudo
[ Upstream commit 2d2e7075b87181ed0c675e4936e20bdadba02e1f ] The vmcoreinfo of a crashed system is potentially fragmented. Thus the crash kernel has an intermediate step where the vmcoreinfo is copied into a temporary, continuous buffer in the crash kernel memory. This temporary buffer is never freed. Free it now to prevent the memleak. While at it replace all occurrences of "VMCOREINFO" by its corresponding macro to prevent potential renaming issues. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/pci: fix out of bounds access during irq setupSebastian Ott
commit 866f3576a72b2233a76dffb80290f8086dc49e17 upstream. During interrupt setup we allocate interrupt vectors, walk the list of msi descriptors, and fill in the message data. Requesting more interrupts than supported on s390 can lead to an out of bounds access. When we restrict the number of interrupts we should also stop walking the msi list after all supported interrupts are handled. Cc: stable@vger.kernel.org Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/numa: move initial setup of node_to_cpumask_mapMartin Schwidefsky
commit fb7d7518b0d65955f91c7b875c36eae7694c69bd upstream. The numa_init_early initcall sets the node_to_cpumask_map[0] to the full cpu_possible_mask. Unfortunately this early_initcall is too late, the NUMA setup for numa=emu is done even earlier. The order of calls is numa_setup() -> emu_update_cpu_topology(), then the early_initcalls(), followed by sched_init_domains(). Starting with git commit 051f3ca02e46432c0965e8948f00c07d8a2f09c0 "sched/topology: Introduce NUMA identity node sched domain" the incorrect node_to_cpumask_map[0] really screws up the domain setup and the kernel panics with the follow oops: Cc: <stable@vger.kernel.org> # v4.15+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/qdio: reset old sbal_state flagsJulian Wiedmann
commit 64e03ff72623b8c2ea89ca3cb660094e019ed4ae upstream. When allocating a new AOB fails, handle_outbound() is still capable of transmitting the selected buffer (just without async completion). But if a previous transfer on this queue slot used async completion, its sbal_state flags field is still set to QDIO_OUTBUF_STATE_FLAG_PENDING. So when the upper layer driver sees this stale flag, it expects an async completion that never happens. Fix this by unconditionally clearing the flags field. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390: fix br_r1_trampoline for machines without exrlMartin Schwidefsky
commit 26f843848bae973817b3587780ce6b7b0200d3e4 upstream. For machines without the exrl instruction the BFP jit generates code that uses an "br %r1" instruction located in the lowcore page. Unfortunately there is a cut & paste error that puts an additional "larl %r1,.+14" instruction in the code that clobbers the branch target address in %r1. Remove the larl instruction. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/kvm: fix deadlock when killed by oomClaudio Imbrenda
commit 306d6c49ac9ded11114cb53b0925da52f2c2ada1 upstream. When the oom killer kills a userspace process in the page fault handler while in guest context, the fault handler fails to release the mm_sem if the FAULT_FLAG_RETRY_NOWAIT option is set. This leads to a deadlock when tearing down the mm when the process terminates. This bug can only happen when pfault is enabled, so only KVM clients are affected. The problem arises in the rare cases in which handle_mm_fault does not release the mm_sem. This patch fixes the issue by manually releasing the mm_sem when needed. Fixes: 24eb3a824c4f3 ("KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault") Cc: <stable@vger.kernel.org> # 3.15+ Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>