summaryrefslogtreecommitdiff
path: root/drivers/gpu/host1x
AgeCommit message (Collapse)Author
2020-08-19gpu: host1x: debug: Fix multiple channels emitting messages simultaneouslyDmitry Osipenko
[ Upstream commit 35681862808472a0a4b9a8817ae2789c0b5b3edc ] Once channel's job is hung, it dumps the channel's state into KMSG before tearing down the offending job. If multiple channels hang at once, then they dump messages simultaneously, making the debug info unreadable, and thus, useless. This patch adds mutex which allows only one channel to emit debug messages at a time. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16gpu: host1x: Detach driver on unregisterThierry Reding
[ Upstream commit d9a0a05bf8c76e6dc79230669a8b5d685b168c30 ] Currently when a host1x device driver is unregistered, it is not detached from the host1x controller, which means that the device will stay around and when the driver is registered again, it may bind to the old, stale device rather than the new one that was created from scratch upon driver registration. This in turn can cause various weird crashes within the driver core because it is confronted with a device that was already deleted. Fix this by detaching the driver from the host1x controller when it is unregistered. This ensures that the deleted device also is no longer present in the device list that drivers will bind to. Reported-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31gpu: host1x: Allocate gather copy for host1xThierry Reding
[ Upstream commit b78e70c04c149299bd210759d7c7af7c86b89ca8 ] Currently when the gather buffers are copied, they are copied to a buffer that is allocated for the host1x client that wants to execute the command streams in the buffers. However, the gather buffers will be read by the host1x device, which causes SMMU faults if the DMA API is backed by an IOMMU. Fix this by allocating the gather buffer copy for the host1x device, which makes sure that it will be mapped into the host1x's IOVA space if the DMA API is backed by an IOMMU. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-25Merge tag 'drm/tegra/for-5.3-rc1' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v5.3-rc1 This contains a couple of small improvements and cleanups for the Tegra DRM driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
2019-06-13host1x: debugfs_create_dir() can never return NULLGreg Kroah-Hartman
So there is no need to check for a value that can never happen. No need to check the return value all anyway, as any debugfs call can take the result of this function as an option just fine. Cc: Thierry Reding <thierry.reding@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-tegra@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282Thomas Gleixner
Based on 1 normalized pattern(s): this software is licensed under the terms of the gnu general public license version 2 as published by the free software foundation and may be copied distributed and modified under those terms this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 285 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05gpu: host1x: Do not link logical devices to DT nodesThierry Reding
Logical devices created by the host1x bus infrastructure don't need to be associated with a device tree node. Doing so will cause the driver core to attempt to hook up IOMMU operations and fail because it is not a real device. However, for backwards-compatibility, we need to provide various OF_* uevent variables that were previously provided by of_device_uevent() and which are parsed by libdrm in userspace when querying the available devices. Do this by implementing a uevent callback for the host1x bus. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05gpu: host1x: Increase maximum DMA segment sizeThierry Reding
Recent versions of the DMA API debug code have started to warn about violations of the maximum DMA segment size. This is because the segment size defaults to 64 KiB, which can easily be exceeded in large buffer allocations such as used in DRM/KMS for framebuffers. Technically the Tegra SMMU and ARM SMMU don't have a maximum segment size (they map individual pages irrespective of whether they are contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The maximum segment size is a 32-bit unsigned integer, though, so we can't set it to the correct maximum size, which would be the size of the aperture. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05gpu: host1x: Do not output error message for deferred probeThierry Reding
When deferring probe, avoid logging a confusing error message. While at it, make the error message more informational. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 228 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-11gpu: host1x: Program stream ID to bypass without SMMUArnd Bergmann
If SMMU support is not available, fall back to programming the bypass stream ID (0x7f). Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Suggested-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> [treding@nvidia.com: rebase this on top of a later build fix] Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11gpu: host1x: Fix compile error when IOMMU API is not availableStefan Agner
In case the IOMMU API is not available compiling host1x fails with the following error: In file included from drivers/gpu/host1x/hw/host1x06.c:27: drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’: drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’? [-Werror=implicit-function-declaration] struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent); ^~~~~~~~~~~~~~~~~~~~ iommu_fwspec_free Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Continue CDMA execution starting with a next jobDmitry Osipenko
Currently gathers of a hung job are getting NOP'ed and a restarted CDMA executes the NOP'ed gathers. There shouldn't be a reason to not restart CDMA execution starting with a next job, avoiding the unnecessary churning with gathers NOP'ing. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Don't complete a completed jobDmitry Osipenko
There is a chance that the last job has been completed at the time of CDMA timeout handler invocation. In this case there is no need to complete the completed job. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Cancel only job that actually got stuckDmitry Osipenko
Host1x doesn't have information about jobs inter-dependency, that is something that will become available once host1x will get a proper jobs scheduler implementation. Currently a hang job causes other unrelated jobs to be canceled, that is a relic from downstream driver which is irrelevant to upstream. Let's cancel only the hanging job and not to touch other jobs in queue. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Optimize CDMA push buffer memory usageThierry Reding
The host1x CDMA push buffer is terminated by a special opcode (RESTART) that tells the CDMA to wrap around to the beginning of the push buffer. To accomodate the RESTART opcode, an extra 4 bytes are allocated on top of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words) that are used for other commands passed to CDMA. This requires that two memory pages are allocated, but most of the second page (4092 bytes) is never used. Decrease the number of slots to 511 so that the RESTART opcode fits within the page. Adjust the push buffer wraparound code to take into account push buffer sizes that are not a power of two. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAENDThierry Reding
The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an absolute address. This can cause SMMU faults if the CDMA fetches past a pushbuffer's IOMMU mapping. Properly setting the DMAEND prevents the CDMA from fetching beyond that address and avoid such issues. This is currently not observed because a whole (almost) page of essentially scratch space absorbs any excessive prefetching by CDMA. However, changing the number of slots in the push buffer can trigger these SMMU faults. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Support 40-bit addressing on Tegra186Thierry Reding
The host1x and clients instantiated on Tegra186 support addressing 40 bits of memory. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Restrict IOVA space to DMA maskThierry Reding
On Tegra186 and later, the ARM SMMU provides an input address space that is 48 bits wide. However, memory clients can only address up to 40 bits. If the geometry is used as-is, allocations of IOVA space can end up in a region that is not addressable by the memory clients. To fix this, restrict the IOVA space to the DMA mask of the host1x device. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Support 40-bit addressingThierry Reding
Tegra186 and later support 40 bits of address space. Additional registers need to be programmed to store the full 40 bits of push buffer addresses. Since command stream gathers can also reside in buffers in a 40-bit address space, a new variant of the GATHER opcode is also introduced. It takes two parameters: the first parameter contains the lower 32 bits of the address and the second parameter contains bits 32 to 39. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Introduce support for wide opcodesThierry Reding
The CDMA push buffer can currently only handle opcodes that take a single word parameter. However, the host1x implementation on Tegra186 and later supports opcodes that require multiple words as parameters. Unfortunately the way the push buffer is structured, these wide opcodes cannot simply be composed of two regular opcodes because that could result in the wide opcode being split across the end of the push buffer and the final RESTART opcode required to wrap the push buffer around would break the wide opcode. One way to fix this would be to remove the concept of slots to simplify push buffer operations. However, that's not entirely trivial and should be done in a separate patch. For now, simply use a different function to push four-word opcodes into the push buffer. Technically only three words are pushed, with the fourth word used as padding to preserve the 2-word alignment required by the slots abstraction. The fourth word is always a NOP opcode. Additional care must be taken when the end of the push buffer is reached. If a four-word opcode doesn't fit into the push buffer without being split by the boundary, NOP opcodes will be introduced and the new wide opcode placed at the beginning of the push buffer. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Program the channel stream IDThierry Reding
When processing command streams, make sure the host1x's stream ID is programmed for the channel so that addresses are properly translated through the SMMU. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Set up stream ID tableThierry Reding
In order to enable the MMIO path stream ID protection provided by the incarnation of host1x found in Tegra186 and later, the host1x must be provided with the list of stream ID register offsets for each of its clients. Some clients (such as VIC) have multiple stream ID registers that are assumed to be contiguous. The host1x is programmed with the base offset and a limit which provide the range of registers that the host1x needs to monitor for writes. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Represent host1x bus devices in debugfsThierry Reding
This new debugfs file represents the state of host1x bus devices, specifying the list of subdevices and marking which ones have successfully registered. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Use completion instead of semaphoreArnd Bergmann
In this usage, the two are completely equivalent, but the completion documents better what is going on, and we generally try to avoid semaphores these days. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-29gpu: host1x: Add Tegra194 supportThierry Reding
The host1x hardware found on Tegra194 is mostly backwards compatible with the version found on Tegra186, with the notable exceptions of the increased number of syncpoints and mlocks. In addition, some rarely used features such as syncpoint wait bases were dropped and some registers had to move around to accomodate the increased number of syncpoints. Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27gpu: host1x: Fix syncpoint ID field size on Tegra186Thierry Reding
The number of syncpoints on Tegra186 is 576 and therefore no longer fits into 8 bits. Increase the size of the syncpoint ID field to 10 in order to accomodate all syncpoints. Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27gpu: host1x: Resize channel register region on Tegra186 and laterThierry Reding
The register region allocated per channel was decreased from 16384 bytes to 256 bytes on Tegra186 and later. Resize the region to make sure every channel (instead of only the first) is properly programmed. Suggested-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26gpu: host1x: Detach Host1x from IOMMU DMA domain on arm32Dmitry Osipenko
Host1x is getting attached to an implicit IOMMU DMA domain if CONFIG_ARM_DMA_USE_IOMMU=y. Since Host1x driver manages IOMMU by itself, Host1x device must be detached from the implicit domain using arch-specific IOMMU-API. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26gpu: host1x: Remove spurious tabThierry Reding
All other assignments have a single space around the = sign, so remove the spurious tab for consistency. Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09gpu: host1x: Check whether size of unpin isn't 0Dmitry Osipenko
Only gather pins are mapped by the Host1x driver, regular BO relocations are not. Check whether size of unpin isn't 0, otherwise IOVA allocation at 0x0 could be erroneously released. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09gpu: host1x: Skip IOMMU initialization if firewall is enabledDmitry Osipenko
Host1x's CDMA can't access the command buffers if IOMMU and Host1x firewall are enabled in the kernels config because firewall doesn't map the copied buffer into IOVA space. Fix this by skipping IOMMU initialization if firewall is enabled as firewall merges sparse cmdbufs into a single contiguous buffer and hence IOMMU isn't needed in this case. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-06-06Merge tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "This starts to support NVIDIA volta hardware with nouveau, and adds amdgpu support for the GPU in the Kabylake-G (the intel + radeon single package chip), along with some initial Intel icelake enabling. Summary: New Drivers: - v3d - driver for broadcom V3D V3.x+ hardware - xen-front - XEN PV display frontend core: - handle zpos normalization in the core - stop looking at legacy pointers in atomic paths - improved scheduler documentation - improved aspect ratio validation - aspect ratio support for 64:27 and 256:135 - drop unused control node code. i915: - Icelake (ICL) enabling - GuC/HuC refactoring - PSR/PSR2 enabling and fixes - DPLL management refactoring - DP MST fixes - NV12 enabling - HDCP improvements - GEM/Execlist/reset improvements - GVT improvements - stolen memory first 4k fix amdgpu: - Vega 20 support - VEGAM support (Kabylake-G) - preOS scanout buffer reservation - power management gfxoff support for raven - SR-IOV fixes - Vega10 power profiles and clock voltage control - scatter/gather display support on CZ/ST amdkfd: - GFX9 dGPU support - userptr memory mapping nouveau: - major refactoring for Volta GV100 support tda998x: - HDMI i2c CEC support etnaviv: - removed unused logging code - license text cleanups - MMU handling improvements - timeout fence fix for 50 days uptime tegra: - IOMMU support in gr2d/gr3d drivers - zpos support vc4: - syncobj support - CTM, plane alpha and async cursor support analogix_dp: - HPD and aux chan fixes sun4i: - MIPI DSI support tilcdc: - clock divider fixes for OMAP-l138 LCDK board rcar-du: - R8A77965 support - dma-buf fences fixes - hardware indexed crtc/du group handling - generic zplane property support atmel-hclcdc: - generic zplane property support mediatek: - use generic video mode function exynos: - S5PV210 FIMD variant support - IPP v2 framework - more HW overlays support" * tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drm: (1286 commits) drm/amdgpu: fix 32-bit build warning drm/exynos: fimc: signedness bug in fimc_setup_clocks() drm/exynos: scaler: fix static checker warning drm/amdgpu: Use dev_info() to report amdkfd is not supported for this ASIC drm/amd/display: Remove use of division operator for long longs drm/amdgpu: Update GFX info structure to match what vega20 used drm/amdgpu/pp: remove duplicate assignment drm/sched: add rcu_barrier after entity fini drm/amdgpu: move VM BOs on LRU again drm/amdgpu: consistenly use VM moved flag drm/amdgpu: kmap PDs/PTs in amdgpu_vm_update_directories drm/amdgpu: further optimize amdgpu_vm_handle_moved drm/amdgpu: cleanup amdgpu_vm_validate_pt_bos v2 drm/amdgpu: rework VM state machine lock handling v2 drm/amdgpu: Add runtime VCN PG support drm/amdgpu: Enable VCN static PG by default on RV drm/amdgpu: Add VCN static PG support on RV drm/amdgpu: Enable VCN CG by default on RV drm/amdgpu: Add static CG control for VCN on RV drm/exynos: Fix default value for zpos plane property ...
2018-05-18gpu: host1x: Use not explicitly sized typesThierry Reding
The number of words and the offset in a gather don't need to be explicitly sized, so make them unsigned int instead. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Rename relocarray -> relocs for consistencyThierry Reding
All other array variables use a plural, and this is the only one using the *array suffix. This is confusing, so rename it for consistency. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Drop unnecessary host1x argumentThierry Reding
Functions taking a pointer to a host1x syncpoint as an argument don't need to specify a pointer to a host1x instance because it can be obtained from the syncpoint. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Cleanup loop variable usageThierry Reding
Use unsigned int where possible and don't unnecessarily initialize the loop variable. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Store pointer to client in jobsThierry Reding
Rather than storing some identifier derived from the application context that can't be used concretely anywhere, store a pointer to the client directly so that accesses can be made directly through that client object. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Remove wait check supportThierry Reding
The job submission userspace ABI doesn't support this and there are no plans to implement it, so all of this code is dead and can be removed. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18gpu: host1x: Fix compiler errors by converting to dma_addr_tEmil Goode
The compiler is complaining with the following errors: drivers/gpu/host1x/cdma.c:94:48: error: passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type [-Werror=incompatible-pointer-types] drivers/gpu/host1x/cdma.c:113:48: error: passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type [-Werror=incompatible-pointer-types] The expected pointer type of the third argument to dma_alloc_wc() is dma_addr_t but phys_addr_t is passed. Change the phys member of struct push_buffer to be dma_addr_t so that we pass the correct type to dma_alloc_wc(). Also check pb->mapped for non-NULL in the destroy function as that is the right way of checking if dma_alloc_wc() was successful. Signed-off-by: Emil Goode <emil.fsw@goode.io> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-17gpu: host1x: Acquire a reference to the IOVA cacheThierry Reding
The IOVA API uses a memory cache to allocate IOVA nodes from. To make sure that this cache is available, obtain a reference to it and release the reference when the cache is no longer needed. On 64-bit ARM this is hidden by the fact that the DMA mapping API gets that reference and never releases it. On 32-bit ARM, this is papered over by the Tegra DRM driver (the sole user of the host1x API requiring the cache) acquiring a reference to the IOVA cache for its own purposes. However, there may be additional users of this API in the future, so fix this upfront to avoid surprises. Fixes: 404bfb78daf3 ("gpu: host1x: Add IOMMU support") Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-17gpu: host1x: Fix dma_free_wc() argument in the error pathDmitry Osipenko
If IOVA allocation or IOMMU mapping fails, dma_free_wc() is invoked with size=0 because of a typo, that triggers "kernel BUG at mm/vmalloc.c:124!". Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-03drivers: remove force dma flag from busesChristoph Hellwig
With each bus implementing its own DMA configuration callback, there is no need for bus to explicitly set the force_dma flag. Modify the of_dma_configure function to accept an input parameter which specifies if implicit DMA configuration is required when it is not described by the firmware. Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts Reviewed-by: Rob Herring <robh@kernel.org> [hch: tweaked the changelog a bit] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-03dma-mapping: move dma configuration to bus infrastructureNipun Gupta
ACPI/OF support for configuration of DMA is a bus specific aspect, and thus should be configured by the bus. Introduces a 'dma_configure' bus method so that busses can control their DMA capabilities. Also update the PCI, Platform, ACPI and host1x buses to use the new method. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [hch: simplified host1x_dma_configure based on a comment from Thierry, rewrote changelog] Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-12-21gpu: host1x: Use IOMMU groupsThierry Reding
Use IOMMU groups to attach the host1x device to its IOMMU domain. This is not strictly necessary because the domain isn't shared with any other device, but it makes the code consistent with how IOMMU is handled in other drivers and provides an easy way to detect when no IOMMU has been attached via device tree. Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-13gpu: host1x: Cleanup on initialization failureThierry Reding
When an error happens during the initialization of one of the sub- devices, make sure to properly cleanup all sub-devices that have been initialized up to that point. Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-13gpu: host1x: Rewrite conditional for better readabilityThierry Reding
The current check is slightly difficult to read, rewrite it to improve that a little. Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-11-15Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "This is the main drm pull request for v4.15. Core: - Atomic object lifetime fixes - Atomic iterator improvements - Sparse/smatch fixes - Legacy kms ioctls to be interruptible - EDID override improvements - fb/gem helper cleanups - Simple outreachy patches - Documentation improvements - Fix dma-buf rcu races - DRM mode object leasing for improving VR use cases. - vgaarb improvements for non-x86 platforms. New driver: - tve200: Faraday Technology TVE200 block. This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in the StorLink SL3516 (later Cortina Systems CS3516) as well as the Grain Media GM8180. New bridges: - SiI9234 support New panels: - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba LT089AC19000, Innolux AT043TN24 i915: - Remove Coffeelake from alpha support - Cannonlake workarounds - Infoframe refactoring for DisplayPort - VBT updates - DisplayPort vswing/emph/buffer translation refactoring - CCS fixes - Restore GPU clock boost on missed vblanks - Scatter list updates for userptr allocations - Gen9+ transition watermarks - Display IPC (Isochronous Priority Control) - Private PAT management - GVT: improved error handling and pci config sanitizing - Execlist refactoring - Transparent Huge Page support - User defined priorities support - HuC/GuC firmware refactoring - DP MST fixes - eDP power sequencing fixes - Use RCU instead of stop_machine - PSR state tracking support - Eviction fixes - BDW DP aux channel timeout fixes - LSPCON fixes - Cannonlake PLL fixes amdgpu: - Per VM BO support - Powerplay cleanups - CI powerplay support - PASID mgr for kfd - SR-IOV fixes - initial GPU reset for vega10 - Prime mmap support - TTM updates - Clock query interface for Raven - Fence to handle ioctl - UVD encode ring support on Polaris - Transparent huge page DMA support - Compute LRU pipe tweaks - BO flag to allow buffers to opt out of implicit sync - CTX priority setting API - VRAM lost infrastructure plumbing qxl: - fix flicker since atomic rework amdkfd: - Further improvements from internal AMD tree - Usermode events - Drop radeon support nouveau: - Pascal temperature sensor support - Improved BAR2 handling - MMU rework to support Pascal MMU exynos: - Improved HDMI/mixer support - HDMI audio interface support tegra: - Prep work for tegra186 - Cleanup/fixes msm: - Preemption support for a5xx - Display fixes for 8x96 (snapdragon 820) - Async cursor plane fixes - FW loading rework - GPU debugging improvements vc4: - Prep for DSI panels - fix T-format tiling scanout - New madvise ioctl Rockchip: - LVDS support omapdrm: - omap4 HDMI CEC support etnaviv: - GPU performance counters groundwork sun4i: - refactor driver load + TCON backend - HDMI improvements - A31 support - Misc fixes udl: - Probe/EDID read fixes. tilcdc: - Misc fixes. pl111: - Support more variants adv7511: - Improve EDID handling. - HDMI CEC support sii8620: - Add remote control support" * tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits) drm/rockchip: analogix_dp: Use mutex rather than spinlock drm/mode_object: fix documentation for object lookups. drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU drm/i915: Move init_clock_gating() back to where it was drm/i915: Prune the reservation shared fence array drm/i915: Idle the GPU before shinking everything drm/i915: Lock llist_del_first() vs llist_del_all() drm/i915: Calculate ironlake intermediate watermarks correctly, v2. drm/i915: Disable lazy PPGTT page table optimization for vGPU drm/i915/execlists: Remove the priority "optimisation" drm/i915: Filter out spurious execlists context-switch interrupts drm/amdgpu: use irq-safe lock for kiq->ring_lock drm/amdgpu: bypass lru touch for KIQ ring submission drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() drm/amd/powerplay: initialize a variable before using it drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug drm/rockchip: add CONFIG_OF dependency for lvds ...
2017-11-14Merge tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - turn dma_cache_sync into a dma_map_ops instance and remove implementation that purely are dead because the architecture doesn't support noncoherent allocations - add a flag for busses that need DMA configuration (Robin Murphy) * tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: turn dma_cache_sync into a dma_map_ops method sh: make dma_cache_sync a no-op xtensa: make dma_cache_sync a no-op unicore32: make dma_cache_sync a no-op powerpc: make dma_cache_sync a no-op mn10300: make dma_cache_sync a no-op microblaze: make dma_cache_sync a no-op ia64: make dma_cache_sync a no-op frv: make dma_cache_sync a no-op x86: make dma_cache_sync a no-op floppy: consolidate the dummy fd_cacheflush definition drivers: flag buses which demand DMA configuration