path: root/drivers/nvme/target/configfs.c
Age | Commit message | Author
2021-11-17 | nvmet: fix use-after-free when a port is removed | Israel Rukshin
[ Upstream commit e3e19dcc4c416d65f99f13d55be2b787f8d0050e ] When a port is removed through configfs, any connected controllers are starting teardown flow asynchronously and can still send commands. This causes a use-after-free bug for any command that dereferences req->port (like in nvmet_parse_io_cmd). To fix this, wait for all the teardown scheduled works to complete (like release_work at rdma/tcp drivers). This ensures there are no active controllers when the port is eventually removed. Signed-off-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-31 | nvmet: Fix use-after-free bug when a port is removed | Logan Gunthorpe
When a port is removed through configfs, any connected controllers are still active and can still send commands. This causes a use-after-free bug which is detected by KASAN for any admin command that dereferences req->port (like in nvmet_execute_identify_ctrl). To fix this, disconnect all active controllers when a subsystem is removed from a port. This ensures there are no active controllers when the port is eventually removed. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-09 | nvmet: print a hint while rejecting NSID 0 or 0xffffffff | Mikhail Skorzhinskii
This hint is added for convenience. People have repeatedly spent time working out what exactly was wrong in the configuration process. The hint should save time in such situations, especially for people who are not very familiar with NVMe requirements. Signed-off-by: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
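The check described above can be illustrated in user space. This is a minimal sketch, not the kernel code: `parse_nsid` and `NSID_ALL` are names made up for this example, and the hint text is invented. It only shows the idea of rejecting the two reserved NSID values with an explanatory message instead of a bare error.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name: the broadcast NSID, not valid for a single namespace. */
#define NSID_ALL 0xffffffffu

/*
 * Parse a namespace directory name into an NSID.
 * Returns 0 on success, -EINVAL otherwise. For the two reserved values
 * a hint is printed so the user can see why the name was rejected.
 */
int parse_nsid(const char *name, uint32_t *nsid)
{
    char *end;
    unsigned long long v;

    assert(name != NULL && nsid != NULL);

    errno = 0;
    v = strtoull(name, &end, 0);
    if (errno || end == name || *end != '\0' || v > 0xffffffffull)
        return -EINVAL;

    if (v == 0 || v == NSID_ALL) {
        fprintf(stderr, "invalid nsid %#llx: 0 and 0xffffffff are reserved\n", v);
        return -EINVAL;
    }

    *nsid = (uint32_t)v;
    return 0;
}
```

Calling `parse_nsid("1", &nsid)` succeeds, while `"0"` and `"0xffffffff"` fail with `-EINVAL` and a hint on stderr.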
2019-04-25 | nvmet: return a specified error it subsys_alloc fails | Minwoo Im
nvmet_subsys_alloc() returns its pointer or NULL if it fails. We can see three different steps in this function: 1. memory allocation 2. argument check 3. memory allocation for string. But the callers of this function do not handle case 2: they return -ENOMEM even if it fails with an invalid parameter. This patch specifies error codes so that the caller can pass them on to its own caller. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
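The three steps and their distinct error codes can be sketched in user space. This is an illustration under stated assumptions, not the kernel function: `ERR_PTR`, `PTR_ERR` and `IS_ERR` are re-implemented here in the style of the kernel helpers, and `struct subsys`, `subsys_alloc` and `NAME_MAX_LEN` are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO    4095
#define NAME_MAX_LEN 223   /* assumed limit, chosen for this sketch */

/* User-space imitations of the kernel's error-pointer helpers. */
void *ERR_PTR(long err)        { return (void *)(intptr_t)err; }
long PTR_ERR(const void *p)    { return (long)(intptr_t)p; }
int IS_ERR(const void *p)      { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

struct subsys {
    char *name;
};

struct subsys *subsys_alloc(const char *name)
{
    struct subsys *s;
    size_t len;

    assert(name != NULL);

    s = calloc(1, sizeof(*s));            /* step 1: memory allocation */
    if (!s)
        return ERR_PTR(-ENOMEM);

    len = strlen(name);
    if (len > NAME_MAX_LEN) {             /* step 2: argument check */
        free(s);
        return ERR_PTR(-EINVAL);
    }

    s->name = malloc(len + 1);            /* step 3: allocation for the string */
    if (!s->name) {
        free(s);
        return ERR_PTR(-ENOMEM);
    }
    memcpy(s->name, name, len + 1);
    return s;
}
```

A caller checks `IS_ERR(s)` and forwards `PTR_ERR(s)`, so an invalid name surfaces as `-EINVAL` rather than being reported as an out-of-memory condition.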
2019-02-20 | nvmet: convert to SPDX identifiers | Christoph Hellwig
Update license to use SPDX-License-Identifier instead of verbose license text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-13 | nvmet: allow configfs tcp trtype configuration | Sagi Grimberg
Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07 | nvmet: expose support for fabrics SQ flow control disable in treq | Sagi Grimberg
Technical Proposal introduces an indication for SQ flow control disable support. Expose it since we are able to operate in this mode. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07 | nvmet: don't override treq upon modification. | Sagi Grimberg
Only override the allowed parts of it. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> [hch: slight tweak to the NVME_TREQ_SECURE_CHANNEL_MASK definition] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
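"Only override the allowed parts" amounts to a masked update. The sketch below assumes that the secure-channel setting occupies the low two bits of treq and that the SQ flow control disable flag is bit 2. The constant names are shortened stand-ins invented for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
enum {
    TREQ_NOT_SPECIFIED  = 0,
    TREQ_REQUIRED       = 1,
    TREQ_NOT_REQUIRED   = 2,
    TREQ_DISABLE_SQFLOW = 1 << 2,
};

#define TREQ_SECURE_CHANNEL_MASK \
    (TREQ_NOT_SPECIFIED | TREQ_REQUIRED | TREQ_NOT_REQUIRED)

/*
 * Replace only the secure-channel bits of treq and leave every other
 * bit (such as the SQ flow control flag) untouched.
 */
uint8_t treq_set_secure_channel(uint8_t treq, uint8_t secure)
{
    assert((secure & ~TREQ_SECURE_CHANNEL_MASK) == 0);
    return (uint8_t)((treq & ~TREQ_SECURE_CHANNEL_MASK) | secure);
}
```

Writing a new secure-channel value this way keeps `TREQ_DISABLE_SQFLOW` set if it was set before, which a plain assignment would have cleared.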
2018-12-07 | nvmet: enable Discovery Controller AENs | Jay Sternberg
Add functions to find connections requesting Discovery Change events and send a notification to hosts that maintain an explicit persistent connection and have an active Asynchronous Event Request pending. Only hosts that have access to the subsystem affected by the change will receive notifications of the Discovery Change event. Call these functions each time there is a configfs change that affects the Discovery Log Pages. Set the OAES field in the Identify Controller response to advertise the support for Asynchronous Event Notifications. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-17 | nvmet: Optionally use PCI P2P memory | Logan Gunthorpe
Create a configfs attribute in each nvme-fabrics namespace to enable P2P memory use. The attribute may be enabled (with a boolean) or a specific P2P device may be given (with the device's PCI name). When enabled, the namespace will ensure the underlying block device supports P2P and is compatible with any specified P2P device. If no device was specified it will ensure there is compatible P2P memory somewhere in the system. Enabling a namespace with P2P memory will fail with EINVAL (and an appropriate dmesg error) if any of these conditions are not met. Once a controller is set up on a specific port, the P2P device to use for each namespace will be found and stored in a radix tree by namespace ID. When memory is allocated for a request, the tree is used to look up the P2P device to allocate memory against. If no device is in the tree (because no appropriate device was found), or if allocation of P2P memory fails, fall back to using regular memory. Signed-off-by: Stephen Bates <sbates@raithlin.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> [hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-08-14 | Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block updates from Jens Axboe:

"First pull request for this merge window, there will also be a followup request with some stragglers.

This pull request contains:

- Fix for a thundering herd issue in the wbt block code (Anchal Agarwal)
- A few NVMe pull requests:
  * Improved tracepoints (Keith)
  * Larger inline data support for RDMA (Steve Wise)
  * RDMA setup/teardown fixes (Sagi)
  * Effects log support for NVMe target (Chaitanya Kulkarni)
  * Buffered IO support for NVMe target (Chaitanya Kulkarni)
  * TP4004 (ANA) support (Christoph)
  * Various NVMe fixes
- Block io-latency controller support. Much needed support for properly containing block devices. (Josef)
- Series improving how we handle sense information on the stack (Kees)
- Lightnvm fixes and updates/improvements (Mathias/Javier et al)
- Zoned device support for null_blk (Matias)
- AIX partition fixes (Mauricio Faria de Oliveira)
- DIF checksum code made generic (Max Gurtovoy)
- Add support for discard in iostats (Michael Callahan / Tejun)
- Set of updates for BFQ (Paolo)
- Removal of async write support for bsg (Christoph)
- Bio page dirtying and clone fixups (Christoph)
- Set of bcache fix/changes (via Coly)
- Series improving blk-mq queue setup/teardown speed (Ming)
- Series improving merging performance on blk-mq (Ming)
- Lots of other fixes and cleanups from a slew of folks"

* tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits)
  blkcg: Make blkg_root_lookup() work for queues in bypass mode
  bcache: fix error setting writeback_rate through sysfs interface
  null_blk: add lock drop/acquire annotation
  Blk-throttle: reduce tail io latency when iops limit is enforced
  block: paride: pd: mark expected switch fall-throughs
  block: Ensure that a request queue is dissociated from the cgroup controller
  block: Introduce blk_exit_queue()
  blkcg: Introduce blkg_root_lookup()
  block: Remove two superfluous #include directives
  blk-mq: count the hctx as active before allocating tag
  block: bvec_nr_vecs() returns value for wrong slab
  bcache: trivial - remove tailing backslash in macro BTREE_FLAG
  bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section
  bcache: set max writeback rate when I/O request is idle
  bcache: add code comments for bset.c
  bcache: fix mistaken comments in request.c
  bcache: fix mistaken code comments in bcache.h
  bcache: add a comment in super.c
  bcache: avoid unncessary cache prefetch bch_btree_node_get()
  bcache: display rate debug parameters to 0 when writeback is not running
  ...
2018-07-27 | nvmet: support configuring ANA groups | Christoph Hellwig
Allow creating non-default ANA groups (group ID > 1). Groups are created either by assigning the group ID to a namespace, or by creating a configfs group object under a specific port. All namespaces assigned to a group that doesn't have a configfs object for a given port are marked as inaccessible. Allow changing the ANA state on a per-port basis by creating an ana_groups directory under each port, and another directory with an ana_state file in it. The default ANA group 1 directory is created automatically for each port. For all changes in ANA configuration the ANA change AEN is sent. We only keep a global changecount instead of additional per-group changecounts to keep the implementation as simple as possible. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-27 | nvmet: add minimal ANA support | Christoph Hellwig
Add support for Asymmetric Namespace Access as specified in NVMe 1.3 TP 4004. Just add a default ANA group 1 that is optimized on all ports. This is (and will remain) the default assignment for any namespace not explicitly assigned to another ANA group. The ANA state can be manually changed through the configfs interface, including the change state. Includes fixes and improvements from Hannes Reinecke. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-25 | nvmet: fixup crash on NULL device path | Hannes Reinecke
When writing an empty string into the device_path attribute the kernel will crash with

nvmet: failed to open block device (null): (-22)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000

This patch sanitizes the error handling for invalid device path settings. Fixes: a07b4970 ("nvmet: add a generic NVMe target") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23 | nvmet-rdma: support max(16KB, PAGE_SIZE) inline data | Steve Wise
The patch enables inline data sizes using up to 4 recv sges, and capping the size at 16KB or at least 1 page size. So on a 4K page system, up to 16KB is supported, and for a 64K page system 1 page of 64KB is supported. We avoid > 0 order page allocations for the inline buffers by using multiple recv sges, one for each page. If the device cannot support the configured inline data size due to lack of enough recv sges, then log a warning and reduce the inline size. Add a new configfs port attribute, called param_inline_data_size, to allow configuring the size of inline data for a given nvmf port. The maximum size allowed is still enforced by nvmet-rdma with NVMET_RDMA_MAX_INLINE_DATA_SIZE, which is now max(16KB, PAGE_SIZE). And the default size, if not specified via configfs, is still PAGE_SIZE. This preserves the existing behavior, but allows larger inline sizes for small page systems. If the configured inline data size exceeds NVMET_RDMA_MAX_INLINE_DATA_SIZE, a warning is logged and the size is reduced. If param_inline_data_size is set to 0, then inline data is disabled for that nvmf port. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
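The sizing rules in this message can be worked through with a small user-space sketch. The function names and the convention that a negative value means "not configured" are assumptions of this example, not the driver's interface. It only reproduces the arithmetic: the cap is max(16KB, PAGE_SIZE), the default is PAGE_SIZE, 0 disables inline data, and one recv sge is used per page.

```c
#include <assert.h>
#include <stddef.h>

#define SZ_16K 16384u

/* Upper bound described in the commit message: max(16KB, PAGE_SIZE). */
size_t max_inline_size(size_t page_size)
{
    return page_size > SZ_16K ? page_size : SZ_16K;
}

/*
 * Effective inline data size for a port.
 * Assumption of this sketch: configured < 0 means "not set via configfs".
 */
size_t effective_inline_size(long configured, size_t page_size)
{
    size_t max = max_inline_size(page_size);

    assert(page_size > 0);

    if (configured < 0)
        return page_size;          /* default stays PAGE_SIZE */
    if ((size_t)configured > max)
        return max;                /* the driver would log a warning here */
    return (size_t)configured;     /* 0 disables inline data */
}

/* One recv sge per page, so every buffer is an order-0 allocation. */
size_t inline_sge_count(size_t inline_size, size_t page_size)
{
    assert(page_size > 0);
    return (inline_size + page_size - 1) / page_size;
}
```

With 4K pages a 16KB inline size needs four sges, and with 64K pages a single 64KB page needs one, which matches the two cases given in the message.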
2018-07-23 | nvmet: add buffered I/O support for file backed ns | Chaitanya Kulkarni
Add a new "buffered_io" attribute, which disables direct I/O and thus enables page cache based caching when enabled. The attribute can only be changed when the namespace is disabled as the file has to be reopened for the change to take effect. The possibly blocking reads/writes are deferred to a newly introduced global workqueue. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-08 | nvmet: filter newlines from user input | Sagi Grimberg
We should avoid consuming the newlines in traddr, trsvcid and device_path. Add minimal processing to make sure they are gone. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
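A typical source of the trailing newline is `echo value > attribute`. The following user-space sketch shows one way to do the "minimal processing": cut the input at the first newline before storing it. `store_attr` is a hypothetical helper written for this example and is not the kernel's store function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, stopping at the first newline.
 * Returns 0 on success or -EINVAL if the value does not fit.
 */
int store_attr(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_size > 0);

    len = strcspn(src, "\n");      /* length up to, not including, '\n' */
    if (len >= dst_size)
        return -EINVAL;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

Storing `"192.168.0.1\n"` this way leaves `"192.168.0.1"` in the destination buffer, so later string comparisons and address parsing do not see the newline.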
2018-03-26 | nvmet: refactor configfs transport type handling | Christoph Hellwig
Have a common table of mappings from numerical transport ids to names, and zero the transport specific area in common code in nvmet_addr_trtype_store. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
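A single table that serves both the "show" and "store" directions might look like the sketch below. The struct layout, the function names and the 16-byte transport-specific area are invented for illustration. The numeric ids (1 for rdma, 2 for fc, 254 for loop) are given as I understand the NVMe over Fabrics transport type values and should be checked against the specification.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical port address with a transport-specific area. */
struct port_addr {
    unsigned char trtype;
    unsigned char tsas[16];
};

struct transport_entry {
    unsigned char type;
    const char *name;
};

/* One table used for both id-to-name and name-to-id lookups. */
static const struct transport_entry transports[] = {
    { 1,   "rdma" },
    { 2,   "fc"   },
    { 254, "loop" },
};

/* "show" direction: numeric id to name, NULL if unknown. */
const char *trtype_name(unsigned char type)
{
    size_t i;

    for (i = 0; i < ARRAY_SIZE(transports); i++)
        if (transports[i].type == type)
            return transports[i].name;
    return NULL;
}

/* "store" direction: name to id, zeroing the transport-specific area. */
int trtype_store(struct port_addr *addr, const char *name)
{
    size_t i;

    assert(addr != NULL && name != NULL);

    for (i = 0; i < ARRAY_SIZE(transports); i++) {
        if (strcmp(transports[i].name, name) == 0) {
            memset(addr->tsas, 0, sizeof(addr->tsas));
            addr->trtype = transports[i].type;
            return 0;
        }
    }
    return -EINVAL;
}
```

Adding a transport then means adding one table row, which is how a later entry such as tcp could be supported without touching the lookup code.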
2018-03-26 | nvmet: move device_uuid configfs attr definition to suitable place | Max Gurtovoy
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-19 | nvmet: make config_item_type const | Bhumika Goyal
Make config_item_type structures const as they are either passed to a function having the argument as const or used inside an if statement or stored in the const "ci_type" field of a config_item structure. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 | nvmet: use NVME_NSID_ALL | Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
2017-07-20 | nvmet: preserve controller serial number between reboots | Johannes Thumshirn
The NVMe target has no way to preserve controller serial IDs across reboots, which breaks udev scripts doing SYMLINK+="dev/disk/by-id/nvme-$env{ID_SERIAL}-part%n". Export the randomly generated serial number via configfs and allow setting of a serial via configfs to mitigate this breakage. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-20 | nvmet: prefix version configfs file with attr | Johannes Thumshirn
The NVMe target's attribute files need an attr prefix in order to have nvmetcli recognize them. Add this attribute. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-15 | nvmet: allow overriding the NVMe VS via configfs | Johannes Thumshirn
Allow overriding the announced NVMe Version of a subsystem via configfs. This is particularly helpful when debugging new features for the host or target side without bumping the hard coded version (as the target might not be fully compliant to the announced version yet). Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Guan Junxiong <guanjunxiong@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
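The VS value packs the major version into bits 31:16, the minor version into bits 15:8 and the tertiary version into bits 7:0. A user-space sketch of turning a string such as "1.3" or "1.2.1" into that value is shown below. `parse_version` is a name invented for this example, and the accepted input formats are an assumption rather than a description of the kernel's parser.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Pack major.minor.tertiary into the 32-bit VS layout. */
#define NVME_VS(major, minor, tertiary) \
    (((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(tertiary))

/*
 * Parse "major.minor" or "major.minor.tertiary".
 * Returns 0 on success, -EINVAL on malformed or out-of-range input.
 */
int parse_version(const char *s, uint32_t *vs)
{
    unsigned int major = 0, minor = 0, tertiary = 0;
    int n;

    assert(s != NULL && vs != NULL);

    n = sscanf(s, "%u.%u.%u", &major, &minor, &tertiary);
    if (n != 2 && n != 3)
        return -EINVAL;
    if (major > 0xffff || minor > 0xff || tertiary > 0xff)
        return -EINVAL;

    *vs = NVME_VS(major, minor, tertiary);
    return 0;
}
```

Parsing "1.3" leaves the tertiary field at zero, so a two-part version and its three-part form with a trailing ".0" produce the same value.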
2017-06-15 | nvmet: add uuid field to nvme_ns and populate via configfs | Johannes Thumshirn
Add the UUID field from the NVMe Namespace Identification Descriptor to the nvmet_ns structure and allow its population via configfs. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-01-26 | nvmet: delete controllers deletion upon subsystem release | Sagi Grimberg
There is no reason to keep them around if we are deleting the subsystem, so instead of passively waiting for the host to disconnect, actively delete the controllers. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-12-14 | Merge tag 'configfs-for-4.10' of git://git.infradead.org/users/hch/configfs | Linus Torvalds
Pull configfs update from Christoph Hellwig: "Just one simple change from Andrzej to drop the pointless return value from the ->drop_link method" * tag 'configfs-for-4.10' of git://git.infradead.org/users/hch/configfs: fs: configfs: don't return anything from drop_link
2016-12-06 | nvme-fabrics: patch target code in prep for FC transport support | James Smart
- Add FC transport type decoding
- Add FC address family decoding

Signed-off-by: James Smart <james.smart@broadcom.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-12-06 | nvmet: Fix possible infinite loop triggered on hot namespace removal | Solganik Alexander
When removing a namespace we delete it from the subsystem namespaces list with list_del_init, which allows us to know if it is enabled or not. The problem is that list_del_init initializes the list next pointer and does not respect the RCU list-traversal we do on the IO path for locating a namespace. Instead we need to use list_del_rcu, which is allowed to run concurrently with the _rcu list-traversal primitives (it keeps list next intact) and guarantees forward progress for a concurrent nvmet_find_namespace. By changing that, we cannot rely on ns->dev_link for knowing if the namespace is enabled, so add an enabled indicator entry to nvmet_ns for that. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com> Cc: <stable@vger.kernel.org> # v4.8+
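The difference between the two removal styles can be seen in a single-threaded user-space model. This sketch only imitates the pointer updates: it has no memory barriers and no grace period, so it is not RCU. The list helpers are minimal re-implementations written for this example, and `del_rcu` sets `prev` to NULL where the kernel would use a poison value.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink e from its neighbours without touching e's own pointers. */
void unlink_entry(struct list_head *e)
{
    e->next->prev = e->prev;
    e->prev->next = e->next;
}

/* list_del_init() style: the removed entry points back at itself. */
void del_init(struct list_head *e)
{
    unlink_entry(e);
    list_init(e);
}

/* list_del_rcu() style: next stays intact for readers already on e. */
void del_rcu(struct list_head *e)
{
    unlink_entry(e);
    e->prev = NULL;
}

/*
 * Model a reader that was standing on cur when it was removed.
 * Returns 1 if the walk reaches head within limit steps, 0 otherwise.
 */
int reader_reaches_head(const struct list_head *cur,
                        const struct list_head *head, int limit)
{
    assert(cur != NULL && head != NULL);

    while (limit-- > 0) {
        if (cur == head)
            return 1;
        cur = cur->next;
    }
    return 0;
}
```

After `del_init` the removed entry's `next` points to itself, so a reader positioned on it never reaches the list head, which is the infinite loop from the subject line. After `del_rcu` the reader follows the preserved `next` pointer and finishes its traversal.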
2016-12-01 | fs: configfs: don't return anything from drop_link | Andrzej Pietrasiewicz
Documentation/filesystems/configfs/configfs.txt says: "When unlink(2) is called on the symbolic link, the source item is notified via the ->drop_link() method. Like the ->drop_item() method, this is a void function and cannot return failure." The ->drop_item() is indeed a void function, the ->drop_link() is actually not. This, together with the fact that the value of ->drop_link() is silently ignored suggests, that it is the ->drop_link() return type that should be corrected and changed to void. This patch changes drop_link() signature and all its users. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> [hch: reverted reformatting of some code] Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-07-07 | nvmet: fix an error code | Dan Carpenter
We accidentally return zero here when ERR_PTR(-ENOMEM) is intended. Fixes: a07b4970f464 ('nvmet: add a generic NVMe target') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05 | nvmet: add a generic NVMe target | Christoph Hellwig
This patch introduces an implementation of NVMe subsystems, controllers and discovery service which allows exporting NVMe namespaces across fabrics such as Ethernet, FC etc. The implementation conforms to the NVMe 1.2.1 specification and interoperates with NVMe over fabrics host implementations. Configuration works using configfs, and is best performed using the nvmetcli tool from http://git.infradead.org/users/hch/nvmetcli.git, which also has a detailed explanation of the required steps in the README file. Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com> Signed-off-by: Anthony Knapp <anthony.j.knapp@intel.com> Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>