Age | Commit message (Collapse) | Author |
|
errata:
When a read command returns less data than specified in the PRDs (for
example, there are two PRDs for this command, but the device returns a
number of bytes which is less than in the first PRD), the second PRD of
this command is not read out of the PRD FIFO, causing the next command
to use this PRD erroneously.
workaround
- forces sg_tablesize = 1
- modified the sg_io function in block/scsi_ioctl.c to use a 64k buffer
allocated with dma_alloc_coherent during the probe in ahci_imx
- In order to fix the scsi/sata hang, when CD_ROM and HDD are
accessed simultaneously after the workaround is applied.
Do not go to sleep in scsi_eh_handler, when there is host failed.
Signed-off-by: Richard Zhu <Richard.Zhu@freescale.com>
|
|
Tejun says:
"At least for libata, worrying about suspend/resume failures don't make
whole lot of sense. If suspend failed, just proceed with suspend. If
the device can't be woken up afterwards, that's that. There isn't
anything we could have done differently anyway. The same for resume, if
spinup fails, the device is dud and the following commands will invoke
EH actions and will eventually fail. Again, there really isn't any
*choice* to make. Just making sure the errors are handled gracefully
(ie. don't crash) and the following commands are handled correctly
should be enough."
The only libata user that actually cares about the result from a suspend
operation is libsas. However, it only cares about whether queuing a new
operation collides with an in-flight one. All libsas does with the
error is retry, but we can just let libata wait for the previous
operation before continuing.
Other cleanups include:
1/ Unifying all ata port pm operations on an ata_port_pm_ prefix
2/ Marking all ata port pm helper routines as returning void, only
ata_port_pm_ entry points need to fake a 0 return value.
3/ Killing ata_port_{suspend|resume}_common() in favor of calling
ata_port_request_pm() directly
4/ Killing the wrappers that just do a to_ata_port() conversion
5/ Clearly marking the entry points that do async operations with an
_async suffix.
Reference: http://marc.info/?l=linux-scsi&m=138995409532286&w=2
Cc: Phillip Susi <psusi@ubuntu.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit bc6e7c4b0d1a1f742d96556f63d68f17f4e232c3)
|
|
commit 49718f0fb8c9af192b33d8af3a2826db04025371 upstream.
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8f2777f53e3d5ad8ef2a176a4463a5c8e1a16431 upstream.
Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:
BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
#0: (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
[<ffffffff816c612c>] dump_stack+0x4f/0x7b
[<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
[<ffffffff816c87aa>] __schedule+0x71a/0xa10
[<ffffffff816c8ad2>] schedule+0x32/0x80
[<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
[<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
[<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
[<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
[<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
[<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
[<ffffffff814a2650>] scsi_ioctl+0x150/0x440
[<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
[<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
[<ffffffff811da608>] block_ioctl+0x38/0x40
[<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
[<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
[<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f6979adeaab578f8ca14fdd32b06ddee0d9d3314 upstream.
Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:
general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
fc_exch_recv+0x642/0xde0 [libfc]
fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
kthread+0x10a/0x120
ret_from_fork+0x42/0x70
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 451a2886b6bf90e2fb378f7c46c655450fb96e81 upstream.
unfortunately, allowing an arbitrary 16bit value means a possibility of
overflow in the calculation of total number of pages in bio_map_user_iov() -
we rely on there being no more than PAGE_SIZE members of sum in the
first loop there. If that sum wraps around, we end up allocating
too small array of pointers to pages and it's easy to overflow it in
the second loop.
X-Coverup: TINC (and there's no lumber cartel either)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: s/MAX_UIOVEC/UIO_MAXIOV/. This was fixed upstream by commit
fdc81f45e9f5 ("sg_start_req(): use import_iovec()"), but we don't have
that function.]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3f1c0581310d5d94bd72740231507e763a6252a4 upstream.
Fixes another signed / unsigned array indexing bug in the ipr driver.
Currently, when hrrq_index wraps, it becomes a negative number. We
do the modulo, but still have a negative number, so we end up indexing
backwards in the array. Given where the hrrq array is located in memory,
we probably won't actually reference memory we don't own, but nonetheless
ipr is still looking at data within struct ipr_ioa_cfg and interpreting it as
struct ipr_hrr_queue data, so bad things could certainly happen.
Each ipr adapter has anywhere from 1 to 16 HRRQs. By default, we use 2 on new
adapters. Let's take an example:
Assume ioa_cfg->hrrq_index=0x7fffffffe and ioa_cfg->hrrq_num=4:
The atomic_add_return will then return -1. We mod this with 3 and get -2, add
one and get -1 for an array index.
On adapters which support more than a single HRRQ, we dedicate HRRQ to adapter
initialization and error interrupts so that we can optimize the other queues
for fast path I/O. So all normal I/O uses HRRQ 1-15. So we want to spread the
I/O requests across those HRRQs.
With the default module parameter settings, this bug won't hit, only when
someone sets the ipr.number_of_msix parameter to a value larger than 3 is when
bad things start to happen.
Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bb7c54339e6a10ecce5c4961adf5e75b3cf0af30 upstream.
When ipr's internal driver trace was changed to an atomic, a signed/unsigned
bug slipped in which results in us indexing backwards in our memory buffer
writing on memory that does not belong to us. This patch fixes this by removing
the modulo and instead just mask off the low bits.
Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 36b8e180e1e929e00b351c3b72aab3147fc14116 upstream.
Make sure we have the host lock held when calling scsi_report_bus_reset. Fixes
a crash seen as the __devices list in the scsi host was changing as we were
iterating through it.
Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e7ac6c6666bec0a354758a1298d3231e4a635362 upstream.
Two SLES11 SP3 servers encountered similar crashes simultaneously
following some kind of SAN/tape target issue:
...
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002.
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002.
qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000.
rport-3:0-0: blocked FC remote port time out: removing target and saving binding
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
rport-2:0-0: blocked FC remote port time out: removing target and saving binding
sg_rq_end_io: device detached
BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
CPU 0
...
Supported: No, Proprietary modules are loaded [1739975.390463]
Pid: 27965, comm: ABCD Tainted: PF X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8
RIP: 0010:[<ffffffff8133b268>] [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
RSP: 0018:ffff8839dc1e7c68 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138
RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0
R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001
R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80
FS: 00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640)
Stack:
ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80
ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000
ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e
Call Trace:
[<ffffffffa03fa309>] st_open+0x129/0x240 [st]
[<ffffffff8115ea1e>] chrdev_open+0x13e/0x200
[<ffffffff811588a8>] __dentry_open+0x198/0x310
[<ffffffff81167d74>] do_last+0x1f4/0x800
[<ffffffff81168fe9>] path_openat+0xd9/0x420
[<ffffffff8116946c>] do_filp_open+0x4c/0xc0
[<ffffffff8115a00f>] do_sys_open+0x17f/0x250
[<ffffffff81468d92>] system_call_fastpath+0x16/0x1b
[<00007f8e4f617fd0>] 0x7f8e4f617fcf
Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
RIP [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
RSP <ffff8839dc1e7c68>
CR2: 00000000000002a8
Analysis reveals the cause of the crash to be due to STp->device
being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
calls scsi_tape_release(). In st_open() we jump to err_out after
scsi_block_when_processing_errors() completes and returns the
device as offline (sdev_state was SDEV_DEL):
1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
1181 module count. */
1182 static int st_open(struct inode *inode, struct file *filp)
1183 {
1184 int i, retval = (-EIO);
1185 int resumed = 0;
1186 struct scsi_tape *STp;
1187 struct st_partstat *STps;
1188 int dev = TAPE_NR(inode);
1189 char *name;
...
1217 if (scsi_autopm_get_device(STp->device) < 0) {
1218 retval = -EIO;
1219 goto err_out;
1220 }
1221 resumed = 1;
1222 if (!scsi_block_when_processing_errors(STp->device)) {
1223 retval = (-ENXIO);
1224 goto err_out;
1225 }
...
1264 err_out:
1265 normalize_buffer(STp->buffer);
1266 spin_lock(&st_use_lock);
1267 STp->in_use = 0;
1268 spin_unlock(&st_use_lock);
1269 scsi_tape_put(STp); <-- STp->device = 0 after this
1270 if (resumed)
1271 scsi_autopm_put_device(STp->device);
1272 return retval;
The ref count for the struct scsi_tape had already been reduced
to 1 when the .remove method of the st module had been called.
The kref_put() in scsi_tape_put() caused scsi_tape_release()
to be called:
0266 static void scsi_tape_put(struct scsi_tape *STp)
0267 {
0268 struct scsi_device *sdev = STp->device;
0269
0270 mutex_lock(&st_ref_mutex);
0271 kref_put(&STp->kref, scsi_tape_release); <-- calls this
0272 scsi_device_put(sdev);
0273 mutex_unlock(&st_ref_mutex);
0274 }
In scsi_tape_release() the struct scsi_device in the struct
scsi_tape gets set to NULL:
4273 static void scsi_tape_release(struct kref *kref)
4274 {
4275 struct scsi_tape *tpnt = to_scsi_tape(kref);
4276 struct gendisk *disk = tpnt->disk;
4277
4278 tpnt->device = NULL; <<<---- where the dev is nulled
4279
4280 if (tpnt->buffer) {
4281 normalize_buffer(tpnt->buffer);
4282 kfree(tpnt->buffer->reserved_pages);
4283 kfree(tpnt->buffer);
4284 }
4285
4286 disk->private_data = NULL;
4287 put_disk(disk);
4288 kfree(tpnt);
4289 return;
4290 }
Although the problem was reported on SLES11.3 the problem appears
in linux-next as well.
The crash is fixed by reordering the code so we no longer access
the struct scsi_tape after the kref_put() is done on it in st_open().
Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Signed-off-by: Darren Lavender <darren.lavender@hp.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ef86cb2059a14b4024c7320999ee58e938873032 upstream.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 535fb906225fb7436cb658144d0c0cea14a26f3e upstream.
Avoid that srp_terminate_io() can get invoked while srp_queuecommand()
is in progress. This patch avoids that an I/O timeout can trigger the
following kernel warning:
WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:1447 srp_terminate_io+0xef/0x100 [ib_srp]()
Call Trace:
[<ffffffff814c65a2>] dump_stack+0x4e/0x68
[<ffffffff81051f71>] warn_slowpath_common+0x81/0xa0
[<ffffffff8105204a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa075f51f>] srp_terminate_io+0xef/0x100 [ib_srp]
[<ffffffffa07495da>] __rport_fail_io_fast+0xba/0xc0 [scsi_transport_srp]
[<ffffffffa0749a90>] rport_fast_io_fail_timedout+0xe0/0xf0 [scsi_transport_srp]
[<ffffffff8106e09b>] process_one_work+0x1db/0x780
[<ffffffff8106e75b>] worker_thread+0x11b/0x450
[<ffffffff81073c64>] kthread+0xe4/0x100
[<ffffffff814cf26c>] ret_from_fork+0x7c/0xb0
See also patch "scsi_transport_srp: Add transport layer error
handling" (commit ID 29c17324803c).
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit be34c62ddf39d1931780b07a6f4241393e4ba2ee upstream.
Introduce the helper function srp_wait_for_queuecommand().
Move the definition of scsi_request_fn_active(). Add a comment
above srp_wait_for_queuecommand() that support for scsi-mq needs
to be added.
This patch does not change any functionality. A second call to
srp_wait_for_queuecommand() will be introduced in the next patch.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 45c44b5ff9caa743ed9c2bfd44307c536c9caf1e upstream.
Increase the default init stage change timeout from 15 seconds to 30 seconds.
This resolves issues we have seen with some adapters not transitioning
to the first init stage within 15 seconds, which results in adapter
initialization failures.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 859c75aba20264d87dd026bab0d0ca3bff385955 upstream.
Add a call to pci_set_master(...) missing in the previous
patch "hpsa: refine the pci enable/disable handling".
Found thanks to Rob Elliot.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Robert Elliott <elliott@hp.com>
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 132aa220b45d60e9b20def1e9d8be9422eed9616 upstream.
When a second(kdump) kernel starts and the hard reset method is used
the driver calls pci_disable_device without previously enabling it,
so the kernel shows a warning -
[ 16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90()
[ 16.882686] Device hpsa
disabling already-disabled device
...
This patch fixes it, in addition to this I tried to balance also some other pairs
of enable/disable device in the driver.
Unfortunately I wasn't able to verify the functionality for the case of a sw reset,
because of a lack of proper hw.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 27f344eb15dd0da80ebec80c7245e8c85043f841 upstream.
Add a memory barrier to ensure the valid bit is read before
any of the cqe payload is read. This fixes an issue seen
on Power where the cqe payload was getting loaded before
the valid bit. When this occurred, we saw an iotag out of
range error when a command completed, but since the iotag
looked invalid the command didn't get completed to scsi core.
Later we hit the command timeout, attempted to abort the command,
then waited for the aborted command to get returned. Since the
adapter already returned the command, we timeout waiting,
and end up escalating EEH all the way to host reset. This
patch fixes this issue.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 74856fbf441929918c49ff262ace9835048e4e6a upstream.
256 bytes per sector support has been broken since 2.6.X,
and no-one stepped up to fix this.
So disable support for it.
Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dc45708ca9988656d706940df5fd102672c5de92 upstream.
Set the SRB flags correctly when there is no data transfer. Without this
change some IHV drivers will fail valid commands such as TEST_UNIT_READY.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9493c2422cae272d6f1f567cbb424195defe4176 upstream.
Remove 2 redundant extern inline functions: qla8044_set_qsnt_ready() and
qla8044_need_reset_handler(). At present, within upstream next kernel
source code, they are only used within "drivers/scsi/qla2xxx/qla_nx2.c".
The related error and warnings (with allmodconfig under tile):
CC [M] drivers/scsi/qla2xxx/qla_nx2.o
drivers/scsi/qla2xxx/qla_nx2.c:1633:1: error: static declaration of 'qla8044_need_reset_handler' follows non-static declaration
qla8044_need_reset_handler(struct scsi_qla_host *vha)
^
In file included from drivers/scsi/qla2xxx/qla_def.h:3706:0,
from drivers/scsi/qla2xxx/qla_nx2.c:11:
drivers/scsi/qla2xxx/qla_gbl.h:756:20: note: previous declaration of 'qla8044_need_reset_handler' was here
extern inline void qla8044_need_reset_handler(struct scsi_qla_host *vha);
^
drivers/scsi/qla2xxx/qla_gbl.h:756:20: warning: inline function 'qla8044_need_reset_handler' declared but never defined
make[3]: *** [drivers/scsi/qla2xxx/qla_nx2.o] Error 1
make[2]: *** [drivers/scsi/qla2xxx] Error 2
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2
CC [M] drivers/scsi/qla2xxx/qla_tmpl.o
In file included from drivers/scsi/qla2xxx/qla_def.h:3706:0,
from drivers/scsi/qla2xxx/qla_tmpl.c:7:
drivers/scsi/qla2xxx/qla_gbl.h:755:20: warning: inline function 'qla8044_set_qsnt_ready' declared but never defined
extern inline void qla8044_set_qsnt_ready(struct scsi_qla_host *vha);
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 579d69bc1fd56d5af5761969aa529d1d1c188300 upstream.
The 3w-sas driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Torsten Luettgert <ml-lkml@enda.eu>
Tested-by: Bernd Kardatzki <Bernd.Kardatzki@med.uni-tuebingen.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 118c855b5623f3e2e6204f02623d88c09e0c34de upstream.
The 3w-9xxx driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9cd9554615cba14f0877cc9972a6537ad2bdde61 upstream.
The 3w-xxxx driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 56cbd0ccc1b508de19561211d7ab9e1c77e6b384 upstream.
mvsas is giving a General protection fault when it encounters an expander
attached ATA device. Analysis of mvs_task_prep_ata() shows that the driver is
assuming all ATA devices are locally attached and obtaining the phy mask by
indexing the local phy table (in the HBA structure) with the phy id. Since
expanders have many more phys than the HBA, this is causing the index into the
HBA phy table to overflow and returning rubbish as the pointer.
mvs_task_prep_ssp() instead does the phy mask using the port properties.
Mirror this in mvs_task_prep_ata() to fix the panic.
Reported-by: Adam Talbot <ajtalbot1@gmail.com>
Tested-by: Adam Talbot <ajtalbot1@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8de580742fee8bc34d116f57a20b22b9a5f08403 upstream.
We may exit this function without properly freeing up the maapings
we may have acquired. Fix the bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b367dcaa512922e9207160bef0895622cfae4c9f upstream.
udelay() does not work on some architectures for values above
2000, in particular on ARM:
ERROR: "__bad_udelay" [drivers/scsi/bfa/bfa.ko] undefined!
Reported-by: Vagrant Cascadian <vagrant@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2e7cee027b26cbe7e6685a7a14bd2850bfe55d33 upstream.
Kernel panic was happening as iscsi_host_remove() was called on
a host which was not yet added.
Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bba0bdd7ad4713d82338bcd9b72d57e9335a664b upstream.
SCSI transport drivers and SCSI LLDs block a SCSI device if the
transport layer is not operational. This means that in this state
no requests should be processed, even if the REQ_PREEMPT flag has
been set. This patch avoids that a rescan shortly after a cable
pull sporadically triggers the following kernel oops:
BUG: unable to handle kernel paging request at ffffc9001a6bc084
IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib]
Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100)
Call Trace:
[<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp]
[<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp]
[<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod]
[<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod]
[<ffffffff81223b37>] __blk_run_queue+0x27/0x30
[<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110
[<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0
[<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod]
[<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod]
[<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod]
[<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod]
[<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod]
[<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod]
[<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod]
[<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod]
[<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160
[<ffffffff811589de>] vfs_write+0xce/0x140
[<ffffffff81158b53>] sys_write+0x53/0xa0
[<ffffffff81464592>] system_call_fastpath+0x16/0x1b
[<00007f611c9d9300>] 0x7f611c9d92ff
Reported-by: Max Gurtuvoy <maxg@mellanox.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 75c3d0bf9caebb502e96683b2bc37f9692437e68 upstream.
This patch fixes the incorrect use of __transport_register_session()
in tcm_qla2xxx_check_initiator_node_acl() code, that does not perform
explicit se_tpg->session_lock when accessing se_tpg->tpg_sess_list
to add new se_sess nodes.
Given that tcm_qla2xxx_check_initiator_node_acl() is not called with
qla_hw->hardware_lock held for all accesses of ->tpg_sess_list, the
code should be using transport_register_session() instead.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6302ce4d80aa82b3fdb5c5cd68e7268037091b47 upstream.
This crash was reported:
[ 366.947370] sd 3:0:1:0: [sdb] Spinning up disk....
[ 368.804046] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[ 368.804098] PGD 0
[ 368.804114] Oops: 0002 [#1] SMP
[ 368.804143] CPU 1
[ 368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common
[ 368.804749]
[ 368.804764] Pid: 392, comm: kworker/u:3 Tainted: P W O 3.4.87-logicube-ng.22 #1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920
[ 368.804802] RIP: 0010:[<ffffffff81358457>] [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[ 368.804827] RSP: 0018:ffff880117001cc0 EFLAGS: 00010246
[ 368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420
[ 368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4
[ 368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe
[ 368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4
[ 368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8
[ 368.804916] FS: 0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 368.804931] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0
[ 368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0)
[ 368.805009] Stack:
[ 368.805017] ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c
[ 368.805062] 000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000
[ 368.805100] ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac
[ 368.805135] Call Trace:
[ 368.805153] [<ffffffff81056f7c>] ? up+0xb/0x33
[ 368.805168] [<ffffffff813583ac>] ? mutex_lock+0x16/0x25
[ 368.805194] [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas]
[ 368.805217] [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas]
[ 368.805240] [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas]
[ 368.805264] [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas]
[ 368.805280] [<ffffffff81355a2a>] ? printk+0x43/0x48
[ 368.805296] [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd
[ 368.805318] [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas]
[ 368.805336] [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c
[ 368.805351] [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152
[ 368.805366] [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163
[ 368.805382] [<ffffffff81052c4e>] ? kthread+0x79/0x81
[ 368.805399] [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10
[ 368.805416] [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9
[ 368.805431] [<ffffffff8135fea0>] ? gs_change+0x13/0x13
[ 368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41
[ 368.805851] RIP [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[ 368.805877] RSP <ffff880117001cc0>
[ 368.805886] CR2: 0000000000000000
[ 368.805899] ---[ end trace b720682065d8f4cc ]---
It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task
execution, but shows a deeper cause: expander functions expect to be able to
cast to and treat domain devices as expanders. The correct fix is to only do
expander discover when we know we've got an expander device to avoid wrongly
casting a non-expander device.
Reported-by: Praveen Murali <pmurali@logicube.com>
Tested-by: Praveen Murali <pmurali@logicube.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
gather segment boundary limit.
commit f76a610a8b4b6280eaedf48f3af9d5d74e418b66 upstream.
In reference to bug https://bugzilla.redhat.com/show_bug.cgi?id=1097141
Assert is seen with AMD cpu whenever calling pci_alloc_consistent.
[ 29.406183] ------------[ cut here ]------------
[ 29.410505] kernel BUG at lib/iommu-helper.c:13!
Signed-off-by: Minh Tran <minh.tran@emulex.com>
Fixes: 6733b39a1301b0b020bbcbf3295852e93e624cb1
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3b524a683af8991b4eab4182b947c65f0ce1421b upstream.
Fix SCSI generic read() incorrectly returning success after detecting an
error.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c2ced1719a1b903350955a511e1666e6d05a7f5b upstream.
Update driver "mask_interrupts" before enable/disable hardware interrupt
in order to avoid missing interrupts because of "mask_interrupts" still
set to 1 and hardware interrupts are enabled.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Chaitra Basappa <chaitra.basappa@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6cdb08172bc89f0a39e1643c5e7eab362692fd1b upstream.
Fixes a race condition in abort handling that was injected
when multiple interrupt support was added. When only a single
interrupt is present, the adapter guarantees it will send
responses for aborted commands prior to the response for the
abort command itself. With multiple interrupts, these responses
generally come back on different interrupts, so we need to
ensure the abort thread waits until the aborted command is
complete so we don't perform a double completion. This race
condition was being hit frequently in environments which
were triggering command timeouts, which was resulting in
a double completion causing a kernel oops.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e86fb5e8ab95f10ec5f2e9430119d5d35020c951 upstream.
When ring buffer returns an error indicating retry, storvsc may not
return a proper error code to SCSI when bounce buffer is not used.
This has introduced I/O freeze on RAID running atop storvsc devices.
This patch fixes it by always returning a proper error code.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 198a956a11b15b564ac06d1411881e215b587408 upstream.
The Microsoft iSCSI target does not support REPORT SUPPORTED OPERATION
CODES. Blacklist these devices so we don't attempt to send the command.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tested-by: Mike Christie <michaelc@cs.wisc.edu>
Reported-by: jazz@deti74.ru
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2311ce4d9c91ed63a46e18f0378f3e406e7e888e upstream.
This reverts commit 963ba22b90a955363644cd397b20226928eab976
("mpt3sas: Remove phys on topology change")
Reverting the previous mpt3sas drives patch changes,
since we will observe below issue
Issue:
Drives connected Enclosure/Expander will unregister with
SCSI Transport Layer, if any one remove and add expander
cable with in DMD (Device Missing Delay) time period or
even any one power-off and power-on the Enclosure with in
the DMD period.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 81a89c2d891b78695aa7e4cc0d5a7427785ae078 upstream.
This reverts commit 3520f9c779bed098ca76dd3fb6377264301d57ed
("mpt2sas: Remove phys on topology change")
Reverting the previous mpt2sas drives patch changes,
since we will observe below issue
Issue:
Drives connected Enclosure/Expander will unregister with
SCSI Transport Layer, if any one remove and add expander
cable with in DMD (Device Missing Delay) time period or
even any one power-off and power-on the Enclosure with in
the DMD period.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b6c92b7e0af575e2b8b05bdf33633cf9e1661cbf upstream.
The .eh_abort_handler needs to return SUCCESS, FAILED, or
FAST_IO_FAIL. So fixup all callers to adhere to this requirement.
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 170c238701ec38b1829321b17c70671c101bac55 upstream.
Corrected wait_event() call which was waiting for wrong completion
status (0xFF).
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 01a4cc4d0cd6a836c7b923760e8eb1cbb6a47258 upstream.
In some cases, the fcoe_rx_list may contains multiple instances
of the same skb (the so called "shared skbs").
the bnx2fc_l2_rcv thread is a loop that extracts a skb from the list,
modifies (and destroys) its content and then proceed to the next one.
The problem is that if the skb is shared, the remaining instances will
be corrupted.
The solution is to use skb_share_check() before adding the skb to the
fcoe_rx_list.
[ 6286.808725] ------------[ cut here ]------------
[ 6286.808729] WARNING: at include/scsi/fc_frame.h:173 bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]()
[ 6286.808748] Modules linked in: bnx2x(-) mdio dm_service_time bnx2fc cnic uio fcoe libfcoe 8021q garp stp mrp libfc llc scsi_transport_fc scsi_tgt sg iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel e1000e ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper ptp cryptd hpilo serio_raw hpwdt lpc_ich pps_core ipmi_si pcspkr mfd_core ipmi_msghandler shpchp pcc_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc dm_multipath xfs libcrc32c ata_generic pata_acpi sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit ata_piix drm_kms_helper ttm drm libata i2c_core hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mdio]
[ 6286.808750] CPU: 3 PID: 1304 Comm: bnx2fc_l2_threa Not tainted 3.10.0-121.el7.x86_64 #1
[ 6286.808750] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013
[ 6286.808752] 0000000000000000 000000000b36e715 ffff8800deba1e00 ffffffff815ec0ba
[ 6286.808753] ffff8800deba1e38 ffffffff8105dee1 ffffffffa05618c0 ffff8801e4c81888
[ 6286.808754] ffffe8ffff663868 ffff8801f402b180 ffff8801f56bc000 ffff8800deba1e48
[ 6286.808754] Call Trace:
[ 6286.808759] [<ffffffff815ec0ba>] dump_stack+0x19/0x1b
[ 6286.808762] [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80
[ 6286.808763] [<ffffffff8105e00a>] warn_slowpath_null+0x1a/0x20
[ 6286.808765] [<ffffffffa054f415>] bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]
[ 6286.808767] [<ffffffffa054eff0>] ? bnx2fc_disable+0x90/0x90 [bnx2fc]
[ 6286.808769] [<ffffffff81085aef>] kthread+0xcf/0xe0
[ 6286.808770] [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
[ 6286.808772] [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0
[ 6286.808773] [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
[ 6286.808774] ---[ end trace c6cdb939184ccb4e ]---
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1899045510ff109980d9cc34e330fd8ca3631871 upstream.
Intel Multi-Flex LUNs choke on REPORT SUPPORTED OPERATION CODES
resulting in sd_mod hanging for several minutes on startup.
The issue was introduced with WRITE SAME discovery heuristics.
Fixes: 5db44863b6eb ("[SCSI] sd: Implement support for WRITE SAME")
Signed-off-by: Christian Sünkenberg <christian.suenkenberg@hfg-karlsruhe.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 48379270fe6808cf4612ee094adc8da2b7a83baa upstream.
Setups that use the blk-mq I/O path can lock up if a host with a single
device that has its door locked enters EH. Make sure to only send the
command to re-lock the door to devices that actually were reset and thus
might have lost their state. Otherwise the EH code might be get blocked
on blk_get_request as all requests for non-reset devices might be in use.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Meelis Roos <meelis.roos@ut.ee>
Tested-by: Meelis Roos <meelis.roos@ut.ee>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f4c24db1b7ad0ce84409e15744d26c6f86a96840 upstream.
The code is currently riddled with "drop the hardware_lock to avoid a
deadlock" bugs that expose races. One of those races seems to expose a
valid warning in tcm_qla2xxx_clear_nacl_from_fcport_map. Add some
bandaid to it.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db7157d4cfce6edf052452fb1d327d4d11b67f4c upstream.
Once calling scsi_host_put, be careful to not access qla_hw_data through
the Scsi_Host private data (ie, scsi_qla_host base_vha).
Fixes: fe1b806f4f71 ("qla2xxx: Refactor shutdown code so some functionality can be reused")
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 75554b68ac1e018bca00d68a430b92ada8ab52dd upstream.
Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a41a9ad3bbf61fae0b6bfb232153da60d14fdbd9 upstream.
Dan Carpenter found a issue where be2iscsi would copy the ip
from userspace to the driver buffer before checking the len
of the data being copied:
http://marc.info/?l=linux-scsi&m=140982651504251&w=2
This patch just has us only copy what we the driver buffer
can support.
Tested-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db9bfd64b14a3a8f1868d2164518fdeab1b26ad1 upstream.
This patches fixes a potential buffer overrun in __iscsi_conn_send_pdu.
This function is used by iscsi drivers and userspace to send iscsi PDUs/
commands. For login commands, we have a set buffer size. For all other
commands we do not support data buffers.
This was reported by Dan Carpenter here:
http://www.spinics.net/lists/linux-scsi/msg66838.html
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 03a6c3ff3282ee9fa893089304d951e0be93a144 upstream.
bfa_swap_words() shifts its argument (assumed to be 64-bit) by 32 bits
each way. In two places the argument type is dma_addr_t, which may be
32-bit, in which case the effect of the bit shift is undefined:
drivers/scsi/bfa/bfa_fcpim.c: In function 'bfa_ioim_send_ioreq':
drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: left shift count >= width of type [enabled by default]
addr = bfa_sgaddr_le(sg_dma_address(sg));
^
drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: right shift count >= width of type [enabled by default]
drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: left shift count >= width of type [enabled by default]
addr = bfa_sgaddr_le(sg_dma_address(sg));
^
drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: right shift count >= width of type [enabled by default]
Avoid this by adding casts to u64 in bfa_swap_words().
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Fixes: f16a17507b09 ('[SCSI] bfa: remove all OS wrappers')
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cd53eb686d2418eda938aad3c9da42b7dfa9351f upstream.
If scsi_remove_host() is called while an rport is in the blocked state
then scsi_remove_host() will only finish if the rport is unblocked
from inside a timer function. Make sure that an rport only enters the
blocked state if a timer will be started that will unblock it. This
avoids that unloading the ib_srp kernel module after having
disconnected the initiator from the target system results in a
deadlock if both the fast_io_fail_tmo and dev_loss_tmo parameters have
been set to "off".
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: David Dillow <dave@thedillows.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|