GIT 607cdaefc24699cc76925aad981c05823dd60ff8 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git#for-mm

commit
Author: Steve Welch
Date:   Tue Oct 23 15:06:10 2007 -0700

    IB/mad: Enable loopback of DR SMP responses from userspace

    The local loopback of an outgoing DR SMP response is limited to those
    that originate at the driver-specific SMA implementation during the
    driver-specific process_mad() function.  This patch enables a returning
    DR SMP originating in userspace (or elsewhere) to be delivered to the
    local management stack.  In this specific case the driver's
    process_mad() function does not consume or process the MAD, so a
    response MAD has not been created and the original MAD must be copied
    manually to the MAD buffer that is to be handed off to the local agent.

    Signed-off-by: Steve Welch
    Acked-by: Hal Rosenstock
    Signed-off-by: Roland Dreier

commit dd2f50dba7355731825d4509096d92b51a786bb3
Author: Ralph Campbell
Date:   Tue Oct 23 15:07:41 2007 -0700

    IB/ipath: Enable loopback of DR SMP responses from userspace

    This patch is in response to reviewing a patch to the core MAD
    processing which fixes loopback of directed-route packets to/from
    user-level MAD agents.  This change enables the core code to work for
    ib_ipath by fixing the return code from the ipath process_mad() method.

    Signed-off-by: Ralph Campbell
    Signed-off-by: Roland Dreier

commit fb4e1b85c08481813f0009e87379ea060f63d328
Author: Ralph Campbell
Date:   Tue Oct 23 15:04:15 2007 -0700

    IB/mad: Remove redundant NULL pointer check in ib_mad_recv_done_handler()

    In ib_mad_recv_done_handler(), the response pointer is checked for NULL
    after allocating it.  It is then checked again in the local
    process_mad() path, but there is no possibility of it changing in
    between.

    Signed-off-by: Ralph Campbell
    Acked-by: Hal Rosenstock
    Signed-off-by: Roland Dreier

commit 83159c0b589053132962b0dfc36e6d5a9afe10b1
Author: Steve Wise
Date:   Mon Oct 29 11:34:05 2007 -0500

    RDMA/iwcm: Set initiator depth and responder resources to device max values

    Set the initiator depth and responder resources to the device max
    values for new connect request events in the iWARP connection manager.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

commit f5bcdc96ec1330212cd78015a54be8d01ba959e9
Author: Dave Olson
Date:   Tue Oct 9 22:10:35 2007 -0700

    IB/ipath: Improve interrupt handler cache footprint

    Improve the interrupt handler's cache footprint by noinline'ing error
    functions that are rarely called.

    Signed-off-by: Dave Olson
    Signed-off-by: Roland Dreier

commit a206a859bed23b87978a3652bab04707aeec81db
Author: Pradeep Satyanarayana
Date:   Sat Nov 17 13:53:07 2007 -0800

    IPoIB/cm: Add connected mode support for devices without SRQs

    Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared
    receive queues).  The current IPoIB connected mode support only works
    on devices that support SRQs.

    Fix this by adding support for using the receive queue of each
    connected-mode receive QP.  The disadvantage of this compared to using
    an SRQ is that a full queue of receives must be posted for each remote
    connected-mode peer, so total memory usage is potentially much higher
    than when using SRQs.  To manage this, add a new module parameter
    "max_nonsrq_conn_qp" that limits the number of connections allowed per
    interface.

    The rest of the changes are fairly straightforward: we use a table of
    struct ipoib_cm_rx to hold all the active connections, and put the
    table index of the connection in the high bits of receive WR IDs.
    This is needed because we cannot rely on the struct ib_wc.qp field for
    non-SRQ receive completions.  Most of the rest of the changes just test
    whether or not an SRQ is available, and post receives or find received
    packets in the right place depending on the answer.

    Cleaning up dead connections actually becomes simpler, because we do
    not have to do the "last WQE reached" dance that is required to destroy
    QPs attached to an SRQ.  We just move the QP to the error state and
    wait for all pending receives to be flushed.

    Signed-off-by: Pradeep Satyanarayana

    [ Completely rewritten and split up, based on Pradeep's work.  Several
      bugs fixed and no doubt several bugs introduced. - Roland ]

    Signed-off-by: Roland Dreier

commit 2886f55c90deb7a6795d0adf2ce2f12eb0274e29
Author: Roland Dreier
Date:   Sat Nov 17 13:53:06 2007 -0800

    IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()

    Factor out the code for going through the rx_reap list of struct
    ipoib_cm_rx and freeing each one.  This consolidates the code
    duplicated between ipoib_cm_dev_stop() and ipoib_cm_rx_reap() and
    reduces the risk of error when adding additional accounting.

    Signed-off-by: Roland Dreier

commit 6738d1df2b3559795471219ea569ad7dde6ab284
Author: Roland Dreier
Date:   Sat Nov 17 13:53:06 2007 -0800

    IPoIB/cm: Factor out ipoib_cm_create_srq()

    Factor out the code to create an SRQ and allocate the receive ring in
    ipoib_cm_dev_init() into a new function, ipoib_cm_create_srq().  This
    will make the code neater when support for devices that don't
    implement SRQs is added.

    Signed-off-by: Roland Dreier

commit 770a3adea618d200e3c21c258f350fbcb244a100
Author: Roland Dreier
Date:   Sat Nov 17 13:53:06 2007 -0800

    IPoIB/cm: Factor out ipoib_cm_free_rx_ring()

    Factor out the code to unmap/free skbs and free the receive ring in
    ipoib_cm_dev_cleanup() into a new function, ipoib_cm_free_rx_ring().
    This function will be called from a couple of other places when
    support for devices that don't implement SRQs is added.

    Signed-off-by: Roland Dreier

commit f9416cec192f4f82be77cfd3d41c471076dcdbea
Author: Roland Dreier
Date:   Tue Oct 23 12:57:54 2007 -0700

    IPoIB: Trivial formatting cleanups

    Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc.
Signed-off-by: Roland Dreier drivers/infiniband/core/cma.c | 10 + drivers/infiniband/core/mad.c | 15 +- drivers/infiniband/core/smi.h | 18 ++- drivers/infiniband/hw/ipath/ipath_intr.c | 4 +- drivers/infiniband/hw/ipath/ipath_mad.c | 4 +- drivers/infiniband/ulp/ipoib/ipoib.h | 171 +++++++----- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 352 +++++++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 8 +- drivers/infiniband/ulp/ipoib/ipoib_main.c | 40 ++-- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 4 +- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 18 +- 11 files changed, 420 insertions(+), 224 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 0751697..5a80e74 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1262,6 +1262,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, struct net_device *dev = NULL; struct rdma_cm_event event; int ret; + struct ib_device_attr attr; listen_id = cm_id->context; if (cma_disable_remove(listen_id, CMA_LISTEN)) @@ -1311,10 +1312,19 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, sin = (struct sockaddr_in *) &new_cm_id->route.addr.dst_addr; *sin = iw_event->remote_addr; + ret = ib_query_device(conn_id->id.device, &attr); + if (ret) { + cma_enable_remove(conn_id); + rdma_destroy_id(new_cm_id); + goto out; + } + memset(&event, 0, sizeof event); event.event = RDMA_CM_EVENT_CONNECT_REQUEST; event.param.conn.private_data = iw_event->private_data; event.param.conn.private_data_len = iw_event->private_data_len; + event.param.conn.initiator_depth = attr.max_qp_init_rd_atom; + event.param.conn.responder_resources = attr.max_qp_rd_atom; ret = conn_id->id.event_handler(&conn_id->id, &event); if (ret) { /* User wants to destroy the CM ID */ diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index 6f42877..649335a 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -701,7 +701,8 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv, } /* Check to post send on QP or process locally */ - if (smi_check_local_smp(smp, device) == IB_SMI_DISCARD) + if (smi_check_local_smp(smp, device) == IB_SMI_DISCARD && + smi_check_local_returning_smp(smp, device) == IB_SMI_DISCARD) goto out; local = kmalloc(sizeof *local, GFP_ATOMIC); @@ -752,8 +753,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv, port_priv = ib_get_mad_port(mad_agent_priv->agent.device, mad_agent_priv->agent.port_num); if (port_priv) { - mad_priv->mad.mad.mad_hdr.tid = - ((struct ib_mad *)smp)->mad_hdr.tid; + memcpy(&mad_priv->mad.mad, smp, sizeof(struct ib_mad)); recv_mad_agent = find_mad_agent(port_priv, &mad_priv->mad.mad); } @@ -1931,15 +1931,6 @@ local: if (port_priv->device->process_mad) { int ret; - if (!response) { - printk(KERN_ERR PFX "No memory for response MAD\n"); - /* - * Is it better to assume that - * it wouldn't be processed ? 
- */ - goto out; - } - ret = port_priv->device->process_mad(port_priv->device, 0, port_priv->port_num, wc, &recv->grh, diff --git a/drivers/infiniband/core/smi.h b/drivers/infiniband/core/smi.h index 1cfc298..aff96ba 100644 --- a/drivers/infiniband/core/smi.h +++ b/drivers/infiniband/core/smi.h @@ -59,7 +59,8 @@ extern enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp, u8 node_type, int port_num); /* - * Return 1 if the SMP should be handled by the local SMA/SM via process_mad + * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM + * via process_mad */ static inline enum smi_action smi_check_local_smp(struct ib_smp *smp, struct ib_device *device) @@ -71,4 +72,19 @@ static inline enum smi_action smi_check_local_smp(struct ib_smp *smp, (smp->hop_ptr == smp->hop_cnt + 1)) ? IB_SMI_HANDLE : IB_SMI_DISCARD); } + +/* + * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM + * via process_mad + */ +static inline enum smi_action smi_check_local_returning_smp(struct ib_smp *smp, + struct ib_device *device) +{ + /* C14-13:3 -- We're at the end of the DR segment of path */ + /* C14-13:4 -- Hop Pointer == 0 -> give to SM */ + return ((device->process_mad && + ib_get_smp_direction(smp) && + !smp->hop_ptr) ? IB_SMI_HANDLE : IB_SMI_DISCARD); +} + #endif /* __SMI_H_ */ diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c index c61f9da..8f3718c 100644 --- a/drivers/infiniband/hw/ipath/ipath_intr.c +++ b/drivers/infiniband/hw/ipath/ipath_intr.c @@ -849,7 +849,7 @@ void ipath_clear_freeze(struct ipath_devdata *dd) /* this is separate to allow for better optimization of ipath_intr() */ -static void ipath_bad_intr(struct ipath_devdata *dd, u32 * unexpectp) +static noinline void ipath_bad_intr(struct ipath_devdata *dd, u32 *unexpectp) { /* * sometimes happen during driver init and unload, don't want @@ -892,7 +892,7 @@ static void ipath_bad_intr(struct ipath_devdata *dd, u32 * unexpectp) "ignoring\n"); } -static void ipath_bad_regread(struct ipath_devdata *dd) +static noinline void ipath_bad_regread(struct ipath_devdata *dd) { static int allbits; diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c index 3d1432d..1978c34 100644 --- a/drivers/infiniband/hw/ipath/ipath_mad.c +++ b/drivers/infiniband/hw/ipath/ipath_mad.c @@ -1434,7 +1434,7 @@ static int process_subn(struct ib_device *ibdev, int mad_flags, * before checking for other consumers. * Just tell the caller to process it normally. */ - ret = IB_MAD_RESULT_FAILURE; + ret = IB_MAD_RESULT_SUCCESS; goto bail; default: smp->status |= IB_SMP_UNSUP_METHOD; @@ -1516,7 +1516,7 @@ static int process_perf(struct ib_device *ibdev, u8 port_num, * before checking for other consumers. * Just tell the caller to process it normally. 
*/ - ret = IB_MAD_RESULT_FAILURE; + ret = IB_MAD_RESULT_SUCCESS; goto bail; default: pmp->status |= IB_SMP_UNSUP_METHOD; diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index eb7edab..d35025f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -56,42 +56,43 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, + IPOIB_PACKET_SIZE = 2048, + IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, + IPOIB_ENCAP_LEN = 4, - IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ - IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, - IPOIB_CM_HEAD_SIZE = IPOIB_CM_BUF_SIZE % PAGE_SIZE, - IPOIB_CM_RX_SG = ALIGN(IPOIB_CM_BUF_SIZE, PAGE_SIZE) / PAGE_SIZE, - IPOIB_RX_RING_SIZE = 128, - IPOIB_TX_RING_SIZE = 64, + IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ + IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, + IPOIB_CM_HEAD_SIZE = IPOIB_CM_BUF_SIZE % PAGE_SIZE, + IPOIB_CM_RX_SG = ALIGN(IPOIB_CM_BUF_SIZE, PAGE_SIZE) / PAGE_SIZE, + IPOIB_RX_RING_SIZE = 128, + IPOIB_TX_RING_SIZE = 64, IPOIB_MAX_QUEUE_SIZE = 8192, IPOIB_MIN_QUEUE_SIZE = 2, + IPOIB_CM_MAX_CONN_QP = 4096, - IPOIB_NUM_WC = 4, + IPOIB_NUM_WC = 4, IPOIB_MAX_PATH_REC_QUEUE = 3, - IPOIB_MAX_MCAST_QUEUE = 3, - - IPOIB_FLAG_OPER_UP = 0, - IPOIB_FLAG_INITIALIZED = 1, - IPOIB_FLAG_ADMIN_UP = 2, - IPOIB_PKEY_ASSIGNED = 3, - IPOIB_PKEY_STOP = 4, - IPOIB_FLAG_SUBINTERFACE = 5, - IPOIB_MCAST_RUN = 6, - IPOIB_STOP_REAPER = 7, - IPOIB_MCAST_STARTED = 8, - IPOIB_FLAG_ADMIN_CM = 9, + IPOIB_MAX_MCAST_QUEUE = 3, + + IPOIB_FLAG_OPER_UP = 0, + IPOIB_FLAG_INITIALIZED = 1, + IPOIB_FLAG_ADMIN_UP = 2, + IPOIB_PKEY_ASSIGNED = 3, + IPOIB_PKEY_STOP = 4, + IPOIB_FLAG_SUBINTERFACE = 5, + IPOIB_MCAST_RUN = 6, + IPOIB_STOP_REAPER = 7, + IPOIB_MCAST_STARTED = 8, + IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, IPOIB_MAX_BACKOFF_SECONDS = 16, - IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ + IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ IPOIB_MCAST_FLAG_SENDONLY = 1, - IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ + IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ IPOIB_MCAST_FLAG_ATTACHED = 3, }; @@ -117,7 +118,7 @@ struct ipoib_pseudoheader { struct ipoib_mcast { struct ib_sa_mcmember_rec mcmember; struct ib_sa_multicast *mc; - struct ipoib_ah *ah; + struct ipoib_ah *ah; struct rb_node rb_node; struct list_head list; @@ -186,27 +187,29 @@ enum ipoib_cm_state { }; struct ipoib_cm_rx { - struct ib_cm_id *id; - struct ib_qp *qp; - struct list_head list; - struct net_device *dev; - unsigned long jiffies; - enum ipoib_cm_state state; + struct ib_cm_id *id; + struct ib_qp *qp; + struct ipoib_cm_rx_buf *rx_ring; + struct list_head list; + struct net_device *dev; + unsigned long jiffies; + enum ipoib_cm_state state; + int recv_count; }; struct ipoib_cm_tx { - struct ib_cm_id *id; - struct ib_qp *qp; + struct ib_cm_id *id; + struct ib_qp *qp; struct list_head list; struct net_device *dev; struct ipoib_neigh *neigh; struct ipoib_path *path; struct ipoib_tx_buf *tx_ring; - unsigned tx_head; - unsigned tx_tail; - unsigned long flags; - u32 mtu; - struct ib_wc ibwc[IPOIB_NUM_WC]; + unsigned tx_head; + unsigned tx_tail; + unsigned long flags; + u32 mtu; + struct ib_wc ibwc[IPOIB_NUM_WC]; }; struct ipoib_cm_rx_buf { @@ -215,25 +218,26 @@ struct ipoib_cm_rx_buf { }; struct ipoib_cm_dev_priv { - struct ib_srq *srq; + struct ib_srq *srq; struct 
ipoib_cm_rx_buf *srq_ring; - struct ib_cm_id *id; - struct list_head passive_ids; /* state: LIVE */ - struct list_head rx_error_list; /* state: ERROR */ - struct list_head rx_flush_list; /* state: FLUSH, drain not started */ - struct list_head rx_drain_list; /* state: FLUSH, drain started */ - struct list_head rx_reap_list; /* state: FLUSH, drain done */ + struct ib_cm_id *id; + struct list_head passive_ids; /* state: LIVE */ + struct list_head rx_error_list; /* state: ERROR */ + struct list_head rx_flush_list; /* state: FLUSH, drain not started */ + struct list_head rx_drain_list; /* state: FLUSH, drain started */ + struct list_head rx_reap_list; /* state: FLUSH, drain done */ struct work_struct start_task; struct work_struct reap_task; struct work_struct skb_task; struct work_struct rx_reap_task; struct delayed_work stale_task; struct sk_buff_head skb_queue; - struct list_head start_list; - struct list_head reap_list; - struct ib_wc ibwc[IPOIB_NUM_WC]; - struct ib_sge rx_sge[IPOIB_CM_RX_SG]; + struct list_head start_list; + struct list_head reap_list; + struct ib_wc ibwc[IPOIB_NUM_WC]; + struct ib_sge rx_sge[IPOIB_CM_RX_SG]; struct ib_recv_wr rx_wr; + int nonsrq_conn_qp; }; /* @@ -269,30 +273,30 @@ struct ipoib_dev_priv { struct work_struct pkey_event_task; struct ib_device *ca; - u8 port; - u16 pkey; - u16 pkey_index; - struct ib_pd *pd; - struct ib_mr *mr; - struct ib_cq *cq; - struct ib_qp *qp; - u32 qkey; + u8 port; + u16 pkey; + u16 pkey_index; + struct ib_pd *pd; + struct ib_mr *mr; + struct ib_cq *cq; + struct ib_qp *qp; + u32 qkey; union ib_gid local_gid; - u16 local_lid; + u16 local_lid; unsigned int admin_mtu; unsigned int mcast_mtu; struct ipoib_rx_buf *rx_ring; - spinlock_t tx_lock; + spinlock_t tx_lock; struct ipoib_tx_buf *tx_ring; - unsigned tx_head; - unsigned tx_tail; - struct ib_sge tx_sge; + unsigned tx_head; + unsigned tx_tail; + struct ib_sge tx_sge; struct ib_send_wr tx_wr; - unsigned tx_outstanding; + unsigned tx_outstanding; struct ib_wc ibwc[IPOIB_NUM_WC]; @@ -317,10 +321,10 @@ struct ipoib_dev_priv { struct ipoib_ah { struct net_device *dev; - struct ib_ah *ah; + struct ib_ah *ah; struct list_head list; - struct kref ref; - unsigned last_send; + struct kref ref; + unsigned last_send; }; struct ipoib_path { @@ -331,11 +335,11 @@ struct ipoib_path { struct list_head neigh_list; - int query_id; + int query_id; struct ib_sa_query *query; struct completion done; - struct rb_node rb_node; + struct rb_node rb_node; struct list_head list; }; @@ -344,7 +348,7 @@ struct ipoib_neigh { #ifdef CONFIG_INFINIBAND_IPOIB_CM struct ipoib_cm_tx *cm; #endif - union ib_gid dgid; + union ib_gid dgid; struct sk_buff_head queue; struct neighbour *neighbour; @@ -455,12 +459,14 @@ void ipoib_drain_cq(struct net_device *dev); #ifdef CONFIG_INFINIBAND_IPOIB_CM -#define IPOIB_FLAGS_RC 0x80 -#define IPOIB_FLAGS_UC 0x40 +#define IPOIB_FLAGS_RC 0x80 +#define IPOIB_FLAGS_UC 0x40 /* We don't support UC connections at the moment */ #define IPOIB_CM_SUPPORTED(ha) (ha[0] & (IPOIB_FLAGS_RC)) +extern int ipoib_max_conn_qp; + static inline int ipoib_cm_admin_enabled(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -491,6 +497,12 @@ static inline void ipoib_cm_set(struct ipoib_neigh *neigh, struct ipoib_cm_tx *t neigh->cm = tx; } +static inline int ipoib_cm_has_srq(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + return !!priv->cm.srq; +} + void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_tx *tx); int 
ipoib_cm_dev_open(struct net_device *dev); void ipoib_cm_dev_stop(struct net_device *dev); @@ -500,7 +512,7 @@ void ipoib_cm_dev_cleanup(struct net_device *dev); struct ipoib_cm_tx *ipoib_cm_create_tx(struct net_device *dev, struct ipoib_path *path, struct ipoib_neigh *neigh); void ipoib_cm_destroy_tx(struct ipoib_cm_tx *tx); -void ipoib_cm_skb_too_long(struct net_device* dev, struct sk_buff *skb, +void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb, unsigned int mtu); void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc); void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc); @@ -508,6 +520,8 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc); struct ipoib_cm_tx; +#define ipoib_max_conn_qp 0 + static inline int ipoib_cm_admin_enabled(struct net_device *dev) { return 0; @@ -533,6 +547,11 @@ static inline void ipoib_cm_set(struct ipoib_neigh *neigh, struct ipoib_cm_tx *t { } +static inline int ipoib_cm_has_srq(struct net_device *dev) +{ + return 0; +} + static inline void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_tx *tx) { @@ -582,7 +601,7 @@ int ipoib_cm_add_mode_attr(struct net_device *dev) return 0; } -static inline void ipoib_cm_skb_too_long(struct net_device* dev, struct sk_buff *skb, +static inline void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb, unsigned int mtu) { dev_kfree_skb_any(skb); @@ -624,12 +643,12 @@ extern struct ib_sa_client ipoib_sa_client; extern int ipoib_debug_level; #define ipoib_dbg(priv, format, arg...) \ - do { \ + do { \ if (ipoib_debug_level > 0) \ ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ } while (0) #define ipoib_dbg_mcast(priv, format, arg...) \ - do { \ + do { \ if (mcast_debug_level > 0) \ ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ } while (0) @@ -642,7 +661,7 @@ extern int ipoib_debug_level; #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA #define ipoib_dbg_data(priv, format, arg...) 
\ - do { \ + do { \ if (data_debug_level > 0) \ ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ } while (0) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 059cf92..df77a28 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -39,6 +39,13 @@ #include #include +int ipoib_max_conn_qp = 128; + +module_param_named(max_nonsrq_conn_qp, ipoib_max_conn_qp, int, 0444); +MODULE_PARM_DESC(max_nonsrq_conn_qp, + "Max number of connected-mode QPs per interface " + "(applied only if shared receive queue is not available)"); + #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA static int data_debug_level; @@ -81,7 +88,7 @@ static void ipoib_cm_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, DMA_FROM_DEVICE); } -static int ipoib_cm_post_receive(struct net_device *dev, int id) +static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ib_recv_wr *bad_wr; @@ -104,7 +111,33 @@ static int ipoib_cm_post_receive(struct net_device *dev, int id) return ret; } -static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, int id, int frags, +static int ipoib_cm_post_receive_nonsrq(struct net_device *dev, + struct ipoib_cm_rx *rx, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_recv_wr *bad_wr; + int i, ret; + + priv->cm.rx_wr.wr_id = id | IPOIB_OP_CM | IPOIB_OP_RECV; + + for (i = 0; i < IPOIB_CM_RX_SG; ++i) + priv->cm.rx_sge[i].addr = rx->rx_ring[id].mapping[i]; + + ret = ib_post_recv(rx->qp, &priv->cm.rx_wr, &bad_wr); + if (unlikely(ret)) { + ipoib_warn(priv, "post recv failed for buf %d (%d)\n", id, ret); + ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + rx->rx_ring[id].mapping); + dev_kfree_skb_any(rx->rx_ring[id].skb); + rx->rx_ring[id].skb = NULL; + } + + return ret; +} + +static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, + struct ipoib_cm_rx_buf *rx_ring, + int id, int frags, u64 mapping[IPOIB_CM_RX_SG]) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -141,7 +174,7 @@ static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, int id, int goto partial_error; } - priv->cm.srq_ring[id].skb = skb; + rx_ring[id].skb = skb; return skb; partial_error: @@ -155,7 +188,23 @@ partial_error: return NULL; } -static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv* priv) +static void ipoib_cm_free_rx_ring(struct net_device *dev, + struct ipoib_cm_rx_buf *rx_ring) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < ipoib_recvq_size; ++i) + if (rx_ring[i].skb) { + ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + rx_ring[i].mapping); + dev_kfree_skb_any(rx_ring[i].skb); + } + + kfree(rx_ring); +} + +static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv) { struct ib_send_wr *bad_wr; struct ipoib_cm_rx *p; @@ -208,12 +257,18 @@ static struct ib_qp *ipoib_cm_create_rx_qp(struct net_device *dev, .qp_type = IB_QPT_RC, .qp_context = p, }; + + if (!ipoib_cm_has_srq(dev)) { + attr.cap.max_recv_wr = ipoib_recvq_size; + attr.cap.max_recv_sge = IPOIB_CM_RX_SG; + } + return ib_create_qp(priv->pd, &attr); } static int ipoib_cm_modify_rx_qp(struct net_device *dev, - struct ib_cm_id *cm_id, struct ib_qp *qp, - unsigned psn) + struct ib_cm_id *cm_id, struct ib_qp *qp, + unsigned psn) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ib_qp_attr qp_attr; @@ -266,6 +321,60 @@ static int 
ipoib_cm_modify_rx_qp(struct net_device *dev, return 0; } +static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_id, + struct ipoib_cm_rx *rx) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + int i; + + rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL); + if (!rx->rx_ring) + return -ENOMEM; + + spin_lock_irq(&priv->lock); + + if (priv->cm.nonsrq_conn_qp >= ipoib_max_conn_qp) { + spin_unlock_irq(&priv->lock); + ib_send_cm_rej(cm_id, IB_CM_REJ_NO_QP, NULL, 0, NULL, 0); + ret = -EINVAL; + goto err_free; + } else + ++priv->cm.nonsrq_conn_qp; + + spin_unlock_irq(&priv->lock); + + for (i = 0; i < ipoib_recvq_size; ++i) { + if (!ipoib_cm_alloc_rx_skb(dev, rx->rx_ring, i, IPOIB_CM_RX_SG - 1, + rx->rx_ring[i].mapping)) { + ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); + ret = -ENOMEM; + goto err_count; + } + ret = ipoib_cm_post_receive_nonsrq(dev, rx, i); + if (ret) { + ipoib_warn(priv, "ipoib_cm_post_receive_nonsrq " + "failed for buf %d\n", i); + ret = -EIO; + goto err_count; + } + } + + rx->recv_count = ipoib_recvq_size; + + return 0; + +err_count: + spin_lock_irq(&priv->lock); + --priv->cm.nonsrq_conn_qp; + spin_unlock_irq(&priv->lock); + +err_free: + ipoib_cm_free_rx_ring(dev, rx->rx_ring); + + return ret; +} + static int ipoib_cm_send_rep(struct net_device *dev, struct ib_cm_id *cm_id, struct ib_qp *qp, struct ib_cm_req_event_param *req, unsigned psn) @@ -281,7 +390,7 @@ static int ipoib_cm_send_rep(struct net_device *dev, struct ib_cm_id *cm_id, rep.private_data_len = sizeof data; rep.flow_control = 0; rep.rnr_retry_count = req->rnr_retry_count; - rep.srq = 1; + rep.srq = ipoib_cm_has_srq(dev); rep.qp_num = qp->qp_num; rep.starting_psn = psn; return ib_send_cm_rep(cm_id, &rep); @@ -317,6 +426,12 @@ static int ipoib_cm_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even if (ret) goto err_modify; + if (!ipoib_cm_has_srq(dev)) { + ret = ipoib_cm_nonsrq_init_rx(dev, cm_id, p); + if (ret) + goto err_modify; + } + spin_lock_irq(&priv->lock); queue_delayed_work(ipoib_workqueue, &priv->cm.stale_task, IPOIB_CM_RX_DELAY); @@ -401,6 +516,7 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_cm_rx_buf *rx_ring; unsigned int wr_id = wc->wr_id & ~(IPOIB_OP_CM | IPOIB_OP_RECV); struct sk_buff *skb, *newskb; struct ipoib_cm_rx *p; @@ -424,18 +540,34 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) return; } - skb = priv->cm.srq_ring[wr_id].skb; + p = wc->qp->qp_context; + + if (ipoib_cm_has_srq(dev)) + rx_ring = priv->cm.srq_ring; + else + rx_ring = p->rx_ring; + + skb = rx_ring[wr_id].skb; if (unlikely(wc->status != IB_WC_SUCCESS)) { ipoib_dbg(priv, "cm recv error " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); ++dev->stats.rx_dropped; - goto repost; + if (!p->rx_ring) + goto repost; + else { + if (!--p->recv_count) { + spin_lock_irqsave(&priv->lock, flags); + list_move(&p->list, &priv->cm.rx_reap_list); + spin_unlock_irqrestore(&priv->lock, flags); + queue_work(ipoib_workqueue, &priv->cm.rx_reap_task); + } + return; + } } if (unlikely(!(wr_id & IPOIB_CM_RX_UPDATE_MASK))) { - p = wc->qp->qp_context; if (p && time_after_eq(jiffies, p->jiffies + IPOIB_CM_RX_UPDATE_TIME)) { spin_lock_irqsave(&priv->lock, flags); p->jiffies = jiffies; @@ -450,7 +582,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc 
*wc) frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, mapping); + newskb = ipoib_cm_alloc_rx_skb(dev, rx_ring, wr_id, frags, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -461,8 +593,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) goto repost; } - ipoib_cm_dma_unmap_rx(priv, frags, priv->cm.srq_ring[wr_id].mapping); - memcpy(priv->cm.srq_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); + ipoib_cm_dma_unmap_rx(priv, frags, rx_ring[wr_id].mapping); + memcpy(rx_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); @@ -483,9 +615,17 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) netif_receive_skb(skb); repost: - if (unlikely(ipoib_cm_post_receive(dev, wr_id))) - ipoib_warn(priv, "ipoib_cm_post_receive failed " - "for buf %d\n", wr_id); + if (p->rx_ring) { + if (unlikely(ipoib_cm_post_receive_nonsrq(dev, p, wr_id))) { + --p->recv_count; + ipoib_warn(priv, "ipoib_cm_post_receive_nonsrq failed " + "for buf %d\n", wr_id); + } + } else { + if (unlikely(ipoib_cm_post_receive_srq(dev, wr_id))) + ipoib_warn(priv, "ipoib_cm_post_receive_srq failed " + "for buf %d\n", wr_id); + } } static inline int post_send(struct ipoib_dev_priv *priv, @@ -495,10 +635,10 @@ static inline int post_send(struct ipoib_dev_priv *priv, { struct ib_send_wr *bad_wr; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; + priv->tx_sge.addr = addr; + priv->tx_sge.length = len; - priv->tx_wr.wr_id = wr_id | IPOIB_OP_CM; + priv->tx_wr.wr_id = wr_id | IPOIB_OP_CM; return ib_post_send(tx->qp, &priv->tx_wr, &bad_wr); } @@ -540,7 +680,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ tx_req->mapping = addr; if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), - addr, skb->len))) { + addr, skb->len))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE); @@ -657,10 +797,33 @@ err_cm: return ret; } +static void ipoib_cm_free_rx_reap_list(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_cm_rx *rx, *n; + LIST_HEAD(list); + + spin_lock_irq(&priv->lock); + list_splice_init(&priv->cm.rx_reap_list, &list); + spin_unlock_irq(&priv->lock); + + list_for_each_entry_safe(rx, n, &list, list) { + ib_destroy_cm_id(rx->id); + ib_destroy_qp(rx->qp); + if (!ipoib_cm_has_srq(dev)) { + ipoib_cm_free_rx_ring(priv->dev, rx->rx_ring); + spin_lock_irq(&priv->lock); + --priv->cm.nonsrq_conn_qp; + spin_unlock_irq(&priv->lock); + } + kfree(rx); + } +} + void ipoib_cm_dev_stop(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ipoib_cm_rx *p, *n; + struct ipoib_cm_rx *p; unsigned long begin; LIST_HEAD(list); int ret; @@ -706,15 +869,9 @@ void ipoib_cm_dev_stop(struct net_device *dev) spin_lock_irq(&priv->lock); } - list_splice_init(&priv->cm.rx_reap_list, &list); - spin_unlock_irq(&priv->lock); - list_for_each_entry_safe(p, n, &list, list) { - ib_destroy_cm_id(p->id); - ib_destroy_qp(p->qp); - kfree(p); - } + ipoib_cm_free_rx_reap_list(dev); cancel_delayed_work(&priv->cm.stale_task); } @@ -799,7 +956,7 @@ static struct ib_qp *ipoib_cm_create_tx_qp(struct net_device *dev, struct ipoib_ .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_RC, .qp_context = tx - }; + }; return 
ib_create_qp(priv->pd, &attr); } @@ -816,28 +973,28 @@ static int ipoib_cm_send_req(struct net_device *dev, data.qpn = cpu_to_be32(priv->qp->qp_num); data.mtu = cpu_to_be32(IPOIB_CM_BUF_SIZE); - req.primary_path = pathrec; - req.alternate_path = NULL; - req.service_id = cpu_to_be64(IPOIB_CM_IETF_ID | qpn); - req.qp_num = qp->qp_num; - req.qp_type = qp->qp_type; - req.private_data = &data; - req.private_data_len = sizeof data; - req.flow_control = 0; + req.primary_path = pathrec; + req.alternate_path = NULL; + req.service_id = cpu_to_be64(IPOIB_CM_IETF_ID | qpn); + req.qp_num = qp->qp_num; + req.qp_type = qp->qp_type; + req.private_data = &data; + req.private_data_len = sizeof data; + req.flow_control = 0; - req.starting_psn = 0; /* FIXME */ + req.starting_psn = 0; /* FIXME */ /* * Pick some arbitrary defaults here; we could make these * module parameters if anyone cared about setting them. */ - req.responder_resources = 4; - req.remote_cm_response_timeout = 20; - req.local_cm_response_timeout = 20; - req.retry_count = 0; /* RFC draft warns against retries */ - req.rnr_retry_count = 0; /* RFC draft warns against retries */ - req.max_cm_retries = 15; - req.srq = 1; + req.responder_resources = 4; + req.remote_cm_response_timeout = 20; + req.local_cm_response_timeout = 20; + req.retry_count = 0; /* RFC draft warns against retries */ + req.rnr_retry_count = 0; /* RFC draft warns against retries */ + req.max_cm_retries = 15; + req.srq = ipoib_cm_has_srq(dev); return ib_send_cm_req(id, &req); } @@ -1150,7 +1307,7 @@ static void ipoib_cm_skb_reap(struct work_struct *work) spin_unlock_irq(&priv->tx_lock); } -void ipoib_cm_skb_too_long(struct net_device* dev, struct sk_buff *skb, +void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb, unsigned int mtu) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -1166,20 +1323,8 @@ void ipoib_cm_skb_too_long(struct net_device* dev, struct sk_buff *skb, static void ipoib_cm_rx_reap(struct work_struct *work) { - struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv, - cm.rx_reap_task); - struct ipoib_cm_rx *p, *n; - LIST_HEAD(list); - - spin_lock_irq(&priv->lock); - list_splice_init(&priv->cm.rx_reap_list, &list); - spin_unlock_irq(&priv->lock); - - list_for_each_entry_safe(p, n, &list, list) { - ib_destroy_cm_id(p->id); - ib_destroy_qp(p->qp); - kfree(p); - } + ipoib_cm_free_rx_reap_list(container_of(work, struct ipoib_dev_priv, + cm.rx_reap_task)->dev); } static void ipoib_cm_stale_task(struct work_struct *work) @@ -1212,7 +1357,7 @@ static void ipoib_cm_stale_task(struct work_struct *work) } -static ssize_t show_mode(struct device *d, struct device_attribute *attr, +static ssize_t show_mode(struct device *d, struct device_attribute *attr, char *buf) { struct ipoib_dev_priv *priv = netdev_priv(to_net_dev(d)); @@ -1255,7 +1400,7 @@ int ipoib_cm_add_mode_attr(struct net_device *dev) return device_create_file(&dev->dev, &dev_attr_mode); } -int ipoib_cm_dev_init(struct net_device *dev) +static void ipoib_cm_create_srq(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ib_srq_init_attr srq_init_attr = { @@ -1264,7 +1409,30 @@ int ipoib_cm_dev_init(struct net_device *dev) .max_sge = IPOIB_CM_RX_SG } }; - int ret, i; + + priv->cm.srq = ib_create_srq(priv->pd, &srq_init_attr); + if (IS_ERR(priv->cm.srq)) { + if (PTR_ERR(priv->cm.srq) != -ENOSYS) + printk(KERN_WARNING "%s: failed to allocate SRQ, error %ld\n", + priv->ca->name, PTR_ERR(priv->cm.srq)); + priv->cm.srq = NULL; + return; + } + + 
priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring, + GFP_KERNEL); + if (!priv->cm.srq_ring) { + printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n", + priv->ca->name, ipoib_recvq_size); + ib_destroy_srq(priv->cm.srq); + priv->cm.srq = NULL; + } +} + +int ipoib_cm_dev_init(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; INIT_LIST_HEAD(&priv->cm.passive_ids); INIT_LIST_HEAD(&priv->cm.reap_list); @@ -1281,22 +1449,6 @@ int ipoib_cm_dev_init(struct net_device *dev) skb_queue_head_init(&priv->cm.skb_queue); - priv->cm.srq = ib_create_srq(priv->pd, &srq_init_attr); - if (IS_ERR(priv->cm.srq)) { - ret = PTR_ERR(priv->cm.srq); - priv->cm.srq = NULL; - return ret; - } - - priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring, - GFP_KERNEL); - if (!priv->cm.srq_ring) { - printk(KERN_WARNING "%s: failed to allocate CM ring (%d entries)\n", - priv->ca->name, ipoib_recvq_size); - ipoib_cm_dev_cleanup(dev); - return -ENOMEM; - } - for (i = 0; i < IPOIB_CM_RX_SG; ++i) priv->cm.rx_sge[i].lkey = priv->mr->lkey; @@ -1307,17 +1459,25 @@ int ipoib_cm_dev_init(struct net_device *dev) priv->cm.rx_wr.sg_list = priv->cm.rx_sge; priv->cm.rx_wr.num_sge = IPOIB_CM_RX_SG; - for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, i, IPOIB_CM_RX_SG - 1, - priv->cm.srq_ring[i].mapping)) { - ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); - ipoib_cm_dev_cleanup(dev); - return -ENOMEM; - } - if (ipoib_cm_post_receive(dev, i)) { - ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); - ipoib_cm_dev_cleanup(dev); - return -EIO; + ipoib_cm_create_srq(dev); + + if (ipoib_cm_has_srq(dev)) { + for (i = 0; i < ipoib_recvq_size; ++i) { + if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i, + IPOIB_CM_RX_SG - 1, + priv->cm.srq_ring[i].mapping)) { + ipoib_warn(priv, "failed to allocate " + "receive buffer %d\n", i); + ipoib_cm_dev_cleanup(dev); + return -ENOMEM; + } + + if (ipoib_cm_post_receive_srq(dev, i)) { + ipoib_warn(priv, "ipoib_cm_post_receive_srq " + "failed for buf %d\n", i); + ipoib_cm_dev_cleanup(dev); + return -EIO; + } } } @@ -1328,7 +1488,7 @@ int ipoib_cm_dev_init(struct net_device *dev) void ipoib_cm_dev_cleanup(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); - int i, ret; + int ret; if (!priv->cm.srq) return; @@ -1342,13 +1502,7 @@ void ipoib_cm_dev_cleanup(struct net_device *dev) priv->cm.srq = NULL; if (!priv->cm.srq_ring) return; - for (i = 0; i < ipoib_recvq_size; ++i) - if (priv->cm.srq_ring[i].skb) { - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - priv->cm.srq_ring[i].mapping); - dev_kfree_skb_any(priv->cm.srq_ring[i].skb); - priv->cm.srq_ring[i].skb = NULL; - } - kfree(priv->cm.srq_ring); + + ipoib_cm_free_rx_ring(dev, priv->cm.srq_ring); priv->cm.srq_ring = NULL; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 5063dd5..52bc2bd 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -345,12 +345,12 @@ static inline int post_send(struct ipoib_dev_priv *priv, { struct ib_send_wr *bad_wr; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; + priv->tx_sge.addr = addr; + priv->tx_sge.length = len; - priv->tx_wr.wr_id = wr_id; + priv->tx_wr.wr_id = wr_id; priv->tx_wr.wr.ud.remote_qpn = qpn; - priv->tx_wr.wr.ud.ah = address; + priv->tx_wr.wr.ud.ah = address; return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } diff --git 
a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a03a65e..623458e 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -471,8 +471,8 @@ static struct ipoib_path *path_rec_create(struct net_device *dev, void *gid) INIT_LIST_HEAD(&path->neigh_list); memcpy(path->pathrec.dgid.raw, gid, sizeof (union ib_gid)); - path->pathrec.sgid = priv->local_gid; - path->pathrec.pkey = cpu_to_be16(priv->pkey); + path->pathrec.sgid = priv->local_gid; + path->pathrec.pkey = cpu_to_be16(priv->pkey); path->pathrec.numb_path = 1; path->pathrec.traffic_class = priv->broadcast->mcmember.traffic_class; @@ -947,34 +947,34 @@ static void ipoib_setup(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); - dev->open = ipoib_open; - dev->stop = ipoib_stop; - dev->change_mtu = ipoib_change_mtu; - dev->hard_start_xmit = ipoib_start_xmit; - dev->tx_timeout = ipoib_timeout; - dev->header_ops = &ipoib_header_ops; - dev->set_multicast_list = ipoib_set_mcast_list; - dev->neigh_setup = ipoib_neigh_setup_dev; + dev->open = ipoib_open; + dev->stop = ipoib_stop; + dev->change_mtu = ipoib_change_mtu; + dev->hard_start_xmit = ipoib_start_xmit; + dev->tx_timeout = ipoib_timeout; + dev->header_ops = &ipoib_header_ops; + dev->set_multicast_list = ipoib_set_mcast_list; + dev->neigh_setup = ipoib_neigh_setup_dev; netif_napi_add(dev, &priv->napi, ipoib_poll, 100); - dev->watchdog_timeo = HZ; + dev->watchdog_timeo = HZ; - dev->flags |= IFF_BROADCAST | IFF_MULTICAST; + dev->flags |= IFF_BROADCAST | IFF_MULTICAST; /* * We add in INFINIBAND_ALEN to allow for the destination * address "pseudoheader" for skbs without neighbour struct. */ - dev->hard_header_len = IPOIB_ENCAP_LEN + INFINIBAND_ALEN; - dev->addr_len = INFINIBAND_ALEN; - dev->type = ARPHRD_INFINIBAND; - dev->tx_queue_len = ipoib_sendq_size * 2; - dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; + dev->hard_header_len = IPOIB_ENCAP_LEN + INFINIBAND_ALEN; + dev->addr_len = INFINIBAND_ALEN; + dev->type = ARPHRD_INFINIBAND; + dev->tx_queue_len = ipoib_sendq_size * 2; + dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; + dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + priv->mcast_mtu = priv->admin_mtu = dev->mtu; memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); @@ -1266,6 +1266,8 @@ static int __init ipoib_init_module(void) ipoib_sendq_size = min(ipoib_sendq_size, IPOIB_MAX_QUEUE_SIZE); ipoib_sendq_size = max(ipoib_sendq_size, IPOIB_MIN_QUEUE_SIZE); + ipoib_max_conn_qp = min(ipoib_max_conn_qp, IPOIB_CM_MAX_CONN_QP); + ret = ipoib_register_debugfs(); if (ret) return ret; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 9bcfc7a..858ada1 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -702,7 +702,7 @@ void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb) out: if (mcast && mcast->ah) { - if (skb->dst && + if (skb->dst && skb->dst->neighbour && !*to_ipoib_neigh(skb->dst->neighbour)) { struct ipoib_neigh *neigh = ipoib_neigh_alloc(skb->dst->neighbour, @@ -710,7 +710,7 @@ out: if (neigh) { kref_get(&mcast->ah->ref); - neigh->ah = mcast->ah; + neigh->ah = mcast->ah; list_add_tail(&neigh->list, &mcast->neigh_list); } } diff --git 
a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 3c6e45d..433e99a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -172,8 +172,12 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) size = ipoib_sendq_size + ipoib_recvq_size + 1; ret = ipoib_cm_dev_init(dev); - if (!ret) - size += ipoib_recvq_size + 1 /* 1 extra for rx_drain_qp */; + if (!ret) { + if (ipoib_cm_has_srq(dev)) + size += ipoib_recvq_size + 1; /* 1 extra for rx_drain_qp */ + else + size += ipoib_recvq_size * ipoib_max_conn_qp; + } priv->cq = ib_create_cq(priv->ca, ipoib_ib_completion, NULL, dev, size, 0); if (IS_ERR(priv->cq)) { @@ -197,12 +201,12 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; - priv->tx_sge.lkey = priv->mr->lkey; + priv->tx_sge.lkey = priv->mr->lkey; - priv->tx_wr.opcode = IB_WR_SEND; - priv->tx_wr.sg_list = &priv->tx_sge; - priv->tx_wr.num_sge = 1; - priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->tx_wr.opcode = IB_WR_SEND; + priv->tx_wr.sg_list = &priv->tx_sge; + priv->tx_wr.num_sge = 1; + priv->tx_wr.send_flags = IB_SEND_SIGNALED; return 0;