Commit Graph

356 Commits

Author SHA1 Message Date
Navdeep Parhar
1fa10c920e Make sure *some* edc is setup even for an unknown transceiver (assume
it is optical).
2009-11-13 00:31:51 +00:00
Navdeep Parhar
7ead19d4a8 sc->rev and is_offload(sc) will always be 0 during probe. Wait till
attach to get correct values.
2009-11-13 00:28:16 +00:00
John Baldwin
e1b17582f4 Take a step towards removing if_watchdog/if_timer. Don't explicitly set
if_watchdog/if_timer to NULL/0 when initializing an ifnet.  if_alloc()
sets those members to NULL/0 already.
2009-11-06 14:55:01 +00:00
Navdeep Parhar
c01f2b8301 cxgb(4) updates, including:
- support for the new Gen-2, BT, and LP-CR cards.
- T3 firmware 7.7.0
- shared "common code" updates.

Approved by:	gnn (mentor)
Obtained from:	Chelsio
MFC after:	1 month
2009-10-05 20:21:41 +00:00
Navdeep Parhar
81af4f18d6 There is no need to log anything for a ctrlq stall or restart. These are
normal events.

Approved by:	gnn (mentor)
MFC after:	1 month
2009-09-09 18:55:18 +00:00
John Baldwin
6b2eaa836f Fill the reverse RSS map with 0xff's so that the subsequent loop to
calculate the values will work properly.

Reviewed by:	np
MFC after:	1 month
2009-09-04 21:00:45 +00:00
Robert Watson
315e3e38fa Many network stack subsystems use a single global data structure to hold
all pertinent statatistics for the subsystem.  These structures are
sometimes "borrowed" by kernel modules that require a place to store
statistics for similar events.

Add KPI accessor functions for statistics structures referenced by kernel
modules so that they no longer encode certain specifics of how the data
structures are named and stored.  This change is intended to make it
easier to move to per-CPU network stats following 8.0-RELEASE.

The following modules are affected by this change:

      if_bridge
      if_cxgb
      if_gif
      ip_mroute
      ipdivert
      pf

In practice, most of these statistics consumers should, in fact, maintain
their own statistics data structures rather than borrowing structures
from the base network stack.  However, that change is too agressive for
this point in the release cycle.

Reviewed by:	bz
Approved by:	re (kib)
2009-08-02 19:43:32 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
Lawrence Stewart
5f1ff8136a Fix a buglet that slipped into r195654. My buildworld/buildkernel sanity
check missed this because cxgb's TOM is currently commented out of the build
system.

Submitted by:	Navdeep Parhar <np at FreeBSD dot org>
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-14 11:53:21 +00:00
Lawrence Stewart
237fbe0a1c Replace struct tcpopt with a proxy toeopt struct in the TOE driver interface to
the TCP syncache. This returns struct tcpopt to being private within the TCP
implementation, thus allowing it to be modified without ABI concerns.

The patch breaks the ABI. Bump __FreeBSD_version to 800103 accordingly. The cxgb
driver is the only TOE consumer affected by this change, and needs to be
recompiled along with the kernel.

Suggested by:	rwatson
Reviewed by:	rwatson, kmacy
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-13 11:51:02 +00:00
Navdeep Parhar
adb1423aa6 Fix cxgb(4) panic with jumbo frames.
Reviewed by:	kmacy
Approved by:	re (kib), gnn (mentor)
2009-07-09 19:27:58 +00:00
Robert Watson
5157862aa2 Use if_maddr_rlock() instead of IF_ADDR_LOCK() to protect access to
if_multiaddrs in if_cxgb.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 19:04:08 +00:00
Navdeep Parhar
fd3e790228 mvec routines should have no knowledge of the SG engine.
Reviewed by:	kmacy
Approved by:	gnn (mentor)
2009-06-25 21:50:15 +00:00
Navdeep Parhar
2975f78738 Various ifmedia related fixes in cxgb(4), including:
- build ifmedia list based on phy->caps, not string comparisons.
- rebuild media list when a transceiver change is detected.
- return EOPNOTSUPP instead of ENXIO in cxgb_media_status.

Approved by:	gnn (mentor)
MFC after:	2 weeks.
2009-06-24 21:56:05 +00:00
Bjoern A. Zeeb
5736e6fb9d After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.
2009-06-23 17:03:45 +00:00
Navdeep Parhar
16daf36814 Fix cxgb's ifmedia ioctl handling. Also fixed a comment.
Reviewed by:	kmacy
Approved by:	gnn (mentor)
2009-06-22 21:42:57 +00:00
Robert Watson
8896f83a58 Add a new function, ifa_ifwithaddr_check(), which rather than returning
a pointer to an ifaddr matching the passed socket address, returns a
boolean indicating whether one was present.  In the (near) future,
ifa_ifwithaddr() will return a referenced ifaddr rather than a raw
ifaddr pointer, and the new wrapper will allow callers that care only
about the boolean condition to avoid having to free that reference.

MFC after:	3 weeks
2009-06-22 10:59:34 +00:00
Kip Macy
e321e16bfb fix !x86 cxgb compile 2009-06-21 01:17:38 +00:00
Kip Macy
8b6dccee61 fix typo in conditional 2009-06-20 19:09:41 +00:00
Kip Macy
75417d6de3 - fix dma map handling for !x86 case
- fix allocation failure handing in refill_fl
2009-06-20 18:57:14 +00:00
Kip Macy
3f345a5d09 Greatly simplify cxgb by removing almost all of the custom mbuf management logic
- remove mbuf iovec - useful, but adds too much complexity when isolated to
   the driver

- remove driver private caching - insufficient benefit over UMA to justify
  the added complexity and maintenance overhead

- remove separate logic for managing multiple transmit queues, with the
  new drbr routines the control flow can be made to much more closely resemble
  legacy drivers

- remove dedicated service threads, with per-cpu callouts one can get the same
  benefit much more simply by registering a callout 1 tick in the future if there
  are still buffered packets

- remove embedded mbuf usage - Jeffr's changes will (I hope) soon be integrated
  greatly reducing the overhead of using kernel APIs for reference counting
  clusters

- add hysteresis to descriptor coalescing logic

- add coalesce threshold sysctls to allow users to decide at run-time
  between optimizing for forwarding / UDP or optimizing for TCP

- add once per second watchdog to effectively close the very rare races
  occurring from coalescing

- incorporate Navdeep's changes to the initialization path required to
  convert port and adapter locks back to ordinary mutexes (silencing BPF
  LOR complaints)

- enable prefetches in get_packet and tx cleaning

Reviewed by:	navdeep@
MFC after:	2 weeks
2009-06-19 23:34:32 +00:00
Sam Leffler
d659538f72 r193336 moved ifq_detach to if_free which broke if_alloc followed
by if_free (w/o doing if_attach); move ifq_attach to if_alloc and
rename ifq_attach/detach to ifq_init/ifq_delete to better identify
their purpose

Reviewed by:	jhb, kmacy
2009-06-15 19:50:03 +00:00
George V. Neville-Neil
02c7d9a64f Re-add the send queue tunable for people who do not use buffering.
Reviewed by:	jhb
MFC after:	3 days
2009-06-11 21:32:26 +00:00
George V. Neville-Neil
74b0900f05 Add a missing error statistic, the number of FCS errors on receive.
Reviewed by:	jhb
MFC after:	1 day
2009-06-10 14:34:56 +00:00
Kip Macy
a913be0917 - add drbr routines for accessing #qentries and conditionally dequeueing
- track bytes enqueued in buf_ring
2009-06-09 19:19:16 +00:00
Bjoern A. Zeeb
8d8bc0182e After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.
2009-06-08 19:57:35 +00:00
John Baldwin
74fb0ba732 Rework socket upcalls to close some races with setup/teardown of upcalls.
- Each socket upcall is now invoked with the appropriate socket buffer
  locked.  It is not permissible to call soisconnected() with this lock
  held; however, so socket upcalls now return an integer value.  The two
  possible values are SU_OK and SU_ISCONNECTED.  If an upcall returns
  SU_ISCONNECTED, then the soisconnected() will be invoked on the
  socket after the socket buffer lock is dropped.
- A new API is provided for setting and clearing socket upcalls.  The
  API consists of soupcall_set() and soupcall_clear().
- To simplify locking, each socket buffer now has a separate upcall.
- When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from
  the receive socket buffer automatically.  Note that a SO_SND upcall
  should never return SU_ISCONNECTED.
- All this means that accept filters should now return SU_ISCONNECTED
  instead of calling soisconnected() directly.  They also no longer need
  to explicitly clear the upcall on the new socket.
- The HTTP accept filter still uses soupcall_set() to manage its internal
  state machine, but other accept filters no longer have any explicit
  knowlege of socket upcall internals aside from their return value.
- The various RPC client upcalls currently drop the socket buffer lock
  while invoking soreceive() as a temporary band-aid.  The plan for
  the future is to add a new flag to allow soreceive() to be called with
  the socket buffer locked.
- The AIO callback for socket I/O is now also invoked with the socket
  buffer locked.  Previously sowakeup() would drop the socket buffer
  lock only to call aio_swake() which immediately re-acquired the socket
  buffer lock for the duration of the function call.

Discussed with:	rwatson, rmacklem
2009-06-01 21:17:03 +00:00
Marko Zec
b2bc853659 Update VNET base pointer setting macro to use a correct source of
vnet context.

Approved by:	julian (mentor)
2009-06-01 21:10:23 +00:00
George V. Neville-Neil
e3503bc98d Rework interrupt bringup and teardown.
Calculate the exact number of vectors we'll use before calling
pci_alloc_msix.  Don't grab nine all the time.

Call cxgb_setup_interrupts once per T3, not once per port.  Ditto
for cxgb_teardown_interrupts.

Don't leak resources when interrupt setup fails in the middle.

Obtained from:	Navdeep Parhar
MFC after:	10 days
2009-05-27 20:13:36 +00:00
George V. Neville-Neil
7529967798 Partial reversion of previous commit. The CXGB_SHUTDOWN flag does NOT
need to be inverted when doing an ifconfig down of an interface.

Pointed out by:	Navdeep Parhar
MFC after: 1 week
2009-05-22 18:26:47 +00:00
George V. Neville-Neil
c2009a4c90 Fix a possible panic cxgb_controller_attach() routine that would occur
only if prepping the adapter failed.

Slight adjustment to comments.

Fix a bug whereby downing the interface didn't preven it from
processing packets.

Submitted by:	Navdeep Parhar
MFC after:	1 week
2009-05-22 15:06:03 +00:00
George V. Neville-Neil
0bbdea7776 Integrate three changes from Chelsio.
1) Add a sysctl that will say what type of PHYs exist on the card.
2) Fix a bug that occurs when an AEL 2005 PHY resets without a transciever
in the card.
3) Unify the PHY link detection code.

Obtained from:	Navdeep Parhar
MFC after:	10 days
2009-05-21 15:08:03 +00:00
George V. Neville-Neil
3cf138bb79 Modified the attach and detach routines to handle bringing ports up
and down more cleanly.  This addresses a problem where if we have the
link flap during boot the driver would lock up the system.

Reviewed by:	jhb
MFC after:	1 week
2009-05-21 14:43:12 +00:00
Warner Losh
00b4e54ae7 We no longer need to use d_thread_t, migrate to struct thread *. 2009-05-20 17:29:21 +00:00
Kip Macy
24df53962e fix bug introduced by last change
Submitted by:	Navdeep Parhar
2009-05-12 03:30:25 +00:00
Marko Zec
21ca7b57bd Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one.  The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE().  Recursions
on curvnet are permitted, though strongly discuouraged.

This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.

The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc.  Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.

The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.

This change also introduces a DDB subcommand to show the list of all
vnet instances.

Approved by:	julian (mentor)
2009-05-05 10:56:12 +00:00
Kip Macy
bf541f371e simplify by removing dead code 2009-04-27 22:54:30 +00:00
Robert Watson
78b5071407 Update stats in struct tcpstat using two new macros, TCPSTAT_ADD() and
TCPSTAT_INC(), rather than directly manipulating the fields across the
kernel.  This will make it easier to change the implementation of
these statistics, such as using per-CPU versions of the data structures.

MFC after:	3 days
2009-04-11 22:07:19 +00:00
Kip Macy
80cb9f211a Import "flowid" support for serializing flows across transmit queues
Reviewed by:	rwatson and jeli
2009-04-10 06:16:14 +00:00
George V. Neville-Neil
0c1ff9c605 Minor updates to the Chelsio driver, including removing an LOR.
Submitted by:	Navdeep Parhar at Chelsio
Reviewed by:	gnn
MFC after: 	3 weeks
2009-03-23 19:58:26 +00:00
George V. Neville-Neil
6b6e256ae1 Fix a bug in the recent update to the Chelsio driver.
The tick routine was not being restarted in the init_locked routine
which could resulted in loss of carrier when updating the MTU.

Submitted by:	Navdeep Parhar at Chelsio
MFC after:	3 weeks
2009-03-21 17:09:00 +00:00
Robert Watson
a6c19108c5 Prefer ENETDOWN to ENXIO when returning queuing errors due to a link
down, interface down, etc, with if_cxgb's if_transmit routine.

MFC after:	3 days
Reviewed by:	kmacy
2009-03-10 22:35:45 +00:00
George V. Neville-Neil
f2d8ff04fe Update the Chelsio driver to the latest bits from Chelsio
Firmware upgraded to 7.1.0 (from 5.0.0).
T3C EEPROM and SRAM added; Code to update eeprom/sram fixed.
fl_empty and rx_fifo_ovfl counters can be observed via sysctl.
Two new cxgbtool commands to get uP logic analyzer info and uP IOQs
Synced up with Chelsio's "common code" (as of 03/03/09)

Submitted by:	 Navdeep Parhar at Chelsio
Reviewed by:	gnn
MFC after:	2 weeks
2009-03-10 19:22:45 +00:00
Bjoern A. Zeeb
33553d6e99 For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.
2009-02-27 14:12:05 +00:00
George V. Neville-Neil
837f41b067 Check in the actual module recognition code for the Chelsio
driver.

Obtained from:	Chelsio Inc.
2008-12-18 14:21:35 +00:00
Bjoern A. Zeeb
dcdb4371ca Use inc_flags instead of the inc_isipv6 alias which so far
had been the only flag with random usage patterns.
Switch inc_flags to be used as a real bit field by using
INC_ISIPV6 with bitops to check for the 'isipv6' condition.

While here fix a place or two where in case of v4 inc_flags
were not properly initialized before.[1]

Found by:	rwatson during review [1]
Discussed with:	rwatson
Reviewed by:	rwatson
MFC after:	4 weeks
2008-12-17 12:52:34 +00:00
Qing Li
6e6b3f7cbc This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
   possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,

The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
  the last piece of the puzzle, Kip has also been conducting
  active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
  provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
  me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
George V. Neville-Neil
7f15419bb7 Bug fix to support N310 version of Chelsio cards (board ID 1088).
Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-06 02:10:53 +00:00
George V. Neville-Neil
5197f3abd7 Re submit code to print the part and serial number for Chelsio cards.
The original code was accidentally removed in another commit.

MFC after: 1 day
2008-12-05 21:40:11 +00:00
George V. Neville-Neil
9036240993 Fix a bug with the ael1006 PHY. The bug shows up as persistent but incomplete
packet loss, of between 10-30%. The fix is to put the PHY into
and take it out of local loopback mode when resetting the interface.

Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-04 20:32:53 +00:00
Bjoern A. Zeeb
4b79449e2f Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
George V. Neville-Neil
c9c0c99f30 Bug fix from Chelsio which addresses the issue of the device resetting
when it sees only received packets.  In some cases where a device only
recieves data it mistakenly thinks that its transmitting side is broken
and resets the device.

Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-02 15:42:47 +00:00
Kip Macy
f3c713fdaf - fix bug where dnsperf would stop transmitting after a few seconds
- break complex conditionals in to multiple lines to avoid wrapping
- remove copious unused debug statements
- be more aggressive about cleaning in the calling thread
- eliminate usage of ENOSPC
- increase number of iterations that cxgbsp can do
- eliminate "initerr" usage to simplify ENOBUFS handling
- when coalescing pass all packets to BPF
- always set overrun if hardware queue is full
2008-12-02 07:01:18 +00:00
Kip Macy
e44730ed21 The pkthdr field is flowid not rss_hash 2008-12-02 00:51:56 +00:00
Kip Macy
6b9003b6e6 - fix multiqueue conditional
- don't leak mbuf tags in the non-conditional case

Found by: Navdeep Parhar
2008-12-02 00:48:08 +00:00
Kip Macy
6b58e3f4db integrate use after free fixes from private branch
Found by: kkenn@
2008-12-02 00:39:50 +00:00
Kip Macy
69c4b66c6d null out m_next when marshalling a packet 2008-12-01 05:44:08 +00:00
Kip Macy
f35c2d6551 Update internal mac stats every time the tick task is called
if we don't do this "netstat -w 1" will frequently see negative
differences in packets sent
2008-12-01 05:43:30 +00:00
Kip Macy
45839f4aeb don't manually track statistics 2008-12-01 04:42:39 +00:00
Kip Macy
ceac50eb77 Proper fix for tracking ifnet statistics 2008-12-01 04:41:45 +00:00
Kip Macy
5eba27fe0c Add backward compatibility ifdefs for non-multiq kernels 2008-11-23 07:30:07 +00:00
Kip Macy
5d96be4eb8 work around periodic leak on queue overrun by enabling coalescing of packets in to
work requests by default
2008-11-23 00:22:52 +00:00
Kip Macy
098fb9b469 intr_machdep.h breaks build on some arches and is not needed 2008-11-23 00:13:25 +00:00
Kip Macy
a02573bc06 - enable multiple transmit queues
- invert sense of hw.cxgb.singleq tunable to hw.cxgb.multiq
- don't wake up transmitting thread by default
- add per tx queue ifaltq to handle ALTQ
- remove several unused functions in cxgb_multiq.c
- add several sysctls: multiq_tx_enable, coalesce_tx_enable,
  and wakeup_tx_thread
- this obsoletes the hw.cxgb.snd_queue_len as ifq is replaced
  by a buf_ring
2008-11-22 08:05:05 +00:00
Kip Macy
db7f0b974f - bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
  allow drivers to efficiently manage multiple hardware queues
  (i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.
2008-11-22 05:55:56 +00:00
George V. Neville-Neil
2a1b9f07fc Several small additions to the Chelsio 10G driver.
1) Fix a bug in dealing with the Alerus 1006 PHY which prevented the
device from ever coming back up once it had been set to down.

2) Add a kernel tunable (hw.cxgb.snd_queue_len) which makes it possible
to give the device more than IFQ_MAXLEN entries in its send queue.  The
default remains 50.

3) Add code to place the card'd identification and serial number into
its description (%desc) so that users can tell which card they have
installed.
2008-11-21 19:22:25 +00:00
Marko Zec
44e33a0758 Change the initialization methodology for global variables scheduled
for virtualization.

Instead of initializing the affected global variables at instatiation,
assign initial values to them in initializer functions.  As a rule,
initialization at instatiation for such variables should never be
introduced again from now on.  Furthermore, enclose all instantiations
of such global variables in #ifdef VIMAGE_GLOBALS blocks.

Essentialy, this change should have zero functional impact.  In the next
phase of merging network stack virtualization infrastructure from
p4/vimage branch, the new initialization methology will allow us to
switch between using global variables and their counterparts residing in
virtualization containers with minimum code churn, and in the long run
allow us to intialize multiple instances of such container structures.

Discussed at:	devsummit Strassburg
Reviewed by:	bz, julian
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-11-19 09:39:34 +00:00
Kip Macy
d1340303b9 Update firmware version check
make ddp a tunable

Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-11-12 04:45:09 +00:00
Bjoern A. Zeeb
ca6bc428e7 For now our LRO code (tcp_lro.c) only supports IPv4 properly thus
only enable if INET is on.

Reviewed by:	kmacy
MFC after:	2 months
2008-11-06 10:35:46 +00:00
Bjoern A. Zeeb
34627f9384 Hide AF_INET specific ioctl handling under #ifdef INET.
Reviewed by:	kmacy
MFC after:	2 months
2008-11-06 10:17:57 +00:00
Kip Macy
9762ac4204 Track number of packets transmitted and number of packets received
PR:	125806
MFC after:	3 days
2008-10-17 07:04:29 +00:00
Kip Macy
5ec372d1fa Fix bug in LRO on T304 whereby a packet could be sent to the wrong interface's ifp.
Submitted by:	Chelsio Inc.
MFC after:	1 day
2008-10-03 00:50:26 +00:00
Marko Zec
8b615593fc Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by:	julian, bz, brooks, zec
Reviewed by:	julian, bz, brooks, kris, rwatson, ...
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-10-02 15:37:58 +00:00
Kip Macy
1cfe52493f update callers of vm_fault_hold_user_pages
MFC after:	1 week
2008-09-30 23:45:22 +00:00
Kip Macy
7585919a7f Refactor vm_fault_hold_user_pages:
- simplify page hold logic
- allow pages for processes other than that of curthread to
  have pages held
- normalize the interface to more closely resemble the functions in
  sys/vm

MFC after:	1 week
2008-09-30 23:44:44 +00:00
Kip Macy
41509ecd3a Make sure that optical PHYs work ...
Submitted by:	Chelsio Inc.
MFC after:	1 day
2008-09-30 21:21:52 +00:00
Kip Macy
e2b2d0e9d5 vm_fault_hold_user_pages will not return if an address in the range passed in is mapped RO
but an RW mapping exists for the underlying page. This change fixes the bug by using the
page / NULL returned from pmap_extract_and_hold to determine whether or not vm_fault needs
to be called.

The bug was pointed out by alc.

MFC after:	3 days
2008-09-29 22:13:29 +00:00
Kip Macy
82c2cf3b05 fix insta-panic:
- determine which ext_arg offsets to use based on the version number

Submitted by:	Chelsio Inc.
MFC after:	1 day
2008-09-25 06:46:28 +00:00
Kip Macy
a7db7fbd35 - Remove default NIC dependency on ulp headers
- make toe module build dependent on kernel support

Submitted by:	Chelsio Inc.
MFC after:	1 week
2008-09-24 01:19:08 +00:00
Kip Macy
79775f8f1b Update cxgb include paths to not require prefixing with dev/cxgb
Submitted by:	Chelsio Inc.
2008-09-23 03:16:54 +00:00
Kip Macy
e97121da99 Allow cxgb to be unified across versions by making newer features conditional
Submitted by:	Chelsio Inc
MFC after:	3 days
2008-09-23 02:22:24 +00:00
Kip Macy
9f58ea1678 - Fix flag check
- Fix adaptive thread sleep
- set oactive when queue is full
2008-09-23 01:55:36 +00:00
Kip Macy
c7e1ab2e4c - Track number of times that the transmit queue overflowed
- Trivial whitespace cleanup

MFC after:	3 days
2008-09-23 01:27:19 +00:00
Kip Macy
023d231918 Fix issue with tom loading by moving cxgb_log_tcb in to tom
MFC after:	3 days
2008-09-19 21:12:19 +00:00
Kip Macy
7c2bd2b9e2 Fix two panics:
1. panic: rtalloc1_fib: bad fibnum

2. panic: Lock tcpinp not exclusively locked
@ /usr/src/sys/netinet/in_pcb.c:1284

Submitted by:	Chelsio Inc.
MFC after:	3 days
2008-09-18 23:56:42 +00:00
Attilio Rao
cecd8edba5 Remove the suser(9) interface from the kernel. It has been replaced from
years by the priv_check(9) interface and just very few places are left.
Note that compatibility stub with older FreeBSD version
(all above the 8 limit though) are left in order to reduce diffs against
old versions. It is responsibility of the maintainers for any module, if
they think it is the case, to axe out such cases.

This patch breaks KPI so __FreeBSD_version will be bumped into a later
commit.

This patch needs to be credited 50-50 with rwatson@ as he found time to
explain me how the priv_check() works in detail and to review patches.

Tested by:      Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reviewed by:    rwatson
2008-09-17 15:49:44 +00:00
Kip Macy
c5f0f48581 Further whitespace and copyright cleanups to minimize the
delta with RELENG_7.
2008-09-16 02:28:08 +00:00
Kip Macy
aa819acf89 White space cleanups to bring closer to RELENG_7 2008-09-16 02:03:28 +00:00
Kip Macy
af9b081c37 Remove some dead code along with gratuitous differences between HEAD and 7 2008-09-16 01:02:17 +00:00
Kip Macy
d5e2c3dd04 Fix issue with recovering from transient jumbo mbuf shortage.
Submitted by:	Chelsio Inc.
MFC after:	3 days
2008-09-09 01:36:02 +00:00
Julian Elischer
ad34e48415 New file missed vimagification. 2008-09-03 19:23:01 +00:00
Kip Macy
6eb15755c7 Indicate at probe time if device can do offload and which revision it is
MFC after:	3 days
2008-09-02 22:38:49 +00:00
Kip Macy
1ffd6e5809 Import ioctl updates for latest rev of cxgbtool
Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-09-02 07:47:14 +00:00
Kip Macy
7c80e4f37f Don't check if an interface can do tcp offload if there are no offload devices registered on the system.
Suggested by: rwatson
MFC after:	3 days
2008-09-01 05:30:22 +00:00
Bjoern A. Zeeb
603724d3ab Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
Kip Macy
2bd5f41aae Fix runt TSO packet issue.
Obtained from:	Chelsio Inc.
MFC after:	1 week
2008-08-13 01:32:32 +00:00
Kip Macy
706cb31f0a Add LRO and MAC statistics to exported sysctls.
Obtained from:	Chelsio Inc.
MFC after:	1 week
2008-08-13 01:30:41 +00:00
Kip Macy
25292deb42 Remove cxgb private lro implementation and switch to using system implementation.
Obtained from:	Chelsio Inc.
MFC after:	1 week
2008-08-12 00:27:32 +00:00
Kip Macy
9b4de886f9 Vendor fix for PHY problem.
Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-08-11 23:01:34 +00:00
Kip Macy
006c3d2eb6 remove socketvar.h, add more selective includes 2008-07-31 20:28:58 +00:00
Paul Saab
721c409daf Unbreak the build by including sys/socketvar.h 2008-07-31 01:52:04 +00:00
Kip Macy
c316a7ab32 fix includes for post sockbuf re-factor 2008-07-30 20:08:34 +00:00
Kip Macy
99d803b98f remove call to unsafe tcp_twstart function 2008-07-21 21:23:43 +00:00
Kip Macy
7038c6d2c1 remove unneeded declarations 2008-07-21 02:34:52 +00:00
Kip Macy
9a027ea9a4 remove local version of tcp_offload_* functions 2008-07-21 02:29:40 +00:00
Kip Macy
d21cb942bc update syncache function names 2008-07-21 02:26:49 +00:00
Kip Macy
83324f5cb5 remove cxgb local definition of locked syncache_expand 2008-07-21 02:17:27 +00:00
Kip Macy
e331636d14 remove cxgb local definitions of socket accessor functions 2008-07-21 01:23:19 +00:00
Kip Macy
38ddd4d3ed new vendor PHY support 2008-07-18 07:01:51 +00:00
Kip Macy
4af83c8cff import vendor fixes to cxgb 2008-07-18 06:12:31 +00:00
Kip Macy
71dba7f30c conditionally define PANIC_IF, remove 'unlikely' 2008-05-05 22:37:21 +00:00
Kip Macy
713edd3a06 LINT fixes 2008-05-05 20:41:10 +00:00
Kip Macy
66f645e768 import support for iwarp on Chelsio T3 card
Supported by Chelsio Inc.
2008-05-05 18:46:18 +00:00
Kip Macy
ed0fb18dc6 MFSVN:
- add / remove clients from cxgb_main.c now
 - change ifdef TOE_ENABLED to TCP_OFFLOAD_DISABLE
 - update copyrights
 - fix transmit data mismatch bug caused by not setting SB_NOCOALESCE
   on tx sockbuf on passive connections
 - fix receive sequence mismatch bug caused by not setting SB_NOCOALESCE
   on rx sockbuf on passive connections
 - don't sleep without checking SBS_CANTRCVMORE first
 - various ddp ordering fixes

Supported by: Chelsio Inc.
2008-05-05 01:41:53 +00:00
Kip Macy
6d294e500e remove kdb_backtrace() call 2008-04-19 03:43:06 +00:00
Kip Macy
46b0a854cc move cxgb_lt2.[ch] from NIC to TOE
move most offload functionality from NIC to TOE
factor out all socket and inpcb direct access
factor out access to locking in incpb, pcbinfo, and sockbuf
2008-04-19 03:22:43 +00:00
Robert Watson
8501a69cc9 Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to
explicitly select write locking for all use of the inpcb mutex.
Update some pcbinfo lock assertions to assert locked rather than
write-locked, although in practice almost all uses of the pcbinfo
rwlock main exclusive, and all instances of inpcb lock acquisition
are exclusive.

This change should introduce (ideally) little functional change.
However, it lays the groundwork for significantly increased
parallelism in the TCP/IP code.

MFC after:	3 months
Tested by:	kris (superset of committered patch)
2008-04-17 21:38:18 +00:00
Kip Macy
ef9e6d4c6c reduce the size of the jumbo ring on i386 and disable pcpu cluster caching 2008-03-31 21:02:27 +00:00
Kip Macy
e79dd20dd5 change inp_wlock_assert to inp_lock_assert 2008-03-24 20:24:04 +00:00
Kip Macy
cf7a8ff3b7 remove unneccessary tcbinfo lock acquisitions - set tp to null affter calling enter_timewait as we no longer own the inpcb 2008-03-24 05:21:10 +00:00
Kip Macy
3d5853271e Insulate inpcb consumers outside the stack from the lock type and offset within the pcb by adding accessor functions.
Reviewed by: rwatson
MFC after: 3 weeks
2008-03-23 22:34:16 +00:00
Kip Macy
f705d735fb pay attention to default cluster limits when sizing receive queues 2008-03-20 20:52:37 +00:00
Kip Macy
ef027c528c fix link management bug and conditionally allow the PHY to be kept on at all times for allowing non-conformant link state checks 2008-03-19 20:56:51 +00:00
Kip Macy
19905d6dbd - Integrate 1.133 vendor driver changes
- update some copyrights
- add improved support for delayed ack
- fix issue with fec
2008-03-18 03:55:12 +00:00
Kip Macy
dc50741adc Parameterize for module name 2008-02-26 23:12:55 +00:00
Kip Macy
a8badc1997 Remove unused files 2008-02-26 23:06:22 +00:00
Kip Macy
64a3713337 move remaining binaries in to blob headers 2008-02-26 23:05:05 +00:00
Kip Macy
404825a72b Move firmware in to separate module that can be compiled statically in to the kernel
Add utility for converting future firmware revs to a C header file
2008-02-26 03:02:20 +00:00
Giorgos Keramidas
19deb17618 Spell 'overwriting' correctly in a KASSERT() message. 2008-02-25 19:28:27 +00:00
Kip Macy
88e8506e22 Fix namespace collision with sparc macro 2008-02-24 07:19:31 +00:00
Kip Macy
7c84f79070 remove call to kdb_backtrace() 2008-02-23 21:18:13 +00:00
Kip Macy
ce86be8a6a Fix tinderbox by removing call to kdb_backtrace
MFC after: 3 days
2008-02-23 06:19:16 +00:00
Kip Macy
8e10660f12 - update firmware to 5.0
- add support for T3C
- add DDP support (zero-copy receive)
- fix TOE transmit of large requests
- fix shutdown so that sockets don't remain in CLOSING state indefinitely
- register listeners when an interface is brought up after tom is loaded
- fix setting of multicast filter
- enable link at device attach
- exit tick handler if shutdown is in progress
- add helper for logging TCB
- add sysctls for dumping transmit queues

- note that TOE wxill not be MFC'd until after 7.0 has been finalized

MFC after: 3 days
2008-02-23 01:06:17 +00:00
Poul-Henning Kamp
cf827063a9 Give MEXTADD() another argument to make both void pointers to the
free function controlable, instead of passing the KVA of the buffer
storage as the first argument.

Fix all conventional users of the API to pass the KVA of the buffer
as the first argument, to make this a no-op commit.

Likely break the only non-convetional user of the API, after informing
the relevant committer.

Update the mbuf(9) manual page, which was already out of sync on
this point.

Bump __FreeBSD_version to 800016 as there is no way to tell how
many arguments a CPP macro needs any other way.

This paves the way for giving sendfile(9) a way to wait for the
passed storage to have been accessed before returning.

This does not affect the memory layout or size of mbufs.

Parental oversight by:	sam and rwatson.

No MFC is anticipated.
2008-02-01 19:36:27 +00:00
Kip Macy
6edc218ea1 Fix loading for case where we don't overload tcp_usrreqs by calling tcp_drop directly 2008-01-27 04:39:38 +00:00
Kip Macy
a57927a1e6 fix DISABLE_MBUF_IOVEC case by initializing mbuf header completely 2008-01-27 04:37:02 +00:00
Kip Macy
9619451708 Re-enable pcpu caching by default make sysctl R/W 2008-01-19 22:47:43 +00:00
Kip Macy
8ec3680eb5 - remove bogus_imm counter
- disable pcpu cluster cache by default until reference counting is handled
  correctly for held clusters - can be re-enable by sysctl
2008-01-17 21:25:58 +00:00
Sam Leffler
eeb76a1889 promote ath_defrag to m_collapse (and retire private+unused
m_collapse from cxgb)

Reviewed by:	pyun, jhb, kmacy
MFC after:	2 weeks
2008-01-17 21:25:09 +00:00
Kip Macy
4f6a96ae5b Fix lock ordering panic by not calling ether_ioctl with port lock held
Reported by: rrs
2008-01-16 21:33:34 +00:00
Kip Macy
8030c630da remove superfluous debug printfs 2008-01-16 02:39:33 +00:00
Kip Macy
c833fdd83f Fix mbuf leak caused by freeing packet zone clusters but not their associated mbufs
- Track packet zone mbufs separately from other mbufs
- free packet zone buffers via m_free rather than trying to manage the refcount
  as with clusters - its refcount and management seems to be "special"
2008-01-16 00:28:30 +00:00
Kip Macy
2fd79ec2de put tx queue size back to 1024 2008-01-16 00:26:04 +00:00
John Baldwin
16670d1bd1 Use '%zd' to print PIO_LEN since it involves a size_t (via sizeof()) to
appease the tinderbox on 32-bit platforms.

Tested on:	amd64, i386
2008-01-15 22:01:26 +00:00
Kip Macy
139edb19d9 - Simplify mb_free_ext_fast
- increase asserts for mbuf accounting
- track outstanding mbufs (maps very closely to leaked)
- actually only create one thread per port if !multiq
    Oddly enough this fixes the use after free

- move txq_segs to stack in t3_encap
- add checks that pidx doesn't move pass cidx
- simplify mbuf free logic in collapse mbufs routine
2008-01-15 08:08:09 +00:00
Kip Macy
60f1e27625 - move WR_LEN in to cxgb_adapter.h add PIO_LEN to make intent clearer
- move cxgb_tx_common in to cxgb_multiq.c and rename to cxgb_tx
- move cxgb_tx_common dependencies
- further simplify cxgb_dequeue_packet for the non-multiqueue case
- only launch one service thread per port in the non-multiq case
- remove dead cleaning code from cxgb_sge.c
- simplify PIO case substantially in by returning directly from mbuf collapse
  and just using m_copydata
- remove gratuitous m_gethdr in the rx path
- clarify freeing of mbufs in collapse
2008-01-15 03:27:42 +00:00
Kip Macy
74aba11713 remove superfluous locking from dequeue 2008-01-15 03:21:02 +00:00
Kip Macy
8b7399ad30 - Assert that immpkt is not set
- convert %lx to 32-bit safe %jx
2008-01-14 07:55:56 +00:00
Kip Macy
efe7dfb26c - Add more extensive sanity checks
- remove initial dequeue from cxgb_start as it was causing an mbuf to be referenced twice
2008-01-14 06:00:41 +00:00