SlideShare a Scribd company logo
1 of 68
Divye Kapoor
PracheerAgarwal
Swagat Konchada
 It is the software layer in the kernel that provides a
uniform filesystem interface to userspace programs
 It provides an abstraction within the kernel that allows
for transparent working with a variety of filesystems.
 Thus it allows many different filesystem
implementations to coexist freely
 Each socket is implemented as a “file” mounted on
the sockfs filesystem.
 file->private points to the socket information.
 Inodes provide a method to access the actual
data blocks allocated to a file. For sockets, they
provide buffer space which can be used to hold
socket specific data.
 struct inode
 Every file is represented in the kernel as an
object of the file structure. It requires an inode
provided to it.
 struct file
Struct operations {
int (*read)(int, char *, int);
void (*destroy_inode)(inode *);
void (*dirty_inode) (struct inode *);
int (*write_inode) (struct inode *, int);
void (*drop_inode) (struct inode *);
void (*delete_inode) (struct inode *);
};
Sizeof(operations) = sizeof(function ptr)*6
Divye Kapoor
User Space
Socket, bind, listen, connect, send, recv, write, read etc.
Socket Functions (Kernel)
sys_socket, sys_bind, sys_listen, sys_connect etc. in socket.c
TCP/IP Layer Functions
inet_create, tcp_v4_connect, tcp_sendmsg, tcp_recvmsg
Ethernet Device Layer
dev_hard_start_xmit
Sys_socket()
Sock_create() Sock_map_fd()
Allocate a socket object
(internally an inode
Associated with a file object)
Locate the family requested and
call the create function for that
family
Inet_create()
Lower layer initialization
Sock_alloc_fd()
Allocate a file descriptor
Sock_attach_fd()
Fd_install()
Sys_connect()
Sockfd_lookup_light()
Returns the socket object
associated with the given fd
Move_addr_to_kernel()
For userspace sockaddr *
Sock->ops->connect()
Lower layer call
Tcp_v4_connect()
Socket layer functions
are elided.
Defined in <include/linux/skbuff.h>
 used by every network layer (except the physical layer)
 fields of the structure change as it is passed from one layer to another
 i.e., fields are layer dependent.
struct sk_buff {
... ... ...
#ifdef CONFIG_NET_SCHED
_ _u32 tc_index;
#ifdef CONFIG_NET_CLS_ACT
_ _u32 tc_verd;
_ _u32 tc_classid;
#endif
#endif
}
sk_buff is peppered with C preprocessor #ifdef directives.
CONFIG_NET_SCHED symbol should be defined at compile time for the
structure to have the element tc_index.
enabled with some version of make config by an administrator.
 The kernel maintains all sk_buff structures in a doubly linked list.
struct sk_buff_head {/* only the head of the list */
/*These two members must be first. */
struct sk_buff * next;
struct sk_buff * prev;
_ _u32 qlen;
spinlock_t lock;/* atomicity in accessing a sk_buff list. */
};
 Layout
 General
 Feature-specific
 Management functions
 struct sock * sk
sock data structure of the socket that owns this buffer
 unsigned int len
includes both the data in the main buffer (i.e., the one pointed to by head)
and the data in the fragments
 unsigned int data_len
unlike len, data_len accounts only for the size of the data in the fragments.
 unsigned int truesize
skb->truesize = size + sizeof(struct sk_buff);
 atomic_t users
reference count, or the number of entities using this sk_buff buffer
atomic_inc and atomic_dec
 struct sock * sk
sock data structure of the socket that owns this buffer
 unsigned int len
includes both the data in the main buffer (i.e., the one pointed to by
head) and the data in the fragments
 unsigned int data_len
unlike len, data_len accounts only for the size of the data in the fragments.
 unsigned int truesize
skb->truesize = size + sizeof(struct sk_buff);
 atomic_t users
reference count, or the number of entities using this sk_buff buffer
atomic_inc and atomic_dec
 struct sock * sk
sock data structure of the socket that owns this buffer
 unsigned int len
includes both the data in the main buffer (i.e., the one pointed to by head)
and the data in the fragments
 unsigned int data_len
unlike len, data_len accounts only for the size of the data in the
fragments.
 unsigned int truesize
skb->truesize = size + sizeof(struct sk_buff);
 atomic_t users
reference count, or the number of entities using this sk_buff buffer
atomic_inc and atomic_dec
 struct sock * sk
sock data structure of the socket that owns this buffer
 unsigned int len
includes both the data in the main buffer (i.e., the one pointed to by head)
and the data in the fragments
 unsigned int data_len
unlike len, data_len accounts only for the size of the data in the fragments.
 unsigned int truesize
skb->truesize = size + sizeof(struct sk_buff);
 atomic_t users
reference count, or the number of entities using this sk_buff buffer
atomic_inc and atomic_dec
 struct sock * sk
sock data structure of the socket that owns this buffer
 unsigned int len
includes both the data in the main buffer (i.e., the one pointed to by head)
and the data in the fragments
 unsigned int data_len
unlike len, data_len accounts only for the size of the data in the fragments.
 unsigned int truesize
skb->truesize = size + sizeof(struct sk_buff);
 atomic_t users
reference count, or the number of entities using this sk_buff buffer
atomic_inc() and atomic_dec()
• unsigned char *head
• sk_buff_data_t end
• unsigned char *data
• sk_buff_data_t tail
struct net_device *dev
 represents the receiving interface or the to be transmitted device(or
interface) corresponding to the packet.
 usually represents the virtual device’s(representation of all devices
grouped) net_device structure.
 Pointers to protocol headers.
 sk_buff_data_t transport_header;
 sk_buff_data_t network_header;
 sk_buff_data_t mac_header;
updation of data is done using the *_header pointers
 char cb[40]
 This is a "control buffer," or storage for private information, maintained
by each layer for internal use.
struct tcp_skb_cb {
... ... ... _ _u32 seq; /* Starting sequence number */
_ _u32 end_seq; /* SEQ + FIN + SYN + datalen*/
_ _u32 when; /* used to compute rtt's */
_ _u8 flags; /*TCP header flags. */
... ... ...
};
Defined in <include/linux/skbuff.h> & <net/core/skbuff.c>
skb_put(struct sk_buff *, usingned int len)
skb_push(struct sk_buff *skb, unsigned int len)
skb_pull(struct sk_buff *skb, unsigned int len)
skb_reserve(struct sk_buff *skb, int len)
Each of the above four memory management functions return the data ptr.
defined in <net/core/skbuff.c>
struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
int fclone, int node)
…
size = SKB_DATA_ALIGN(size);
data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask);
…
struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
unsigned int length, gfp_t gfp_mask)
The buffer allocation function meant for use by device drivers
Executed in interrupt mode
Freeing memory: kfree_skb and dev_kfree_skb
Release buffer back to the buffer-pool.
Buffer released only when skb_users counter is 1. If not, the counter is
decremented.
Socket layer functions
are elided.
 Defined in <include/linux/netdevice.h>
 stores all information specifically regarding a network device
 one such structure for each device, both real ones (such as Ethernet
NICs) and virtual ones
 Network devices can be classified into types such as Ethernet cards and
Token Ring cards
 Each type may come in several models.
 Model specific parameters are initialized by device driver software.
 Parameters common for different models are initiated by kernel.
struct net_device{
char name[IFNAMSIZ];
int ifindex;
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end;/* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel */
…
struct net_device{
char name[IFNAMSIZ];
int ifindex;
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end; /* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel */
…
struct net_device{
char name[IFNAMSIZ];
int ifindex;
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end;/* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel
*/
struct net_device{
char name[IFNAMSIZ];
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end;/* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel */
unsigned short flags; /* interface flags (a la BSD) */
…
struct net_device{
char name[IFNAMSIZ];
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end;/* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel */
unsigned short flags; /* interface flags (a la BSD)*/
…
struct net_device{
char name[IFNAMSIZ];
/* device name hash chain, ex: eth0 */
struct hlist_node name_hlist;
unsigned long mem_end;/* shared mem end */
unsigned long mem_start; /* shared mem start */
unsigned long base_addr; /* device I/O address */
unsigned int irq; /* device IRQ number*/
unsigned char if_port; /* Selectable AUI,TP,..*/
unsigned char dma; /* DMA channel */
unsigned short flags; /* interface flags (a la BSD)*/
/* ex : IFF_UP || IFF_RUNNING || IFF_MULTICAST */
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type */
unsigned short hard_header_len; /* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length */
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type*/
unsigned short hard_header_len; /* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length */
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type */
unsigned short hard_header_len;/* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length */
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type */
unsigned short hard_header_len; /* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length*/
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type */
unsigned short hard_header_len; /* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length */
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
unsigned mtu; /* interface MTU value */
unsigned short type; /* interface hardware type */
unsigned short hard_header_len; /* hardware hdr length */
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned char addr_len; /* hardware address length */
unsigned char broadcast[MAX_ADDR_LEN];
unsigned int promiscuity;
…
struct net_device{
…
struct net_device *next;
struct hlist_node name_hlist;
struct hlist_node index_hlist;
We don’t process the packet in the interrupt subroutine.
Netif_rx() – raise the net Rx softIRQ.
Net_rx_action() is called - start processing the packet
 Processing of packet starts with the protocol switching section
Netif_receive_skb() is called to process the packet and find out the next protocol layer.
Protocol family of the packet is extracted from the link layer header.
ip_rcv() is an entry point for IP packets processing.
Checks if the packet we have is destined for some other host (using PACKET_OTHERHOST)
Check the checksum of the packet by calling ip_fast_csum()
Call ip_route_input() , this routine checks kernel routing table rt_hash_table.
If packet needs to be forwarded input routine is ip_forward()
Otherwise ip_local_deliver()
ip_send() is called to check if the packet needs to be fragmented
If yes , fragment the packet by calling ip_fragment()
Packet output path – ip_finish_output()
ip_local_deliver() – packets need to delivered locally
ip_defrag()
Protocol identifier field skb->np.iph->protocol (in IP header).
ForTCP, we find the receive handler as tcp_v4_rcv() (entry point for theTCP layer)
_tcp_v4_lookup() – find the socket to which the packet belongs
Establised sockets are maintained in the hash table tcp_ehash.
Established socket not found – New connection request for any listening socket
Search for listening socket – tcp_v4_lookup_listener()
tcp_rcv_established()
Application read the data from the receive queue if it issues recv()
Kernel routine to read data fromTCP socket is tcp_recvmsg()
The TCP/IP Stack in the Linux Kernel

More Related Content

What's hot

Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File SystemAdrian Huang
 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDPlcplcp1
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptablesKernel TLV
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracingViller Hsiao
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionGene Chang
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...Adrian Huang
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
Understanding Open vSwitch
Understanding Open vSwitch Understanding Open vSwitch
Understanding Open vSwitch YongKi Kim
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernelAdrian Huang
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux KernelKernel TLV
 
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFBrendan Gregg
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux NetworkingPLUMgrid
 

What's hot (20)

Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
 
Hands-on ethernet driver
Hands-on ethernet driverHands-on ethernet driver
Hands-on ethernet driver
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptables
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracing
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
 
Sockets and Socket-Buffer
Sockets and Socket-BufferSockets and Socket-Buffer
Sockets and Socket-Buffer
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introduction
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Understanding Open vSwitch
Understanding Open vSwitch Understanding Open vSwitch
Understanding Open vSwitch
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
Basic Linux Internals
Basic Linux InternalsBasic Linux Internals
Basic Linux Internals
 
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPF
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 

Similar to The TCP/IP Stack in the Linux Kernel

Char Drivers And Debugging Techniques
Char Drivers And Debugging TechniquesChar Drivers And Debugging Techniques
Char Drivers And Debugging TechniquesYourHelper1
 
The TCP/IP stack in the FreeBSD kernel COSCUP 2014
The TCP/IP stack in the FreeBSD kernel COSCUP 2014The TCP/IP stack in the FreeBSD kernel COSCUP 2014
The TCP/IP stack in the FreeBSD kernel COSCUP 2014Kevin Lo
 
Multithreaded sockets c++11
Multithreaded sockets c++11Multithreaded sockets c++11
Multithreaded sockets c++11Russell Childs
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenPostgresOpen
 
Introduction to Kernel Programming
Introduction to Kernel ProgrammingIntroduction to Kernel Programming
Introduction to Kernel ProgrammingAhmed Mekkawy
 
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bfinalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bChereCheek752
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and DriversKernel TLV
 
LAS16-300: Mini Conference 2 Cortex-M Software - Device Configuration
LAS16-300: Mini Conference 2 Cortex-M Software - Device ConfigurationLAS16-300: Mini Conference 2 Cortex-M Software - Device Configuration
LAS16-300: Mini Conference 2 Cortex-M Software - Device ConfigurationLinaro
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Adrian Huang
 
Unix.system.calls
Unix.system.callsUnix.system.calls
Unix.system.callsGRajendra
 
Exploitation of counter overflows in the Linux kernel
Exploitation of counter overflows in the Linux kernelExploitation of counter overflows in the Linux kernel
Exploitation of counter overflows in the Linux kernelVitaly Nikolenko
 
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...Anne Nicolas
 
Unit 3
Unit  3Unit  3
Unit 3siddr
 
Linux Ethernet device driver
Linux Ethernet device driverLinux Ethernet device driver
Linux Ethernet device driver艾鍗科技
 

Similar to The TCP/IP Stack in the Linux Kernel (20)

Linux
LinuxLinux
Linux
 
C Assignment Help
C Assignment HelpC Assignment Help
C Assignment Help
 
Char Drivers And Debugging Techniques
Char Drivers And Debugging TechniquesChar Drivers And Debugging Techniques
Char Drivers And Debugging Techniques
 
The TCP/IP stack in the FreeBSD kernel COSCUP 2014
The TCP/IP stack in the FreeBSD kernel COSCUP 2014The TCP/IP stack in the FreeBSD kernel COSCUP 2014
The TCP/IP stack in the FreeBSD kernel COSCUP 2014
 
Multithreaded sockets c++11
Multithreaded sockets c++11Multithreaded sockets c++11
Multithreaded sockets c++11
 
Sysprog17
Sysprog17Sysprog17
Sysprog17
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
 
Embedded C - Lecture 4
Embedded C - Lecture 4Embedded C - Lecture 4
Embedded C - Lecture 4
 
Introduction to Kernel Programming
Introduction to Kernel ProgrammingIntroduction to Kernel Programming
Introduction to Kernel Programming
 
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bfinalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
 
Sysprog 16
Sysprog 16Sysprog 16
Sysprog 16
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
 
LAS16-300: Mini Conference 2 Cortex-M Software - Device Configuration
LAS16-300: Mini Conference 2 Cortex-M Software - Device ConfigurationLAS16-300: Mini Conference 2 Cortex-M Software - Device Configuration
LAS16-300: Mini Conference 2 Cortex-M Software - Device Configuration
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
 
Unix.system.calls
Unix.system.callsUnix.system.calls
Unix.system.calls
 
Exploitation of counter overflows in the Linux kernel
Exploitation of counter overflows in the Linux kernelExploitation of counter overflows in the Linux kernel
Exploitation of counter overflows in the Linux kernel
 
Linux Device Tree
Linux Device TreeLinux Device Tree
Linux Device Tree
 
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...
Kernel Recipes 2014 - What I’m forgetting when designing a new userspace inte...
 
Unit 3
Unit  3Unit  3
Unit 3
 
Linux Ethernet device driver
Linux Ethernet device driverLinux Ethernet device driver
Linux Ethernet device driver
 

More from Divye Kapoor

Tuning Flink Clusters for stability and efficiency
Tuning Flink Clusters for stability and efficiencyTuning Flink Clusters for stability and efficiency
Tuning Flink Clusters for stability and efficiencyDivye Kapoor
 
A particle filter based scheme for indoor tracking on an Android Smartphone
A particle filter based scheme for indoor tracking on an Android SmartphoneA particle filter based scheme for indoor tracking on an Android Smartphone
A particle filter based scheme for indoor tracking on an Android SmartphoneDivye Kapoor
 
The Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsThe Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsDivye Kapoor
 
Cybermania Prelims
Cybermania PrelimsCybermania Prelims
Cybermania PrelimsDivye Kapoor
 

More from Divye Kapoor (6)

Tuning Flink Clusters for stability and efficiency
Tuning Flink Clusters for stability and efficiencyTuning Flink Clusters for stability and efficiency
Tuning Flink Clusters for stability and efficiency
 
A particle filter based scheme for indoor tracking on an Android Smartphone
A particle filter based scheme for indoor tracking on an Android SmartphoneA particle filter based scheme for indoor tracking on an Android Smartphone
A particle filter based scheme for indoor tracking on an Android Smartphone
 
The Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsThe Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOs
 
Cybermania Prelims
Cybermania PrelimsCybermania Prelims
Cybermania Prelims
 
Cybermania Mains
Cybermania MainsCybermania Mains
Cybermania Mains
 
IPv6
IPv6IPv6
IPv6
 

Recently uploaded

Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 

Recently uploaded (20)

Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

The TCP/IP Stack in the Linux Kernel

  • 2.
  • 3.  It is the software layer in the kernel that provides a uniform filesystem interface to userspace programs  It provides an abstraction within the kernel that allows for transparent working with a variety of filesystems.  Thus it allows many different filesystem implementations to coexist freely  Each socket is implemented as a “file” mounted on the sockfs filesystem.  file->private points to the socket information.
  • 4.  Inodes provide a method to access the actual data blocks allocated to a file. For sockets, they provide buffer space which can be used to hold socket specific data.  struct inode  Every file is represented in the kernel as an object of the file structure. It requires an inode provided to it.  struct file
  • 5. Struct operations { int (*read)(int, char *, int); void (*destroy_inode)(inode *); void (*dirty_inode) (struct inode *); int (*write_inode) (struct inode *, int); void (*drop_inode) (struct inode *); void (*delete_inode) (struct inode *); }; Sizeof(operations) = sizeof(function ptr)*6
  • 7. User Space Socket, bind, listen, connect, send, recv, write, read etc. Socket Functions (Kernel) sys_socket, sys_bind, sys_listen, sys_connect etc. in socket.c TCP/IP Layer Functions inet_create, tcp_v4_connect, tcp_sendmsg, tcp_recvmsg Ethernet Device Layer dev_hard_start_xmit
  • 8. Sys_socket() Sock_create() Sock_map_fd() Allocate a socket object (internally an inode Associated with a file object) Locate the family requested and call the create function for that family Inet_create() Lower layer initialization Sock_alloc_fd() Allocate a file descriptor Sock_attach_fd() Fd_install()
  • 9. Sys_connect() Sockfd_lookup_light() Returns the socket object associated with the given fd Move_addr_to_kernel() For userspace sockaddr * Sock->ops->connect() Lower layer call Tcp_v4_connect()
  • 10.
  • 11.
  • 12.
  • 14.
  • 15. Defined in <include/linux/skbuff.h>  used by every network layer (except the physical layer)  fields of the structure change as it is passed from one layer to another  i.e., fields are layer dependent.
  • 16. struct sk_buff { ... ... ... #ifdef CONFIG_NET_SCHED _ _u32 tc_index; #ifdef CONFIG_NET_CLS_ACT _ _u32 tc_verd; _ _u32 tc_classid; #endif #endif } sk_buff is peppered with C preprocessor #ifdef directives. CONFIG_NET_SCHED symbol should be defined at compile time for the structure to have the element tc_index. enabled with some version of make config by an administrator.
  • 17.  The kernel maintains all sk_buff structures in a doubly linked list. struct sk_buff_head {/* only the head of the list */ /*These two members must be first. */ struct sk_buff * next; struct sk_buff * prev; _ _u32 qlen; spinlock_t lock;/* atomicity in accessing a sk_buff list. */ };
  • 18.  Layout  General  Feature-specific  Management functions
  • 19.  struct sock * sk sock data structure of the socket that owns this buffer  unsigned int len includes both the data in the main buffer (i.e., the one pointed to by head) and the data in the fragments  unsigned int data_len unlike len, data_len accounts only for the size of the data in the fragments.  unsigned int truesize skb->truesize = size + sizeof(struct sk_buff);  atomic_t users reference count, or the number of entities using this sk_buff buffer atomic_inc and atomic_dec
  • 20.  struct sock * sk sock data structure of the socket that owns this buffer  unsigned int len includes both the data in the main buffer (i.e., the one pointed to by head) and the data in the fragments  unsigned int data_len unlike len, data_len accounts only for the size of the data in the fragments.  unsigned int truesize skb->truesize = size + sizeof(struct sk_buff);  atomic_t users reference count, or the number of entities using this sk_buff buffer atomic_inc and atomic_dec
  • 21.  struct sock * sk sock data structure of the socket that owns this buffer  unsigned int len includes both the data in the main buffer (i.e., the one pointed to by head) and the data in the fragments  unsigned int data_len unlike len, data_len accounts only for the size of the data in the fragments.  unsigned int truesize skb->truesize = size + sizeof(struct sk_buff);  atomic_t users reference count, or the number of entities using this sk_buff buffer atomic_inc and atomic_dec
  • 22.  struct sock * sk sock data structure of the socket that owns this buffer  unsigned int len includes both the data in the main buffer (i.e., the one pointed to by head) and the data in the fragments  unsigned int data_len unlike len, data_len accounts only for the size of the data in the fragments.  unsigned int truesize skb->truesize = size + sizeof(struct sk_buff);  atomic_t users reference count, or the number of entities using this sk_buff buffer atomic_inc and atomic_dec
  • 23.  struct sock * sk sock data structure of the socket that owns this buffer  unsigned int len includes both the data in the main buffer (i.e., the one pointed to by head) and the data in the fragments  unsigned int data_len unlike len, data_len accounts only for the size of the data in the fragments.  unsigned int truesize skb->truesize = size + sizeof(struct sk_buff);  atomic_t users reference count, or the number of entities using this sk_buff buffer atomic_inc() and atomic_dec()
  • 24. • unsigned char *head • sk_buff_data_t end • unsigned char *data • sk_buff_data_t tail
  • 25. struct net_device *dev  represents the receiving interface or the to be transmitted device(or interface) corresponding to the packet.  usually represents the virtual device’s(representation of all devices grouped) net_device structure.  Pointers to protocol headers.  sk_buff_data_t transport_header;  sk_buff_data_t network_header;  sk_buff_data_t mac_header;
  • 26. updation of data is done using the *_header pointers
  • 27.  char cb[40]  This is a "control buffer," or storage for private information, maintained by each layer for internal use. struct tcp_skb_cb { ... ... ... _ _u32 seq; /* Starting sequence number */ _ _u32 end_seq; /* SEQ + FIN + SYN + datalen*/ _ _u32 when; /* used to compute rtt's */ _ _u8 flags; /*TCP header flags. */ ... ... ... };
  • 28. Defined in <include/linux/skbuff.h> & <net/core/skbuff.c> skb_put(struct sk_buff *, usingned int len)
  • 29. skb_push(struct sk_buff *skb, unsigned int len)
  • 30. skb_pull(struct sk_buff *skb, unsigned int len)
  • 31. skb_reserve(struct sk_buff *skb, int len) Each of the above four memory management functions return the data ptr.
  • 32. defined in <net/core/skbuff.c> struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, int fclone, int node) … size = SKB_DATA_ALIGN(size); data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask); …
  • 33. struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int length, gfp_t gfp_mask) The buffer allocation function meant for use by device drivers Executed in interrupt mode Freeing memory: kfree_skb and dev_kfree_skb Release buffer back to the buffer-pool. Buffer released only when skb_users counter is 1. If not, the counter is decremented.
  • 34.
  • 35.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.  Defined in <include/linux/netdevice.h>  stores all information specifically regarding a network device  one such structure for each device, both real ones (such as Ethernet NICs) and virtual ones  Network devices can be classified into types such as Ethernet cards and Token Ring cards  Each type may come in several models.  Model specific parameters are initialized by device driver software.  Parameters common for different models are initiated by kernel.
  • 47. struct net_device{ char name[IFNAMSIZ]; int ifindex; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end;/* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */ …
  • 48. struct net_device{ char name[IFNAMSIZ]; int ifindex; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end; /* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */ …
  • 49. struct net_device{ char name[IFNAMSIZ]; int ifindex; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end;/* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */
  • 50. struct net_device{ char name[IFNAMSIZ]; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end;/* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */ unsigned short flags; /* interface flags (a la BSD) */ …
  • 51. struct net_device{ char name[IFNAMSIZ]; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end;/* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */ unsigned short flags; /* interface flags (a la BSD)*/ …
  • 52. struct net_device{ char name[IFNAMSIZ]; /* device name hash chain, ex: eth0 */ struct hlist_node name_hlist; unsigned long mem_end;/* shared mem end */ unsigned long mem_start; /* shared mem start */ unsigned long base_addr; /* device I/O address */ unsigned int irq; /* device IRQ number*/ unsigned char if_port; /* Selectable AUI,TP,..*/ unsigned char dma; /* DMA channel */ unsigned short flags; /* interface flags (a la BSD)*/ /* ex : IFF_UP || IFF_RUNNING || IFF_MULTICAST */
  • 53. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type */ unsigned short hard_header_len; /* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length */ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 54. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type*/ unsigned short hard_header_len; /* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length */ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 55. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type */ unsigned short hard_header_len;/* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length */ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 56. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type */ unsigned short hard_header_len; /* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length*/ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 57. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type */ unsigned short hard_header_len; /* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length */ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 58. struct net_device{ … unsigned mtu; /* interface MTU value */ unsigned short type; /* interface hardware type */ unsigned short hard_header_len; /* hardware hdr length */ unsigned char dev_addr[MAX_ADDR_LEN]; unsigned char addr_len; /* hardware address length */ unsigned char broadcast[MAX_ADDR_LEN]; unsigned int promiscuity; …
  • 59. struct net_device{ … struct net_device *next; struct hlist_node name_hlist; struct hlist_node index_hlist;
  • 60.
  • 61. We don’t process the packet in the interrupt subroutine. Netif_rx() – raise the net Rx softIRQ. Net_rx_action() is called - start processing the packet  Processing of packet starts with the protocol switching section
  • 62. Netif_receive_skb() is called to process the packet and find out the next protocol layer. Protocol family of the packet is extracted from the link layer header.
  • 63. ip_rcv() is an entry point for IP packets processing. Checks if the packet we have is destined for some other host (using PACKET_OTHERHOST) Check the checksum of the packet by calling ip_fast_csum()
  • 64. Call ip_route_input() , this routine checks kernel routing table rt_hash_table. If packet needs to be forwarded input routine is ip_forward() Otherwise ip_local_deliver() ip_send() is called to check if the packet needs to be fragmented If yes , fragment the packet by calling ip_fragment() Packet output path – ip_finish_output() ip_local_deliver() – packets need to delivered locally
  • 65. ip_defrag() Protocol identifier field skb->np.iph->protocol (in IP header). ForTCP, we find the receive handler as tcp_v4_rcv() (entry point for theTCP layer)
  • 66. _tcp_v4_lookup() – find the socket to which the packet belongs Establised sockets are maintained in the hash table tcp_ehash. Established socket not found – New connection request for any listening socket Search for listening socket – tcp_v4_lookup_listener() tcp_rcv_established()
  • 67. Application read the data from the receive queue if it issues recv() Kernel routine to read data fromTCP socket is tcp_recvmsg()

Editor's Notes

  1. Req_irq and free_irq
  2. Req_irq and free_irq
  3. Req_irq and free_irq
  4. Req_irq and free_irq