Goals
- conceptual and implementation aspects of application-layer protocols
- learn about protocols by examining popular application-level protocols
- programming network applications
Principles of Network Applications
Architectures
- client-server architecture
- server – always-on
- no communication between clients
- server’s IP is known
- P2P
- self-scalability
- BitTorrent
Process
- Processes on two different end systems communicate with each other by exchanging messages across the computer network.
- In the context of a communication session between a pair of processes, the process that initiates the communication (that is, initially contacts the other process at the beginning of the session) is labeled as the client. The process that waits to be contacted to begin the session is the server.
- C/S → C and S are both processes
- P2P: a peer acts as client or server depending on the direction of the transfer
- Socket
- a low-level software interface for network communication rather than a protocol; often described as the API between the application and the network
- Interface between the process and the computer network
- Host identified by IP address
- Process identified by port number
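The client/server roles and the (IP address, port number) addressing described above can be sketched with Python's standard `socket` module. This is only a minimal illustration: the loopback address, the OS-chosen port, and the function names `run_echo_server` and `echo_once` are assumptions made for the demo, not part of the notes.

```python
import socket
import threading


def run_echo_server(server_sock):
    """Server role: wait to be contacted, then echo one message back."""
    conn, _client_addr = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def echo_once(message):
    """Send `message` from a client socket to a local echo server."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # The server process is addressed by (IP address, port number).
        # Port 0 asks the OS for any free port; 127.0.0.1 keeps the demo local.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        host, port = server_sock.getsockname()
        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()
        # Client role: initiate contact using the server's IP and port.
        with socket.create_connection((host, port)) as client_sock:
            client_sock.sendall(message)
            reply = client_sock.recv(1024)
        worker.join()
    return reply


reply = echo_once(b"hello")
```

Both ends are ordinary processes (here, threads of one process) that talk only through their sockets; which one is the "client" is decided purely by who initiates the connection.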
Transport Services for Application
4 Factors
- Reliable
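- packets may be lost in transit
- guaranteed data delivery service
- reliable does not mean secure: it guarantees complete delivery, but says nothing about whether the data was eavesdropped on or modified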
- Throughput
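- available throughput fluctuates because other sessions share the bandwidth along the network path
- bandwidth-sensitive applications require a guaranteed minimum throughput
- the more, the better
- Timing
- timing guarantees
- appealing to interactive real-time applications
- bounds on delay
- Security
- the transport-layer protocol encrypts data before sending and decrypts it before delivery to the receiving process, so the data is not exposed in transit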
Transport Services Provided by the Internet
UDP vs. TCP
- TCP
- connection-oriented service
- reliable data transfer service
- congestion-control mechanism
- with TLS to provide security services
- UDP
- connectionless
- unreliable data transfer service
- no congestion-control mechanism
- Application – Application-Layer Protocol – Underlying Transport Protocol
Web and HTTP
- HTTP – HyperText Transfer Protocol
- URL form: protocol://host/path
- uses TCP
- the browser and the server processes access TCP through their socket interfaces
- stateless: the server keeps no record of past requests
Connection
- client initiates TCP connection (creates socket) to server, port 80
- server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed
Non-persistent Connection
- HTTP 1.0
- at most one object sent over TCP connection
- open a TCP connection, send one object, then immediately close the connection
- 2 RTTs per object
- parallel TCP connections to fetch referenced objects
- response time
- 1 RTT to initiate connection
- 1 RTT for HTTP request and response
- also file transmission time – the time fetching file from server
Persistent Connection
- HTTP 1.1 – persistent with pipelining
- keep connection open after sending response
- subsequent HTTP messages between same client/server sent over this open connection
- 1 RTT per request, since no new connection setup is needed
- one after another
- a connection is closed after it has been unused for some time
Pipelining
- persistent without pipelining
- 1 RTT for 1 object
- send the next request only after the previous response has been received
- persistent with pipelining
- HTTP 1.1
- 1 RTT for all referenced objects
- send each request as soon as the reference is encountered, without waiting for earlier responses
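The RTT counts above can be compared with a small calculation. This is a simplified sketch under stated assumptions: one base HTML file plus `n_objects` referenced objects, every object taking the same `transmit` time, no parallel connections, and all times in milliseconds. The function names are illustrative only.

```python
def non_persistent(rtt, n_objects, transmit):
    """Each of the 1 + n objects pays 2 RTTs (connect + request) plus transmission."""
    return (1 + n_objects) * (2 * rtt + transmit)


def persistent_no_pipelining(rtt, n_objects, transmit):
    """One connection setup, then 1 RTT per object, requested one after another."""
    base_page = 2 * rtt + transmit
    return base_page + n_objects * (rtt + transmit)


def persistent_pipelined(rtt, n_objects, transmit):
    """One connection setup, then about 1 RTT for all referenced objects together."""
    base_page = 2 * rtt + transmit
    return base_page + rtt + n_objects * transmit


# Example: RTT = 100 ms, 10 referenced objects, 5 ms transmission time each.
times = (
    non_persistent(100, 10, 5),
    persistent_no_pipelining(100, 10, 5),
    persistent_pipelined(100, 10, 5),
)
```

With these numbers the three variants take 2255 ms, 1255 ms and 355 ms respectively, which shows how much of the delay is connection setup and per-object round trips rather than data transfer.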
HTTP Message Format
- ASCII
- request
- POST and GET
- HTTP 1.0 → GET, POST, HEAD
- HTTP 1.1 → GET, POST, HEAD, PUT, DELETE
- response
- status code
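Because HTTP messages are plain ASCII, a request can be assembled and a response status line taken apart with ordinary string handling. The sketch below assumes a hypothetical host `www.example.com` and helper names `build_request` and `parse_status_line`; it illustrates the message layout only and is not a full HTTP parser.

```python
def build_request(method, path, host):
    """Build a minimal HTTP/1.1 request: request line, header lines, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, URL path, version
        f"Host: {host}",              # header lines follow, one per line
        "Connection: close",
        "",                           # empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into (version, status code, phrase)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, phrase = status_line.split(" ", 2)
    return version, int(code), phrase


request = build_request("GET", "/index.html", "www.example.com")
status = parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
```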
Cookies
- User-Server Interaction
- serve content as a function of the user identity
- maintain some state between transactions
- header in response: set-cookie
- header in request: cookie
- four components
- cookie header line of HTTP response message
- cookie header line in next HTTP request message
- cookie file kept on user’s host, managed by user’s browser
- back-end database at Web site
- can be used for
- authorization
- shopping carts
- recommendations
- user session state
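The four cookie components listed above can be tied together in a toy simulation. Everything here is an assumption for illustration: the `CookieSite` class, the starting ID, and the use of plain dictionaries for headers, for the browser's cookie file, and for the back-end database.

```python
import itertools


class CookieSite:
    """Toy Web site that recognizes returning users by a cookie ID."""

    def __init__(self):
        self._next_id = itertools.count(1678)  # arbitrary starting ID for the demo
        self.database = {}                     # back-end database: cookie ID -> visit count

    def handle(self, request_headers):
        cookie_id = request_headers.get("Cookie")
        response_headers = {}
        if cookie_id not in self.database:
            # Unknown user: create an ID and send it in a Set-Cookie header.
            cookie_id = str(next(self._next_id))
            self.database[cookie_id] = 0
            response_headers["Set-Cookie"] = cookie_id
        self.database[cookie_id] += 1
        return response_headers, f"visit {self.database[cookie_id]}"


site = CookieSite()
cookie_file = {}  # cookie file kept on the user's host, managed by the browser

# First request carries no cookie; the response sets one.
first_headers, first_body = site.handle({})
cookie_file["site"] = first_headers["Set-Cookie"]

# The next request sends the stored cookie back in a Cookie header.
second_headers, second_body = site.handle({"Cookie": cookie_file["site"]})
```

HTTP itself stays stateless here: the state lives in the site's database and is found again only because the browser returns the ID with each request.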
Web Caching
== Proxy server
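the cache is a closer “station” than the origin server
- browser sends all HTTP requests to the cache, which acts as an intermediary
- if object in cache → directly return
- if not → cache requests it from origin server and then return it from cache
- acts as both client and server: server toward the requesting browser, client toward the origin server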
- reduce response time for client request
- reduce traffic
- Conditional GET
- Goal: don’t send object if browser has up-to-date cached version
- no object transmission delay: if the cached copy is up to date, it is returned directly, avoiding fetching the file from the server again and saving the file transmission time
- If-Modified-Since header in request
- if up-to-date: status code 304 Not Modified in response
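The conditional GET exchange can be sketched as a function that plays the origin server. The function name `origin_server`, the dictionary-based headers and the example timestamps are assumptions for illustration; the date helpers come from Python's standard `email.utils` module.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def origin_server(headers, last_modified, body):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since."""
    since = headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, {}, b""  # Not Modified: the object itself is not sent
    stamp = format_datetime(last_modified, usegmt=True)
    return 200, {"Last-Modified": stamp}, body


page = b"<html>v1</html>"
modified = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)

# First fetch: the cache has nothing yet, so the full object comes back.
status1, headers1, body1 = origin_server({}, modified, page)

# Later: the cache revalidates with the date it remembered.
conditional = {"If-Modified-Since": headers1["Last-Modified"]}
status2, _, body2 = origin_server(conditional, modified, page)

# If the object changed in the meantime, the server sends it again.
newer = datetime(2015, 9, 10, 8, 0, 0, tzinfo=timezone.utc)
status3, _, body3 = origin_server(conditional, newer, b"<html>v2</html>")
```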
- HTTP 2
- decreased delay in multi-object HTTP requests
- increased flexibility at server in sending objects to client
- transmission order of requested objects based on client-specified object priority (not necessarily first come, first served); for example, small objects can be sent first and large ones last to reduce total delay
- divide objects into frames, schedule frames to mitigate HOL (head-of-line) blocking
- allowing resources to be loaded in parallel
- HTTP 3
- adds security
- congestion control over UDP
E-mail address: xxxx@xxxx.xxxx (the part before the @ is the user name, the part after it is the mail server's domain name)
3 major components:
- user agents
- mail servers
- simple mail transfer protocol: SMTP
SMTP
transfers messages from senders’ mail servers to the recipients’ mail servers
message → mail server → mail server → receive
- restricts messages to 7-bit ASCII: only such data can be transferred directly, anything else must be encoded first
- port 25
- uses TCP
- push protocol
- uses persistent connections
- user agent → mail server & mail server → mail server
Mail Message Formats
- header
- To
- From
- Subject (optional) – description of a mail
- body: the message, ASCII characters only
- MIME (Multipurpose Internet Mail Extensions), for multimedia content: additional lines in the message header declare the MIME content type; binary data is encoded as ASCII
- Content-Transfer-Encoding: method used to encode data
- Content-Type: declare content type and subtype, e.g. image/jpeg
- MIME-Version (optional)
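The header lines and the binary-to-ASCII encoding can be seen with Python's standard `email.message.EmailMessage`. The addresses, the subject and the six placeholder bytes standing in for a JPEG are made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Picture"
msg.set_content("See the attached image.")

# Placeholder bytes standing in for real image data (not a valid JPEG).
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])
msg.add_attachment(fake_jpeg, maintype="image", subtype="jpeg", filename="photo.jpg")

attachment = next(msg.iter_attachments())
content_type = attachment.get_content_type()            # declared type/subtype
encoding = attachment["Content-Transfer-Encoding"]      # how the bytes were encoded
wire_bytes = msg.as_bytes()                             # what would be handed to SMTP
```

Printing `wire_bytes` shows the `Content-Type` and `Content-Transfer-Encoding` lines in front of the encoded attachment, and every byte of the serialized message is 7-bit ASCII even though the attachment itself is binary.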
Mail Access Protocols
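- sending (uploading) mail
- may use HTTP, e.g. Web-based mail in the browser
- may use SMTP, e.g. a mail client application
- retrieving (downloading) mail (authorization & download)
- POP3 – Post Office Protocol: stateless across sessions; in download-and-delete mode, mail cannot be re-read from a different client
- IMAP – Internet Mail Access Protocol
- more complex, with more features
- keeps user state across sessions (a stateful protocol)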
- Microsoft Outlook
- HTTP (gmail, Hotmail)
DNS
== domain name system
hostname → IP
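A domain usually has several primary name servers that synchronize with one another to provide better availability and fault tolerance. Together, these servers share the task of answering DNS queries for the domain.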
Service
- hostname to IP address translation
- host aliasing
- mail server aliasing
- load distribution
Structure
- distributed database
- in a hierarchy
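- root DNS servers: no lookup is needed to find them; their addresses are stored permanently in caching (local) name servers and are never evicted
- TLD (top-level domain) name servers (.com .cn …)
- authoritative DNS servers (google.com …)
- local DNS name servers do not strictly belong to the hierarchy
- resolution
- iterated: the querying server contacts each server in turn
- recursive: each contacted server resolves the query on behalf of the requester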
- cache
- improve response time
- automatically disappear after TTL – time to live
- if a host's IP address changes, the change may not be known everywhere until the TTLs of all cached copies have expired
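The caching behavior, including the stale-address problem, can be sketched with a small TTL cache. The class name `DnsCache`, the explicit `now` parameter (seconds on an arbitrary clock) and the example addresses are assumptions chosen to keep the example deterministic.

```python
class DnsCache:
    """Toy hostname -> IP cache whose entries disappear after their TTL."""

    def __init__(self):
        self._entries = {}  # hostname -> (ip, expiry time)

    def put(self, hostname, ip, ttl, now):
        self._entries[hostname] = (ip, now + ttl)

    def get(self, hostname, now):
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if now >= expires_at:
            del self._entries[hostname]  # TTL expired: forget the mapping
            return None
        return ip


cache = DnsCache()
cache.put("www.example.com", "192.0.2.10", ttl=60, now=0)

# Suppose the real address changes at t = 10: the cache still answers
# with the old one until the entry expires at t = 60.
answer_at_30 = cache.get("www.example.com", now=30)
answer_at_60 = cache.get("www.example.com", now=60)
```

A `None` result stands for a cache miss, after which a real resolver would query the hierarchy again and learn the new address.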
DNS Record
- resource records – RRs: (Name, Value, Type, TTL)
- type
Type | Name | Value |
---|---|---|
A | hostname | IP |
NS | domain | host-name of an authoritative DNS server that knows how to obtain the IP addresses for hosts in the domain |
CNAME | hostname | canonical name |
MX | domain | canonical name of a mail server |
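How the record types in the table work together can be shown with a toy record list. The `RECORDS` data, the `example.com` names, the documentation-range addresses and the helper names are all invented for illustration; a real resolver queries name servers instead of a local list.

```python
# Each resource record is (Name, Value, Type, TTL).
RECORDS = [
    ("example.com", "dns.example.com", "NS", 86400),
    ("dns.example.com", "192.0.2.53", "A", 86400),
    ("www.example.com", "server1.example.com", "CNAME", 3600),
    ("server1.example.com", "192.0.2.10", "A", 3600),
    ("example.com", "mail.example.com", "MX", 3600),
    ("mail.example.com", "192.0.2.25", "A", 3600),
]


def lookup(name, rtype):
    """Return the values of all records matching the name and type."""
    return [value for (n, value, t, _ttl) in RECORDS if n == name and t == rtype]


def resolve(hostname, max_aliases=8):
    """Follow CNAME aliases until an A record is found; None if there is none."""
    for _ in range(max_aliases):
        addresses = lookup(hostname, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(hostname, "CNAME")
        if not aliases:
            return None
        hostname = aliases[0]  # continue with the canonical name
    return None


web_ip = resolve("www.example.com")                      # alias -> canonical -> IP
mail_ip = resolve(lookup("example.com", "MX")[0])        # MX -> mail server -> IP
```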
DNS Message
both query and reply messages have the same format
Example
P2P
- self-scalability
- no always-on server
- end systems communicate directly
- peers are intermittently connected and change IP addresses
- BitTorrent, KanKan, Skype
File Distribution for C-S
the whole process consists of the server uploading and the clients downloading, each of which takes time
the total distribution time is the greater of the two, and it increases linearly with N (the number of clients receiving the file)
File Distribution for P2P
each peer can assist the server in distributing the file
when a peer receives some file data, it can use its own upload capacity to redistribute the data to other peers
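The server must still upload the file and every client must still download it, but each peer also uploads part of the data, so the distribution time becomes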
$$
D_{P2P} = \max\left\{\frac{F}{u_s}, \frac{F}{d_{min}}, \frac{NF}{u_s + \sum_{i=1}^{N} u_i}\right\}
$$
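$u_s$ – server upload rate
$u_i$ – upload rate of peer $i$
$d_{min}$ – download rate of the slowest peer
F – size of the file
N – number of peers receiving the file
Since $\sum u_i$ also grows linearly with N, D grows only slowly as N increases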
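The two distribution times can be compared numerically. The sketch below takes N as the number of peers and uses the standard client-server bound $\max\{NF/u_s, F/d_{min}\}$ alongside the P2P formula above; the function names and the example figures (file size in Mbit, rates in Mbps) are illustrative assumptions.

```python
def d_cs(file_size, u_s, d_min, n_peers):
    """Client-server: the server uploads N copies; the slowest client downloads one."""
    return max(n_peers * file_size / u_s, file_size / d_min)


def d_p2p(file_size, u_s, d_min, peer_uploads):
    """P2P: the N copies are delivered using server plus peer upload capacity."""
    n_peers = len(peer_uploads)
    total_upload = u_s + sum(peer_uploads)
    return max(file_size / u_s, file_size / d_min, n_peers * file_size / total_upload)


# Example: 1000 Mbit file, server uploads at 100 Mbps, slowest download 50 Mbps,
# 10 peers uploading at 20 Mbps each.
cs_time = d_cs(1000, 100, 50, 10)
p2p_time = d_p2p(1000, 100, 50, [20] * 10)
```

With these figures the client-server time is 100 s while the P2P time is about 33 s. Doubling the number of peers doubles the first but raises the second only modestly, because the added peers bring their own upload capacity.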
Searching for Information
Index – maps information to peer locations
- file sharing
- tracks the locations of files
- peers index what they have
- search index to determine where the specific files can be found
- instant messaging
- maps user names to locations
- inform index of users’ location
- search index to determine IP address of user
Napster → Centralized index → downsides:
- single point of failure
- performance bottleneck
- copyright infringement
So, file transfer is decentralized, but locating content is highly centralized
Query Flooding Used By Gnutella
Each peer indexes the files it makes available for sharing
- how to join?
- use list of candidate peers to find another peer
- attempts TCP connections with candidate peers
- sends a Ping message to the first peer it connects to; that peer forwards the Ping to the other peers it knows, and each peer that receives it sends a response (Pong) back to the joining peer
- the joining peer can then set up many more TCP connections
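The forwarding of the Ping through the overlay can be sketched as a breadth-first flood. The overlay graph, the peer names and the `hop_limit` parameter are assumptions for illustration; the hop limit stands in for the limited-scope flooding used to keep the message from circulating forever.

```python
from collections import deque


def flood_ping(overlay, start, hop_limit):
    """Return the peers that receive (and would answer) a Ping flooded from `start`."""
    reached = {start}
    frontier = deque([(start, 0)])
    while frontier:
        peer, hops = frontier.popleft()
        if hops == hop_limit:
            continue  # hop budget used up: this peer does not forward further
        for neighbor in overlay[peer]:
            if neighbor not in reached:  # each peer handles the Ping only once
                reached.add(neighbor)
                frontier.append((neighbor, hops + 1))
    reached.discard(start)
    return reached


# Hypothetical overlay: the joining peer A has connected only to B so far.
overlay = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "E"],
    "D": ["B"],
    "E": ["C"],
}
responders = flood_ping(overlay, "A", hop_limit=2)
```

Every peer in `responders` is a candidate for an additional TCP connection from A.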
BitTorrent
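A peer registers with the tracker to get a list of peers when joining a torrent
while downloading, a peer uploads chunks to other peers
once it has received the entire file, it may leave or stay
Chunk exchange follows its own set of rules:
- request
- periodically asks each neighbor for the list of chunks it holds
- rarest first
- send
- sends chunks to the 4 peers that are currently sending to it at the highest rate, re-evaluating every 10 seconds
- every 30 seconds, picks one additional peer at random and sends chunks to it, in order to find better transmission partners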
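The request and send rules can be sketched as two selection functions. This is an illustration only: the function names, the data layout (sets of chunk indices, a dict of measured rates) and the tie-breaking rule are assumptions, and the 10-second and 30-second timers are left out.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors (None if nothing to request)."""
    counts = Counter(
        chunk
        for chunks in neighbor_chunks.values()
        for chunk in chunks
        if chunk not in my_chunks
    )
    if not counts:
        return None
    return min(counts, key=lambda c: (counts[c], c))  # ties broken by chunk index


def choose_receivers(rates_to_me, rng, top_n=4):
    """Top-N peers by the rate they send to us, plus one random other peer."""
    ranked = sorted(rates_to_me, key=rates_to_me.get, reverse=True)
    top = ranked[:top_n]
    others = ranked[top_n:]
    random_pick = rng.choice(others) if others else None
    return top, random_pick


next_chunk = rarest_first(
    my_chunks={0},
    neighbor_chunks={"p1": {0, 1, 2}, "p2": {1, 2}, "p3": {2, 3}},
)
top4, extra = choose_receivers(
    {"a": 10, "b": 50, "c": 30, "d": 20, "e": 40, "f": 5},
    rng=random.Random(0),
)
```

Here chunk 3 is requested first because only one neighbor holds it, and the randomly chosen extra peer gets a chance to prove itself a faster partner than one of the current top four.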
Video Streaming and Content Distribution Networks
biggest challenge: heterogeneity, i.e. network conditions differ widely from user to user
solution: distributed, application-level infrastructure
DASH
Dynamic, Adaptive Streaming over HTTP
used for streaming
- plain HTTP streaming
- drawback: the available bandwidth varies widely across clients, yet every client receives the same encoding of the video
- establishes a TCP connection
- HTTP GET request and HTTP response
- bytes are collected in a client application buffer
- streaming video application periodically grabs video frames from the client application buffer, decompresses them and displays them
- DASH
- server
- divides video file into multiple chunks
- each chunk encoded at multiple different rates
- different rate encodings stored in different files
- files replicated in various CDN nodes
- client
- periodically estimates server-to-client bandwidth
- requests one chunk at a time with HTTP GET
- chooses the maximum coding rate sustainable given current bandwidth (high bandwidth → high-rate version)
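The client-side decision in DASH can be sketched as picking, for each chunk, the highest encoding rate that fits the current bandwidth estimate. The rate list, the fallback to the lowest rate, and the URL naming scheme in `chunk_url` are hypothetical choices for illustration, not something DASH prescribes.

```python
def choose_rate(available_rates_kbps, estimated_bandwidth_kbps):
    """Highest encoding rate not exceeding the bandwidth estimate."""
    sustainable = [r for r in available_rates_kbps if r <= estimated_bandwidth_kbps]
    if not sustainable:
        return min(available_rates_kbps)  # nothing fits: fall back to the lowest rate
    return max(sustainable)


def chunk_url(base_url, rate_kbps, index):
    """Hypothetical naming scheme: one file per (rate, chunk index)."""
    return f"{base_url}/{rate_kbps}kbps/chunk{index}.mp4"


rates = [300, 1000, 3000]
# Bandwidth estimates measured before each of three consecutive chunks.
estimates = [1500, 5000, 100]
requests = [
    chunk_url("http://cdn.example.com/video", choose_rate(rates, bw), i)
    for i, bw in enumerate(estimates)
]
```

Each entry in `requests` is the URL the client would fetch with HTTP GET, so the quality can change from one chunk to the next as the estimate moves.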
CDN
stores multiple copies of videos at multiple geographically distributed sites
nodes are distributed around the world
uses a cluster selection strategy to choose which node (cluster) worldwide serves a client
DNS & CDN:
Example:
Notice
Answer: B. At a given instant of time, a peer A may upload to a peer B, even if peer B is not sending anything to peer A.
In option C, the number should be 4.