Lec2 Application Layer
Overview
goals:
- conceptual and implementation aspects of application-layer protocols
- transport-layer service models
- client-server
- peer-to-peer paradigm
- Learn about protocols by examining popular application-layer protocols
- HTTP
- SMTP, IMAP
- DNS
- programming network applications
- socket API
Summary:
- application architectures
- client-server
- P2P
- application service requirements:
- reliability, bandwidth, delay
- Internet transport service model
- connection-oriented, reliable: TCP
- unreliable, datagrams: UDP
- specific protocols:
- HTTP
- SMTP, IMAP
- DNS
- P2P: BitTorrent
- socket programming
- TCP, UDP sockets
- Typical request/reply message exchange:
- client requests info or service
- server responds with data, status code
- message formats:
- headers: fields giving info about data
- data: info (payload) being communicated
- important themes:
- centralized vs. decentralized
- stateless vs. stateful
- scalability
- reliable vs. unreliable message transfer
- “complexity at network edge”
Key points
- server
- client
- addressing process
- peer-to-peer architecture
- identifier
- inter-process communication
- network protocol interface
- socket (not yet mastered)
Some network apps
Creating a network app
write programs that:
- run on (different) end systems
- communicate over network
- e.g., web server software communicates with browser software
no need to write software for network-core devices
- network-core devices do not run user applications
- applications on end systems allow for rapid app development, propagation
Client-server architecture
server:
- always-on host
- permanent IP address
- data centers for scaling
clients:
- communicate with server
- may be intermittently connected
- may have dynamic IP addresses
- do not communicate directly with each other
Peer-to-peer (P2P) architecture
- no always-on server (traffic does not have to go through a server)
- arbitrary end systems directly communicate
- peers request service from other peers, provide service in return to other peers
- self scalability: new peers bring new service capacity, as well as new service demands
- peers are intermittently connected and change IP addresses
- complex management
Processes communicating
process: program running within a host
within same host, two processes communicate using inter-process communication (defined by OS)
processes in different hosts communicate by exchanging messages
clients, servers
- client process: process that initiates communication
- server process: process that waits to be contacted
note: applications with P2P architectures have client processes & server processes
Sockets (the interface to network protocols)
- process sends/receives messages to/from its socket
- socket analogous to door
- sending process shoves message out door
- sending process relies on transport infrastructure on other side of door to deliver message to socket at receiving process
- two sockets involved: one on each side
Addressing processes
- to receive messages, process must have identifier
- host device has unique 32-bit IP address
- Q: does IP address of host on which process runs suffice for identifying the process?
- A: no, many processes can be running on same host
- identifier includes both IP address and port number associated with process on host.
- example port numbers:
- HTTP server: 80
- mail server: 25
- to send HTTP message to gaia.cs.umass.edu web server:
- IP address: 128.119.245.12
- port number: 80
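A minimal sketch of why the IP address alone is not enough, assuming two UDP sockets bound on the loopback address (the names `sock_a` and `sock_b` are illustrative). Asking for port 0 lets the OS choose any free port, so the two endpoints share an IP address and differ only in port number.

```python
import socket

# Two sockets on the same host: same IP address, different port numbers.
sock_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock_a.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
sock_b.bind(("127.0.0.1", 0))

# The identifier of each endpoint is the pair (IP address, port number).
addr_a = sock_a.getsockname()
addr_b = sock_b.getsockname()

sock_a.close()
sock_b.close()
```

Both identifiers start with `127.0.0.1`, so only the port number tells the two processes' sockets apart.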
An application-layer protocol defines
- types of messages exchanged,
- e.g., request, response
- message syntax:
- what fields in messages & how fields are delineated
- message semantics:
- meaning of information in fields
- rules for when and how processes send & respond to messages
open protocols:
- defined in RFCs, everyone has access to protocol definition
- allows for interoperability
- e.g., HTTP, SMTP
proprietary protocols:
- e.g., Skype
What transport service does an app need?
data integrity (reliable data transfer)
- some apps (e.g., file transfer, web transactions) require 100% reliable data transfer
- other apps (e.g., audio) can tolerate some loss
timing
- some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
throughput
- some apps (e.g., multimedia) require minimum amount of throughput to be “effective”
- other apps (“elastic apps”) make use of whatever throughput they get
security
- encryption, data integrity, …
transport service requirements: common apps
Internet transport protocols service
Securing TCP
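security protocol
TCP service
The TCP service model includes a connection-oriented service and a reliable data transfer service. When an application uses TCP as its transport protocol, it receives both of these services from TCP.
- connection-oriented service: before application-layer messages begin to flow, TCP has the client and server exchange transport-layer control information with each other
- reliable data transfer service: the communicating processes can rely on TCP to deliver all data sent without error and in the proper order
- TCP congestion control: throttles the sending process when the network is congested
- TCP does not provide timing (delay) or minimum bandwidth (throughput) guarantees
UDP service
- lightweight protocol that offers no unnecessary services; provides minimal service
- UDP is connectionless: there is no handshaking before processes communicate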
Web and HTTP
First, a quick review…
- web page consists of objects, each of which can be stored on different Web servers
- object can be HTML file, JPEG image, Java applet, audio file, …
- web page consists of base HTML-file which includes several referenced objects, each addressable by a URL, e.g.,
HTTP overview
HTTP: hypertext transfer protocol
Web’s application layer protocol
client/server model:
- client: browser that requests, receives, (using HTTP protocol) and “displays” Web objects
- server: Web server sends (using HTTP protocol) objects in response to requests
- HTTP uses TCP:
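- client initiates TCP connection (creates socket) to server, port 80
- server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed
- HTTP is “stateless”
- server maintains no information about past client requests
- protocols that maintain “state” are complex!
- past history (state) must be maintained
- if server/client crashes, their views of “state” may be inconsistent, must be reconciled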
HTTP connections: two types
Non-persistent HTTP
- TCP connection opened
- at most one object sent over TCP connection
- TCP connection closed
- each TCP connection is closed after the server sends one object; it does not persist for other objects
- downloading multiple objects requires multiple connections
- Example
- response time
Persistent HTTP
- TCP connection opened to a server
- multiple objects can be sent over single TCP connection between client and that server
- TCP connection closed
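A rough sketch comparing response times for the two connection types. It assumes objects are fetched one after another, that a non-persistent fetch costs two round-trip times (one to open the TCP connection, one for the request and the first bytes of the response) plus the transmission time, and that a persistent connection pays the connection-setup round trip only once. The function and parameter names are illustrative.

```python
def non_persistent_time(rtt, transmit, n_objects):
    """Each object: 1 RTT to open TCP + 1 RTT for request/response + transmission."""
    return n_objects * (2 * rtt + transmit)


def persistent_time(rtt, transmit, n_objects):
    """One RTT to open the connection, then 1 RTT + transmission per object."""
    return rtt + n_objects * (rtt + transmit)


# Base HTML file plus 10 referenced objects, RTT = 100 ms, 10 ms to transmit each.
t_non_persistent = non_persistent_time(0.1, 0.01, 11)
t_persistent = persistent_time(0.1, 0.01, 11)
```

Under these assumptions the persistent connection saves one round trip per object after the first; parallel connections and pipelining are not modelled.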
HTTP request messages
- two types of HTTP messages: request, response
- HTTP request message:
- ASCII (human-readable format)
The method field can be GET, POST, HEAD, PUT, or DELETE; the great majority of HTTP request messages use GET.
The Host header line specifies the host on which the object resides.
- general format
With the GET method the entity body is empty; with POST the entity body carries the submitted form data.
A request generated from a form does not have to use the POST method, however.
- Other HTTP request messages
- POST method:
- web page often includes form input
- user input sent from client to server in entity body of HTTP POST request message
- HEAD method:
- requests headers that would be returned if specified URL were requested with an HTTP GET method
- GET method (for sending data to server):
- include user data in URL field of HTTP GET request message (following a ‘?’)
- PUT method:
- uploads new file to server
- completely replaces file that exists at specified URL with content in entity body of PUT HTTP request message
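To make the request format concrete, here is a small sketch that builds and parses a request message as plain text. The helper names `build_request` and `parse_request`, the path `/index.html?q=dns`, and the form body are illustrative assumptions; a real client would send more header lines.

```python
def build_request(method, path, host, body=""):
    """Request line, header lines, blank line, then the (possibly empty) entity body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body.encode('ascii'))}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(message):
    """Split a request message back into its parts."""
    head, _, body = message.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers, body


# GET carries user data in the URL (after '?'); the entity body stays empty.
get_message = build_request("GET", "/index.html?q=dns", "gaia.cs.umass.edu")
# POST carries the form input in the entity body.
post_message = build_request("POST", "/form", "gaia.cs.umass.edu", body="name=alice")
```

Each line ends with CRLF and an empty line separates the headers from the entity body, which is why the parser splits on `\r\n\r\n` first.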
HTTP response messages
status line: the first line of the response
data = entity body
- status codes:
- status code appears in 1st line in server-to-client response message.
- some sample codes:
- 200 OK: request succeeded, requested object later in this message
- 301 Moved Permanently: requested object moved, new location specified later in this message
- 400 Bad Request: request msg not understood by server
- 404 Not Found: requested document not found on this server
- 505 HTTP Version Not Supported
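A short sketch of reading the status line of a response. The helper names are illustrative; the grouping of codes by their first digit follows the usual HTTP convention (2xx success, 3xx redirection, 4xx client error, 5xx server error).

```python
def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into version, numeric code, and phrase."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Classify a status code by its first digit."""
    classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return classes.get(code // 100, "other")


version, code, phrase = parse_status_line("HTTP/1.1 404 Not Found")
```

The split is limited to two separators because the status phrase itself may contain spaces.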
Maintaining user/server state: cookies
User-server interaction: a site may want to identify users, either to restrict access or to tie content to user identity, so it uses cookies to keep track of users.
Many Web sites use four components:
- cookie header line of HTTP response message
- cookie header line in next HTTP request message
- cookie file kept on user’s host, managed by user’s browser
- back-end database at Web site
comments:
What cookies can be used for
- authorization
- shopping carts
- recommendations
- user session state (web e-mail)
Challenge: How to keep state:
- protocol endpoints: maintain state at sender/receiver over multiple transactions
- cookies: HTTP messages carry state
aside: cookies and privacy
- cookies permit sites to learn a lot about you
- third party persistent cookies (tracking cookies) allow common identity (cookie value) to be tracked across multiple web sites
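The four cookie components can be sketched with two small classes. This is a toy model, not a real HTTP stack: headers are plain dictionaries, and the class names, the site name `"shop"`, and the starting ID 1678 are all illustrative assumptions.

```python
import itertools


class Site:
    """Web site with a back-end database keyed by cookie value."""

    def __init__(self):
        self._next_id = itertools.count(1678)
        self.database = {}  # cookie value -> list of paths this user requested

    def handle(self, path, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None or cookie not in self.database:
            # First visit: create an ID and return it in a Set-Cookie header line.
            cookie = str(next(self._next_id))
            self.database[cookie] = []
            response_headers["Set-Cookie"] = cookie
        self.database[cookie].append(path)
        return response_headers


class Browser:
    """Browser that keeps a cookie file on the user's host."""

    def __init__(self):
        self.cookie_file = {}  # site name -> cookie value

    def visit(self, name, site, path):
        headers = {}
        if name in self.cookie_file:
            headers["Cookie"] = self.cookie_file[name]  # cookie line in request
        response = site.handle(path, headers)
        if "Set-Cookie" in response:
            self.cookie_file[name] = response["Set-Cookie"]
        return response


site = Site()
browser = Browser()
first = browser.visit("shop", site, "/")
second = browser.visit("shop", site, "/cart")
```

HTTP itself stays stateless here: the state lives in the cookie file and the site's database, and each message simply carries the identifier that links the two.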
Web caches (proxy servers)
A Web cache, also called a proxy server, is a network entity that satisfies HTTP requests on behalf of an origin Web server.
- Web cache acts as both client and server
- server for original requesting client
- client to origin server
- typically cache is installed by ISP (university, company, residential ISP)
Why Web caching?
- reduce response time for client request
- cache is closer to client
- reduce traffic on an institution’s access link
- Internet is dense with caches
- enables “poor” content providers to more effectively deliver content
Goal: satisfy client request without involving origin server
- user configures browser to point to a Web cache
- browser sends all HTTP requests to cache
- if object in cache: cache returns object to client
- else cache requests object from origin server, caches received object, then returns object to client
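The hit/miss logic above can be sketched as a small class. The origin server is stood in for by a callable, and the class name, attribute names, and URL are illustrative assumptions; freshness and eviction are ignored.

```python
class WebCache:
    """Returns a stored object on a hit; fetches from the origin and stores it on a miss."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> object
        self.store = {}
        self.origin_requests = 0  # how often the origin server was involved

    def get(self, url):
        if url in self.store:
            return self.store[url]  # hit: origin server not contacted
        self.origin_requests += 1
        obj = self.fetch_from_origin(url)  # miss: act as a client to the origin
        self.store[url] = obj
        return obj


cache = WebCache(lambda url: f"<object at {url}>")
first_copy = cache.get("http://www.example.com/logo.gif")
second_copy = cache.get("http://www.example.com/logo.gif")
```

Only the first request reaches the origin server; the second is answered from the cache, which is where the savings in response time and access-link traffic come from.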
Conditional GET
Goal: don’t send object if cache has up-to-date cached version
- no object transmission delay
- lower link utilization
- cache: specify date of cached copy in HTTP request
If-Modified-Since: <date>
- server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified
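A sketch of the server-side decision for a conditional GET, using the standard library's HTTP date helpers. The function name `origin_respond` and the dates are illustrative assumptions; a real server would also return header lines.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def origin_respond(last_modified, request_headers, body):
    """Return (status code, entity body) for a possibly conditional GET."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, ""  # cached copy is up to date: no object sent
    return 200, body  # no condition, or object changed: send the object


modified = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)
fresh_copy = format_datetime(datetime(2015, 9, 10, tzinfo=timezone.utc), usegmt=True)
stale_copy = format_datetime(datetime(2015, 9, 1, tzinfo=timezone.utc), usegmt=True)

fresh_result = origin_respond(modified, {"If-Modified-Since": fresh_copy}, "obj")
stale_result = origin_respond(modified, {"If-Modified-Since": stale_copy}, "obj")
plain_result = origin_respond(modified, {}, "obj")
```

The 304 response has an empty body, so the cache pays only for the round trip and not for transmitting the object again.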
E-mail
Three major components:
- user agents
- mail servers
- simple mail transfer protocol: SMTP
User Agent
- a.k.a. “mail reader”
- composing, editing, reading mail messages
- e.g., Outlook, iPhone mail client
- outgoing, incoming messages stored on server
mail servers
- mailbox contains incoming messages for user
- message queue of outgoing (to be sent) mail messages
- SMTP protocol between mail servers to send email messages
- client: sending mail server
- “server”: receiving mail server
SMTP is the principal application-layer protocol for Internet e-mail. It uses TCP’s reliable data transfer service to send mail from the sender’s mail server to the recipient’s mail server. Like most application-layer protocols, it has two sides: a client side and a server side.
SMTP: Simple Mail Transfer Protocol
uses TCP to reliably transfer email message from client (mail server initiating connection) to server, port 25
direct transfer: sending server (acting like client) to receiving server
three phases of transfer
- handshaking(greeting)
- transfer of messages
- closure
command/response interaction (like HTTP)
- commands: ASCII text
- response: status code and phrase
Scenario: Alice sends e-mail to Bob
- Alice uses UA to compose e-mail message “to” bob@someschool.edu
- Alice’s UA sends message to her mail server; message placed in message queue
- Client side of SMTP opens TCP connection with Bob’s mail server
- SMTP client sends Alice’s message over the TCP connection
- Bob’s mail server places the message in Bob’s mailbox
- Bob invokes his user agent to read message
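interaction
Example: the text of the messages exchanged between an SMTP client (C) and an SMTP server (S)
comparison with HTTP:
- HTTP: pull protocol; someone loads information onto a Web server at their convenience, and users use HTTP to pull it from that server
- SMTP: push protocol; the sending mail server pushes the file to the receiving mail server
- both have ASCII command/response interaction, status codes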
- HTTP: each object encapsulated in its own response message
- SMTP: multiple objects sent in multipart message
- SMTP uses persistent connections
- SMTP requires message(header & body) to be in 7-bit ASCII
- SMTP server uses CRLF.CRLF to determine end of message
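The last two points can be sketched as two small helpers. The function names and the sample text are illustrative assumptions, and the sketch ignores the escaping SMTP applies to body lines that themselves start with a dot.

```python
END_OF_MESSAGE = b"\r\n.\r\n"  # CRLF.CRLF: a line containing only a period


def split_message(stream):
    """Return (message, remaining bytes), or (None, stream) if the end marker is missing."""
    body, found, rest = stream.partition(END_OF_MESSAGE)
    if not found:
        return None, stream  # message not complete yet
    return body, rest


def is_7bit_ascii(data):
    """True if every byte fits in 7 bits, as SMTP requires for header and body."""
    return all(byte < 128 for byte in data)


received = b"Do you like ketchup?\r\nHow about pickles?\r\n.\r\nQUIT\r\n"
message, remaining = split_message(received)
```

Everything before the marker is the message; whatever follows (here the `QUIT` command) belongs to the rest of the command/response dialogue.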
Mail message format
SMTP: protocol for exchanging e-mail messages, defined in RFC 5321 (like HTTP)
RFC 822 defines syntax for e-mail message itself (like HTML)
Mail access protocols
SMTP: delivery/storage of e-mail messages to receiver’s server
mail access protocol: retrieval from server
- IMAP: Internet Mail Access Protocol [RFC 3501]: messages stored on server, IMAP provides retrieval, deletion, folders of stored messages on server
- HTTP: Gmail, Hotmail, Yahoo! Mail, etc. provide web-based interface on top of SMTP (to send), IMAP (or POP) to retrieve e-mail messages
Domain Name System
people: many identifiers:
- SSN, name, passport
Internet hosts, routers:
- IP address (32 bit): used for addressing datagrams
- “name”, e.g., www.liverpool.ac.uk: used by humans
Domain Name System:
- distributed database implemented in hierarchy of many name servers
- application-layer protocol: hosts, name servers communicate to resolve names
- note: core Internet function, implemented as application-layer protocol
- complexity at network’s “edge”
DNS: services, structure
- DNS services
- hostname to IP address translation
- host aliasing
- canonical, alias names
- mail server aliasing
- load distribution: spreads load across replicated (redundant) servers
- replicated Web servers: many IP addresses correspond to one name
- Q: why not centralize DNS?
- single point of failure
- traffic volume
- distant centralized database
- maintenance
- A: doesn’t scale!
Appendix: overview of how DNS works
DNS: a distributed, hierarchical database
Three classes of DNS servers: root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers.
Client wants IP address for www.amazon.com; 1st approximation:
- client queries root server to find .com DNS server
- client queries .com DNS server to get amazon.com DNS server
- client queries amazon.com DNS server to get IP address for www.amazon.com
root servers:
- more than 400 root name servers are spread around the world; these root name servers are managed by 13 different organizations
- official, contact-of-last-resort by name servers that cannot resolve name
- incredibly important Internet function
- Internet couldn’t function without it!
- ICANN (Internet Corporation for Assigned Names and Numbers) manages root DNS domain
- root servers provide the IP addresses of TLD servers
TLD servers:
- there is a TLD server (or server cluster) for each top-level domain, including every country-code top-level domain. Verisign Global Registry Services maintains the TLD servers for the com top-level domain. The network infrastructure supporting a TLD can be large and complex. TLD servers provide the IP addresses of authoritative DNS servers
- Educause: .edu TLD
authoritative DNS servers:
- organization’s own DNS server(s), providing authoritative hostname to IP mappings for organization’s named hosts
- can be maintained by organization or service provider
Local DNS servers
- does not strictly belong to hierarchy
- each ISP (residential ISP, company, university) has one local DNS server
- also called “default name server”
- when host makes DNS query, query is sent to its local DNS server
- has local cache of recent name-to-address translation pairs
- acts as proxy, forwards query into hierarchy
DNS name resolution: iterated query
In practice, a lookup usually combines recursive and iterated queries.
- contacted server replies with name of server to contact
- “I don’t know this name, but ask this server”
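An iterated query can be simulated with a few dictionaries. This is a toy model with no real DNS traffic: the server names in `SERVERS` and the address 192.0.2.10 (taken from a range reserved for documentation) are hypothetical.

```python
# Each hypothetical server knows, for some name suffixes, either a referral or an answer.
SERVERS = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"amazon.com": ("referral", "auth-amazon")},
    "auth-amazon": {"www.amazon.com": ("answer", "192.0.2.10")},
}


def query(server, name):
    """One query to one server: an answer, a referral, or an error."""
    for suffix, reply in SERVERS[server].items():
        if name == suffix or name.endswith("." + suffix):
            return reply
    return ("error", None)


def resolve_iteratively(name, start="root"):
    """The querier itself contacts each server in turn, following referrals."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, contacted
        if kind == "error":
            return None, contacted
        server = value  # "I don't know this name, but ask this server"


address, path = resolve_iteratively("www.amazon.com")
```

The resolver does all the asking here; in a recursive query each contacted server would instead take over the lookup and return the final answer.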
DNS name resolution: recursive query
Caching, Updating DNS Records
caching improves delay performance and reduces the number of DNS messages travelling across the Internet
- once name server learns mapping, it caches mapping
- cache entries time out (disappear) after some time (TTL)
- TLD servers typically cached in local name servers
- thus root name servers not often visited
- cached entries may be out-of-date (best-effort name-to-address translation!): mappings between hostnames and IP addresses are not permanent, so a DNS server discards cached information after a period of time
- if named host changes IP address, may not be known Internet-wide until all TTLs expire!
- update/notify mechanisms proposed IETF standard
- RFC 2136
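A minimal sketch of a cache whose entries expire after their TTL. The class name is illustrative, the address is from a documentation range, and the current time is passed in explicitly (in seconds) so the behaviour is easy to follow.

```python
class DnsCache:
    """Maps a name to (address, expiry time); entries vanish once the TTL has passed."""

    def __init__(self):
        self._entries = {}

    def put(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # timed out: must ask the hierarchy again
            return None
        return address


dns_cache = DnsCache()
dns_cache.put("www.amazon.com", "192.0.2.10", ttl=60, now=0)
before_expiry = dns_cache.get("www.amazon.com", now=59)
after_expiry = dns_cache.get("www.amazon.com", now=60)
```

Until the entry expires the cache keeps returning the stored address even if the host has moved, which is the "best-effort" behaviour noted above.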
DNS records
- the DNS servers that together implement the DNS distributed database store resource records (RRs); RRs provide hostname-to-IP-address mappings
- RR format: (name, value, type, ttl)
- four common values of type:
- type=A
- name is hostname
- value is IP address
- type=NS
- name is domain
- value is hostname of authoritative name server for this domain
- type=CNAME
- name is alias name for some “canonical” (the real) name
- www.ibm.com is really servereast.backup2.ibm.com
- value is canonical name
- type=MX
- value is name of mailserver associated with name
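The four record types can be illustrated with a list of (name, value, type, ttl) tuples. All names and addresses in `RECORDS` are hypothetical (example.com and documentation-range addresses), and the lookup is a plain list scan rather than a real resolver.

```python
# (name, value, type, ttl): hypothetical records for one domain
RECORDS = [
    ("example.com", "dns1.example.com", "NS", 86400),
    ("dns1.example.com", "192.0.2.53", "A", 86400),
    ("www.example.com", "server1.example.com", "CNAME", 3600),
    ("server1.example.com", "192.0.2.80", "A", 3600),
    ("example.com", "mail.example.com", "MX", 3600),
    ("mail.example.com", "192.0.2.25", "A", 3600),
]


def lookup(name, rtype):
    """All values stored for this name and record type."""
    return [value for (n, value, t, _ttl) in RECORDS if n == name and t == rtype]


def resolve_address(name, max_aliases=8):
    """Follow CNAME records from an alias to the canonical name, then return its A record."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]
    return None


web_address = resolve_address("www.example.com")
mail_server = lookup("example.com", "MX")[0]
mail_address = resolve_address(mail_server)
```

The same domain name can carry both an MX and an NS record, which is how a mail server and a name server can be found from the bare domain.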
Appendix: DNS messages
- query & reply messages: both have the same format
P2P Application
P2P architecture
- no always-on server
- arbitrary end systems directly communicate
- peers request service from other peers, provide service in return to other peers
- self scalability: new peers bring new service capacity, and new service demands
- peers are intermittently connected and change IP addresses
- complex management
- examples: P2P file sharing (BitTorrent), streaming (KanKan), VoIP (Skype)
File Distribution: client-server VS P2P
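Client-server architecture relies heavily on always-on infrastructure servers
P2P architecture has minimal reliance on always-on infrastructure servers
- the most popular P2P file distribution protocol: BitTorrent
Scalability of the P2P architecture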
client-server
- server transmission: must sequentially send (upload) N file copies:
- F: file size; us: server upload rate; Dcs: distribution time with client-server
- time to send one copy: F/us
- time to send N copies: NF/us
- client: each client must download file copy
- dmin = min client download rate
- min client download time: F/dmin
- time to distribute F to N clients: Dcs ≥ max{NF/us, F/dmin}
P2P
- server transmission: must upload at least one copy:
- time to send one copy: F/us
- client: each client must download file copy
- min client download time: F/dmin
- clients: as aggregate must download NF bits
- max upload rate (limiting max download rate) is us + $\sum$ui
- time to distribute F to N clients: DP2P ≥ max{F/us, F/dmin, NF/(us + $\sum$ui)}
- example
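A small numeric sketch of the two distribution-time lower bounds, client-server max{NF/us, F/dmin} and P2P max{F/us, F/dmin, NF/(us + sum of ui)}. The function names and the numbers are illustrative assumptions: each peer uploads at rate u = 1, F/u = 1 hour, us = 10u, and dmin = 10u.

```python
def client_server_time(F, us, dmin, N):
    """Lower bound on distribution time with client-server: max{NF/us, F/dmin}."""
    return max(N * F / us, F / dmin)


def p2p_time(F, us, dmin, peer_upload_rates):
    """Lower bound with P2P: max{F/us, F/dmin, NF/(us + sum of peer upload rates)}."""
    N = len(peer_upload_rates)
    return max(F / us, F / dmin, N * F / (us + sum(peer_upload_rates)))


# File size and rates chosen so that F/u = 1 hour with u = 1.
N = 30
d_cs = client_server_time(F=1.0, us=10.0, dmin=10.0, N=N)
d_p2p = p2p_time(F=1.0, us=10.0, dmin=10.0, peer_upload_rates=[1.0] * N)
```

With these numbers the client-server bound grows linearly in N (3 hours for 30 clients) while the P2P bound stays below one hour, because every new peer also adds upload capacity.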
- BitTorrent
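- file divided into 256 KB chunks
- peers in torrent send/receive file chunks
- torrent: group of peers exchanging chunks of a file
- tracker: tracks peers participating in torrent
- chunks: the pieces into which the file is divided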
- peer joining torrent
- has no chunks, but will accumulate them over time from other peers
- registers with tracker to get list of peers, connects to subset of peers
- while downloading, peer uploads chunks to other peers
- peer may change peers with whom it exchanges chunks
- churn: peers may come and go
- once peer has entire file, it may (selfishly) leave or (altruistically) remain in torrent
- Requesting chunks:
- at any given time, different peers have different subsets of file chunks
- periodically, Alice asks each peer for list of chunks that they have
- Alice requests missing chunks from peers, rarest first
- Sending chunks: tit-for-tat
- Alice sends chunks to those four peers currently sending her chunks at highest rate
- other peers are choked by Alice (do not receive chunks from her)
- re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer, starts sending chunks
- “optimistically unchoke” this peer
- newly chosen peer may join top 4
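The two selection rules can be sketched as plain functions over in-memory data. This is an illustration of rarest-first and tit-for-tat as described above, not the BitTorrent wire protocol; the function names, peer names, and rates are assumptions.

```python
import random
from collections import Counter


def rarest_first(my_chunks, peer_chunks):
    """Missing chunks ordered by how few peers hold them (ties broken by chunk id)."""
    counts = Counter(c for chunks in peer_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def top_four(rates_to_me):
    """Tit-for-tat: the four peers currently sending to us at the highest rate."""
    ranked = sorted(rates_to_me, key=lambda p: (-rates_to_me[p], p))
    return ranked[:4]


def optimistic_unchoke(rates_to_me, unchoked, rng=random):
    """Pick one currently choked peer at random to start sending chunks to."""
    choked = sorted(p for p in rates_to_me if p not in unchoked)
    return rng.choice(choked) if choked else None


peers = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
request_order = rarest_first({2}, peers)

rates = {"a": 5, "b": 9, "c": 1, "d": 7, "e": 3}
unchoked = top_four(rates)
lucky_peer = optimistic_unchoke(rates, unchoked)
```

In a running client `top_four` would be re-evaluated every 10 seconds and `optimistic_unchoke` every 30 seconds, giving a newly chosen peer the chance to earn a place in the top four.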
Socket programming
goal: learn how to build client/server applications that communicate using sockets
socket: door between application process and end-end transport protocol
Two socket types for two transport services:
- UDP: unreliable datagram
- TCP: reliable, byte stream-oriented
Socket programming with UDP
UDP: no “connection” between client & server
- no handshaking before sending data
- sender explicitly attaches IP destination address and port # to each packet
- receiver extracts sender IP address and port# from received packet
UDP: transmitted data may be lost or received out-of-order
Application viewpoint:
- UDP provides unreliable transfer of groups of bytes (“datagrams”) between client and server
Example: Java client(UDP)
Example: Java server(UDP)
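The Java listings are not reproduced in these notes. As a stand-in, here is a minimal sketch of the same UDP exchange in Python (chosen only for brevity), with client and server in one process on the loopback address. The message and the upper-casing reply are illustrative assumptions.

```python
import socket

# Server side: create a UDP socket and bind it to a port (0 = any free port).
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
server_addr = server.getsockname()

# Client side: no connection and no handshake, just a socket.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)

# The sender attaches the destination IP address and port to each datagram.
client.sendto(b"hello", server_addr)

# The receiver extracts the sender's address from the received datagram.
data, client_addr = server.recvfrom(2048)
server.sendto(data.upper(), client_addr)

reply, _ = client.recvfrom(2048)

client.close()
server.close()
```

Nothing in the API guarantees delivery; the timeouts are there so that a lost datagram would raise an error instead of blocking forever.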
Socket programming with TCP
- Client must contact server
- server process must first be running
- server must have created socket (door) that welcomes client’s contact
- Client contacts server by:
- Creating TCP socket, specifying IP address, port number of server process
- when client creates socket: client TCP establishes connection to server TCP
- when contacted by client, server TCP creates new socket for server process to communicate with that particular client
- allows server to talk with multiple clients
- source port numbers used to distinguish clients (more in Chap 3)
Application viewpoint:
- TCP provides reliable, in-order byte-stream transfer (“pipe”) between client and server
Client/server socket interaction: TCP
Example: Java client (TCP)
Example: Java server (TCP)
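The TCP steps above can be sketched in Python (used here only for brevity; the lecture's own examples are in Java). Client and server run in one process on the loopback address, with the server side in a thread. The message and the upper-casing reply are illustrative assumptions.

```python
import socket
import threading

# Server: the welcoming socket must exist before the client makes contact.
welcome = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
welcome.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
welcome.listen(1)
welcome.settimeout(2.0)
server_addr = welcome.getsockname()


def serve_one():
    # accept() returns a new socket dedicated to this particular client.
    conn, _client_addr = welcome.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


thread = threading.Thread(target=serve_one)
thread.start()

# Client: creating the connection triggers the TCP handshake with the server.
with socket.create_connection(server_addr, timeout=2.0) as client:
    client.sendall(b"hello")  # no address needed per message: bytes go into the "pipe"
    reply = client.recv(1024)

thread.join()
welcome.close()
```

Unlike the UDP case, the client never names the destination when sending; the connection established at socket creation fixes both endpoints.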