IP Telephony Annotated Bibliography

Matthew Caesar

Last updated 5/29/02.

Recent News

Business Communications Review, Call Accounting and Billing for IP Services, News Article 2001.

Network Fusion, Trying to Measure VoIP Call Quality, News Article 2002.

Washington Post, Internet Cafe's Phone Service Fills a Void, News Article 2002. Also see Slashdot Article.

ITXC, Competitive Information: Analyst Predictions, Web site, 2002.

Light Reading, Cisco Cautious on IP Telephony, News Article, 2001.

IT World, Voice Over IP is a (fast) moving target, News Article, 2001.

Overviews

J. Kurose, K. Ross, Computer Networking: A Top-Down Approach Featuring the Internet, Addison-Wesley Publishing Company, 2000.

• Networking overview text. Provides a review of several protocols used in IP Telephony, including RTP, RTSP, and the H.323 protocol architecture. Gives an example of how the protocols are used together in an Internet telephony application. Explains problems with packet audio in this architecture and describes several mechanisms for solving them.
• Systems can use a fixed or adaptive playout delay in playback buffering to combat jitter; the adaptive form adjusts to the measured network delay and its variance. Robustness to packet loss can be improved with Forward Error Correction (FEC) or with interleaving, which leaves multiple small gaps in the stream instead of one large one, at the expense of latency.
• Protocols: (1) Gives examples for audio payload types in RTP. (2) RTCP attempts to limit bandwidth to 5% of session bandwidth. (3) H.323 requires the use of RTP, H.245 (used for controlling media between endpoints, and to negotiate a common compression standard), Q.931 (used for signaling for establishing and terminating calls), and RAS (allows endpoints to communicate with a gatekeeper).
• Provides methods to achieve better QoS over the best effort Internet by considering several scenarios: (1) FTP session can cause delays for audio session (2) FTP user pays more, gets better service because router marks packets (3) misbehaving application: we want to isolate and police flows, but want to share resources if they're unused (4) should require a call admission process. Explains how IntServ and DiffServ architectures propose to solve these problems.
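The adaptive playout delay mentioned above can be sketched as a pair of exponentially weighted averages, one for the network delay and one for its deviation. This is a minimal illustration rather than code from the text: the function name, the smoothing constant u, and the safety factor k are assumptions, and a real receiver would normally move the playout point only at the start of a talk spurt rather than for every packet.

```python
def playout_schedule(packets, u=0.01, k=4.0):
    """Return a playout time for each (send_time, receive_time) pair, in ms.

    u and k are illustrative defaults (assumptions), not values
    prescribed by any standard.
    """
    d = None   # smoothed estimate of the network delay
    v = 0.0    # smoothed estimate of the delay deviation
    schedule = []
    for sent, received in packets:
        delay = received - sent
        if d is None:
            d = float(delay)          # first packet seeds the estimate
        else:
            d = (1 - u) * d + u * delay
            v = (1 - u) * v + u * abs(delay - d)
        # Play the packet late enough to absorb k deviations of jitter.
        schedule.append(sent + d + k * v)
    return schedule


if __name__ == "__main__":
    trace = [(0, 50), (20, 120), (40, 90)]
    print(playout_schedule(trace, u=0.5))
```

A larger k trades extra latency for fewer packets that arrive after their playout time.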
B. Farkas, M. Figallo, T. Kiehn, A. Miner, H. Wang, "The Titanic and the Iceberg: the PSTN Meets IP Telephony," Preprint, 1999.
• Overview of the IP Telephony market in 1999. Presents a taxonomy of carriers, equipment manufacturers, and users, and how IP Telephony can be expected to affect each. Gives advantages and disadvantages (less reliable, more ways to defraud the provider, lacks established protocol standards) of IP Telephony networks relative to the PSTN. Gives effects and strategies for service providers and vendors (major players in one space can enter another and shake up the existing structure of the Internet and telecom markets; data vendors have many advantages over telecom vendors). Regulation benefits IP telephony because of tariffs: the FCC adds charges in the USA, and the international termination charge is avoided too. Standards: H.323 is not complete or explicit enough. However, it does allow for more flexibility than the PSTN. QoS techniques such as DiffServ and MPLS are briefly discussed.
Dialogic, "IP Telephony Basics," Technical white paper.
• Overview of IP Telephony. Claims latency will improve with time over the wide area because gateways will improve, companies are deploying gateways on private networks, and internet is being developed with QoS support. DTMF doesn't travel well across the internet, so gateways must detect digits locally, suppress their transmission, then generate them on the remote side. Open systems are good because they generate competition which leads to low prices and innovation.
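The detect-locally, regenerate-remotely handling of DTMF described above can be sketched as follows. The white paper does not say how digits are detected; the Goertzel filter used here is a common choice and is an assumption, as are the function names, the 205-sample block at 8000 Hz, and the power threshold. The row and column frequencies are the standard DTMF keypad tones.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, sample_rate):
    """Signal power near freq_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_digit(samples, sample_rate=8000, min_power=100.0):
    """Ingress gateway: return the DTMF key in the block, or None."""
    rows = [goertzel_power(samples, f, sample_rate) for f in ROW_HZ]
    cols = [goertzel_power(samples, f, sample_rate) for f in COL_HZ]
    r, c = rows.index(max(rows)), cols.index(max(cols))
    if rows[r] < min_power or cols[c] < min_power:
        return None  # no tone pair strong enough (threshold is an assumption)
    return KEYPAD[r][c]


def generate_digit(digit, n_samples=205, sample_rate=8000):
    """Egress gateway: synthesize the two tones for a single key."""
    for r, keys in enumerate(KEYPAD):
        c = keys.find(digit)
        if c >= 0:
            return [
                math.sin(2 * math.pi * ROW_HZ[r] * i / sample_rate)
                + math.sin(2 * math.pi * COL_HZ[c] * i / sample_rate)
                for i in range(n_samples)
            ]
    raise ValueError("not a DTMF key: %r" % digit)


if __name__ == "__main__":
    print(detect_digit(generate_digit("5")))
```

Between the two gateways only the detected key needs to be carried, so a lossy voice codec never has to reproduce the tones.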
L. Huovinen, S. Niu, "IP Telephony," Web site.
• Overview of IP telephony. Security issues are addressed.
• Cites market expectations: more than 40% of telecom managers plan to move some voice/fax traffic onto an IP network by 1999. Sales of IP gateways should approach 1.81 billion by end of 2001; by 2001, 96% of revenues in the IP telephony market will come from the gateway segment. The Internet will carry 11% of long distance traffic by 2002.
• Users of IP telephony include residential and business users. A table is presented comparing the price of telephone calls from Finland to different countries. Discusses system components including the Multipoint Control Unit (MCU) for conference call support. Discusses interaction of H.323 and SIP.
• Security issues: eavesdropping is easier on the Internet. H.323 supports authentication, integrity, privacy, and non-repudiation. IP telephony can include security functionality or can rely on a lower layer protocol such as TLS or IPSec. Streams can be encrypted. Securing the transmission with SIP involves conveying keys using SDP, performing basic and digest authentication, or using SSL/SSH/Secure-HTTP. Firewalls must act as proxies for H.323 because ports and addresses are embedded in the data stream. In SIP, firewalls don't cause this problem.
• Network Interoperability: H.323 and SIP users can't communicate directly; the call path must go through the PSTN or through an H.323-to-SIP gateway.
• Applications and Vendors: software solutions include MS NetMeeting, VocalTec InternetPhone. Hardware solutions include Selsius Ethernet phones which have a 10 base-T interface and can be configured via a web browser, and Quicknet's specialized sound card. Nokia's solution includes Ethernet phones, telephones with serial interfaces, and adapters to allow phone lines to be provided by the Ethernet network.
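The digest authentication mentioned under security issues can be illustrated with the basic HTTP-style digest computation that SIP borrows (the form without the optional qop field). This is a sketch under that assumption; the function names, user, realm, and nonce below are hypothetical, and the web site may describe the exchange differently.

```python
import hashlib
import hmac


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Client side: prove knowledge of the password without sending it."""
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))


def verify(response, username, realm, password, method, uri, nonce):
    """Server side: recompute the digest from the stored secret and compare."""
    expected = digest_response(username, realm, password, method, uri, nonce)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    # All values are made up for illustration.
    r = digest_response("alice", "example.com", "secret",
                        "INVITE", "sip:bob@example.com", "abc123")
    print(r, verify(r, "alice", "example.com", "secret",
                    "INVITE", "sip:bob@example.com", "abc123"))
```

Because the server picks the nonce, a captured response cannot simply be replayed against a later challenge.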
H. Schulzrinne (Edited), Digest of papers in Special Issue on Internet Telephony, IEEE Network Magazine, May/June 1998.

Architecture

C. Chuah, L. Subramanian, R. Katz, A. Joseph, "QoS Provisioning Using a Clearing House Architecture," in International Workshop on Quality of Service (IWQoS), Pittsburgh, PA, pp 115-124, June 5-7, 2000.

B. Li, M. Hamdi, D. Jiang, and X. Cao, "QoS Enabled Voice Support in the Next Generation Internet: Issues, Existing Approaches and Challenges," in IEEE Communications Magazine, Vol. 38, no. 4, pp 54-61, April 2000.
• Surveys technologies for enabling QoS support for voice communications. Argues that enhanced QoS support in the network is necessary for widespread deployment of VoIP. Gives several reasons why IP Telephony should replace the PSTN, and stresses interoperability.

• Reviews existing technologies such as the IETF internet telephony architecture and the ITU-T H.323 related recommendations.
• Claims the traditional internet cannot support the QoS required by these applications, and then discusses IntServ, DiffServ as two solutions to this.
• Presents two solutions from Cisco and Lucent for offering IP Telephony services. (1) Cisco's solution uses a resource reservation protocol similar to RSVP and priority queuing in core routers for congestion avoidance. It uses packet classification (a la DiffServ) at the edges to allow priority handling techniques such as weighted random early detection (WRED) in the network core. This solution is targeted towards enterprise networks and is less scalable. (2) Lucent's solution uses a DiffServ-like mechanism with priority queues to prioritize voice data. Lucent's version is more scalable and is targeted at carrier-class networks, but relies on the wide-area IP network to provide QoS.
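To make the WRED reference above concrete, here is a minimal sketch of the drop decision: the usual RED ramp between two thresholds, with a separate threshold profile per traffic class. The class names and all numeric thresholds are assumptions chosen for illustration, not values from the paper or from any vendor configuration, and the averaging of the queue length is left out.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Classic RED ramp: no drops below min_th, certain drop at max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical per-class profiles: (min_th, max_th, max_p), in packets.
PROFILES = {
    "voice": (30, 40, 0.02),        # dropped late and rarely
    "best_effort": (10, 40, 0.10),  # dropped early and more often
}


def wred_drop_probability(avg_queue, traffic_class):
    """'Weighted' RED: the class marked at the edge selects the profile."""
    min_th, max_th, max_p = PROFILES[traffic_class]
    return red_drop_probability(avg_queue, min_th, max_th, max_p)


if __name__ == "__main__":
    for q in (5, 20, 35, 40):
        print(q, wred_drop_probability(q, "voice"),
              wred_drop_probability(q, "best_effort"))
```

With these profiles, a moderately full queue sheds best-effort packets while voice packets still pass untouched.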
D. Bergmark, S. Keshav, "Building Blocks for IP Telephony," in IEEE Communications Magazine, Vol. 38, no. 4, pp 88-94, April 2000.
• The authors discuss a high-level API they developed on which new IP Telephony applications may be developed. This API may be used to develop applications such as Gateways and Terminals, and can also be used to develop services that can be "plugged in" to the IP Telephony network. Claims need for multimode applications (applications that span PSTN and Internet). Describes software package called ITX developed to support this. Supports SIP instead of H.323 because H.323 is more complex to implement. ITX keeps directory of locations at which a user can be reached, along with preference profiles, and tries each one till it reaches the user.
• ITX is composed of 4 main components, each of which is implemented as a Java package:
1. signaling: the control plane, which sends keep-alive signals to ease failure recovery.
2. data exchange: the data plane, sets up data connections to peers, supports data devices.
3. directory service: the central repository for user info including the IP address where the user is currently located and the roaming location updated when user logs into internet phone. Based on BIND.
4. gateway: moves voice data between PSTN and internet. For example, the PSTN user calls in to gateway, the gateway asks for extension number, looks up user in directory service, and forwards the invite to the user.
• ITX is claimed to be scalable: the authors give latency figures, and a single machine can handle hundreds of PSTN lines. Similar projects: TOPS, ChaiTime, Swiss Fed. IT work.
• Provides an example "PhoneRecorder" application. PhoneRecorder allows a terminal to connect and send a voice message, which is stored as a file on the server.
• Measured statistics for the implementation: connection setup took about 2-7 sec, transmission latency between IP hosts was 50-250ms, transmission latency through a gateway was 250-500ms, time to look up the user's current IP in the directory was approximately 16ms.
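The directory behavior described above (a list of locations per user, tried in preference order until one answers) can be sketched as below. ITX itself is a set of Java packages; this Python fragment is only an illustration, and the data layout, names, and addresses are hypothetical rather than the ITX API.

```python
def locate_user(directory, user, try_contact):
    """Try each registered location for `user`, most preferred first.

    directory:   dict mapping a user name to a list of (preference, address)
                 pairs, where a lower number means more preferred (assumption).
    try_contact: callable returning True if the user answers at an address.
    Returns the first address that answers, or None.
    """
    for _, address in sorted(directory.get(user, [])):
        if try_contact(address):
            return address
    return None


if __name__ == "__main__":
    directory = {
        "alice": [(2, "pstn:+1-555-0100"), (1, "ip:10.0.0.5")],
    }
    online = {"pstn:+1-555-0100"}  # pretend only the PSTN phone answers
    print(locate_user(directory, "alice", lambda addr: addr in online))
```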
N. Anerousis, R. Gopalakrishnan, C.R. Kalmanek, A.E. Kaplan, W.T. Marshall, P.P. Mishra, P.Z. Onufryk, K.K. Ramakrishnan, C.J. Sreenan, "TOPS: An Architecture for Telephony over Packet Networks," in IEEE Journal of Selected Areas in Communications, Vol. 17, No. 1., January 1999.

F. Anjum, F. Caruso, R. Jain, P. Missier, A. Zordan, "ChaiTime: A System for Rapid Creation of Portable Next-Generation Telephony Services Using Third-Party Software Components," in Proceedings of the 2nd IEEE Conference on Open Architectures and Network Programming (OPENARCH), New York, USA, March 1999.

• Presents the design of ChaiTime, an architecture that allows rapid deployment of advanced services. The PSTN is antiquated in several respects, which prevents rapid development of new services. Calls may take place over the PSTN, the Internet, or a combination of the two. Services can be provided by cooperating service providers, unlike the PSTN, where the business model is centralized. Intelligence is assumed to be pervasive, but dumb terminals still exist, so services can be provided by proxies. The Java Telephony API (JTAPI) is extended with a set of objects (JCC). Uses an extended version of SIP for advanced services such as downloading new services during session setup. Discusses CTI, the community that develops portable software for building PBXs, call centers, etc. JTAPI provides an API for applications running on a single platform, but doesn't support using different providers for different portions of the network or different channels in a multiparty call.
M. Hassan, A. Nayandoro, M. Atiquzzaman, "Internet Telephony: Services, Technical Challenges, and Products," in IEEE Communications Magazine, Vol. 38, no. 4, pp 96-103, April 2000.
• Provides a (1) list of advanced services we can expect from internet telephony, (2) a list of technical challenges and solutions and (3) emerging products that support internet telephony.
• New services: (1) allows integration of voice, video, data, and fax networks onto a single network, resulting in big cost savings (2) can support higher grades of sound (3) can support video telephony and (4) unified messaging (single phone #) (5) virtual phone line (DSL-data multiplexed with voice) (6) web interfaces to call centers (7) low cost calls (8) real time billing (9) telecommuting enhancements (10) enhanced teleconferencing (whiteboards, share documents).
• Challenges: (1) packet loss (gives chart of voice quality vs. packet loss rate), can be solved by network upgrade, silence/noise substitution, packet repetition/interpolation, frame interleaving (2) packet delay, due to codec, serialization, propagation, and queuing delay. (3) network jitter.
• Products/Market segments: (1) carriers (next-gen telcos that route voice traffic over internet) (2) enterprises (3) small business/single user.
• Gives a chart of Voice quality as a function of packet loss rate. The chart assumes use of LPC or PR, which utilize Forward Error Correction (FEC) techniques. Also see Steve McCanne's paper on FEC for a function relating packet loss to audio quality.
• Tables with characteristics of real products given. Includes info on simultaneous number of calls (often very small numbers of ports are included, but port cost decreases with number of ports), network interface, codecs supported, H.323 support, and hardware platform, and additional functionality (PBX functionality, SNMP management, jitter removal/lost pkt compensation).
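The frame interleaving listed among the packet-loss remedies can be illustrated with a small block interleaver. This is a generic sketch, not the scheme of any particular product in the paper's tables; the function names and the depth of 4 are assumptions.

```python
def interleave(units, depth):
    """Spread consecutive audio units across `depth` packets."""
    return [units[i::depth] for i in range(depth)]


def deinterleave(packets, depth, total):
    """Rebuild the original order; a lost packet (None) leaves gaps."""
    out = [None] * total
    for i, packet in enumerate(packets):
        if packet is None:
            continue
        for j, unit in enumerate(packet):
            out[i + j * depth] = unit
    return out


def longest_gap(units):
    """Length of the longest run of missing units."""
    longest = run = 0
    for unit in units:
        run = run + 1 if unit is None else 0
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    packets = interleave(list(range(16)), 4)
    packets[1] = None  # one packet is lost in the network
    print(deinterleave(packets, 4, 16))
```

Losing one packet now costs four isolated units instead of four consecutive ones, but the receiver has to wait for the whole block before playing it, which is the latency penalty.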
A. Rayes, K. Sage, "Integrated Management Architecture for IP-Based Networks," in IEEE Communications Magazine, Vol. 38, no. 4, pp 48-53, April 2000.
• Discusses the development of a management architecture to monitor and configure an IP network. This architecture can be used to monitor performance, detect and locate faults in the network, and quickly configure and extend the network. The idea is to quickly set up an IP network and keep it operating efficiently, and to make the IP network as robust and fault tolerant as the PSTN, and more configurable. The complexity of the IP network makes these management capabilities even more necessary.
• In order to build strong customer base for new IP telephony services, IP services must (1) rapidly install and configure with no errors (2) give customers direct control over reconfiguration of services (3) allow provider to guarantee service quality and do performance monitoring. This paper discusses an integrated management support system for IP networks illustrating functions needed to support VoIP services.
• Extends FCAPS (fault, configuration, accounting, performance, and security management). Performance management should be used to make sure that enough bandwidth is reserved for time sensitive IP voice traffic while other applications sharing the same link get their fair share without interfering with the mission critical traffic. Lists examples of IP performance measurements (# packets received, # times a network element enters the congestion state). SLA reports are higher level reports that correlate fault and performance data (network uptime, latency). Traffic engineering is used to relieve congestion, and includes rehoming, rerouting, load balancing and congestion control.
• Configuration management deals with physical interconnects of IP network elements.
• Billing used to measure subscribers and manage call detail info (can do flat rate, or Call Detail Record (CDR)).
• Security management can be used to implement an Access Control List defining a user's capabilities. Users should be verified (authentication) and activities the user is allowed to carry out may be configured.
• Gives examples of products that provide each of these functions. Netsys: generates a network topology which may then be used for capacity planning. CSRC and RADIUS applications are used to configure network elements such as broadband modems. The SNMP protocol may be used to monitor network performance. NetFlow Collector aggregates large volumes of raw usage data and supports open interfaces to integrate with the provider's software. WAN Manager provides topology management via real time topology displays and real time statistics.
M. Hamdi, O. Verscheure, J. Hubaux, I. Dalgic, P. Wang, "Voice Service Interworking for PSTN and IP Networks," in IEEE Communications Magazine, May 1999.
• Presents an overview of the main technical problems in designing interoperable services between IP Telephony networks and the PSTN. These problems include complexity in providing translation between different types of networks, providing good QoS over lossy networks, lack of interoperability / heterogeneity of network elements, naming / identification of users and terminals.
• Problems: (1) Complexity - voice services can become more complex. For example, a user may want to establish a call from an IP terminal and have incoming calls forwarded to a cell phone. (2) QoS - a call must go through several transcoding operations, in addition to the standard jitter, loss, and delay problems of IP network traversal. (3) Ease of use - addressing becomes more complex in hybrid voice services.
• Claims reasonably that PSTN and IP telephony will coexist for a long time, and uses this as motivation for discussing calls transitioning between the two networks.
• Describes several call scenarios in the hybrid network.
1. A PSTN network is described, and the call traverses only through the PSTN core.
2. H.323 is used to make an IP Telephony call, all service specific processing and protocols are pushed to the end systems and are transparent to the network.
3. Hybrid, gateway acts as host to map media and control channels to PSTN. Gateways can be used in this case to perform transcoding operations.
4. The same protocols are used at each terminal, but the backbone network uses another protocol.
• Telecom companies are reluctant to expose their SS7 networks, so they provide SS7 access to signaling gateway which controls media gateway.
• Addressing of users is discussed. How can PSTN user identify IP network user? Should the IP address be used to identify the IP Terminal?
• IntServ and DiffServ used to give improved QoS. End system: lower bit rate gives lower signal quality and higher delays (processing (time to encode and run algorithm) and packetization (time to form packet of compressed voice) delays). Higher payload to header ratio = higher packetization delay.
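The trade-off in the last point (a higher payload-to-header ratio costs more packetization delay) is easy to see with a little arithmetic. The sketch below assumes voice carried in RTP over UDP over IPv4 with no header options or compression, which gives 40 bytes of headers per packet; the function name and the two example codec rates (64 kbit/s and 8 kbit/s) are illustrative assumptions.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options (assumption)


def packet_stats(codec_bps, packetization_ms):
    """Return (payload_bytes, payload_to_header_ratio, wire_rate_bps)."""
    payload = codec_bps * packetization_ms / 1000.0 / 8.0
    packets_per_second = 1000.0 / packetization_ms
    wire_rate = (payload + HEADER_BYTES) * 8.0 * packets_per_second
    return payload, payload / HEADER_BYTES, wire_rate


if __name__ == "__main__":
    for codec_bps in (64000, 8000):
        for ms in (10, 20, 40):
            payload, ratio, rate = packet_stats(codec_bps, ms)
            print(codec_bps, ms, payload, round(ratio, 2), round(rate))
```

Halving the codec rate or the packetization interval halves the payload, so low bit rate codecs either accept a poor ratio or wait longer to fill each packet.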
C.R. Kalmanek, W.T. Marshall, P.P. Mishra, D.M. Nortz, K.K. Ramakrishnan, "DOSA: An Architecture for Providing a Robust IP Telephony Service," in Proceedings of INFOCOM, Tel-Aviv, Israel, March 2000.
1. Describes architecture with signaling protocols to allow service providers to offer network layer service differentiation and control access to the network layer QoS and other services. Claims coordination between call signaling and resource management protocols give users incentive to use service and generate revenue for provider, and that SIP and H.323 do not sufficiently achieve this.
2. Coordination between call signaling and resource management ensures that (1) users are authenticated to receive additional QoS, (2) network resources are available end-to-end to handle this, and (3) billing and accounting are done properly.
3. Has following functional entities:
1. IP endpoints - CPE/Terminals: personal computers, handheld devices; similar to IP Telephony Terminal
2. Edge routers - provide access to backbone network; includes functionality from H.323-style gatekeeper, but must in addition acquire authorization from a gate controller (thereby adding an extra RTT to call setup) to provide access to enhanced QoS. These edge routers know about call resource usage, and generate usage events that allow providers to charge for service.
3. DOSA proxies - provide subscriber authentication, perform call routing, provide admission control and name/number translation; provides H.323-style gatekeeper and SIP server functionality.
4. gate controllers - one of these is associated with each DOSA proxy; they authorize access to enhanced QoS and provide functionality similar to having an underlying IntServ infrastructure (however, QoS is done between segments, and hence doesn't require the entire network to implement QoS support).
5. media servers - process media streams, act like a Multipoint Control Unit (MCU)
6. PSTN Gateways - provide interface to the PSTN
4. Requirements for architecture include low delay/packet loss, support caller id, etc. Router provides QoS by packet marking or flow scheduling. Proposes segmented resource assignment to allow different portions of the network to have different provisioning mechanisms, allows network to cope with heterogeneity, lessens delay by not requiring per-flow signaling. Distinguishes between Authorized, Reserved, and Committed resources. Can handle multiple providers with service level agreements at boundaries. Privacy can be maintained by keeping IP addresses and end-to-end signaling exchanges private.
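One plausible reading of the Authorized, Reserved, and Committed distinction is a set of nested limits on a per-call gate: a gate controller authorizes an upper bound, the endpoint reserves within it during call setup, and resources are committed (and become billable) only once the call is answered. The sketch below encodes that reading; the class, method names, and the kbit/s unit are assumptions for illustration and are not taken from the paper.

```python
class Gate:
    """Per-call QoS gate at an edge router (hypothetical model)."""

    def __init__(self, authorized_kbps):
        self.authorized = authorized_kbps  # set by the gate controller
        self.reserved = 0                  # admitted during call setup
        self.committed = 0                 # in use once the call is answered

    def reserve(self, kbps):
        if kbps > self.authorized:
            raise ValueError("reservation exceeds authorization")
        if kbps < self.committed:
            raise ValueError("cannot reserve less than is committed")
        self.reserved = kbps

    def commit(self, kbps):
        if kbps > self.reserved:
            raise ValueError("commit exceeds reservation")
        self.committed = kbps  # usage events for billing would start here


if __name__ == "__main__":
    gate = Gate(authorized_kbps=128)
    gate.reserve(80)   # resources held while the phone rings
    gate.commit(80)    # callee answers
    print(gate.authorized, gate.reserved, gate.committed)
```

The invariant committed <= reserved <= authorized holds after every successful call to either method.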

The TINA Consortium, "Telecommunications Information Networking Architecture," 1997. See specifications.

Cisco Systems, "Customer Profile: Florida International University," white paper.
Also: Florida International University, "Telephony Components," slides from a talk.
• Discusses a project to implement an IP Telephony solution at Florida International University based on Cisco AVVID. Describes the expected benefits of the upgrade and gives maps, network topology, redundancy and network plans. Gives a plan for the transition from the current infrastructure.
J. Adelson, "Beyond Dial Tone: Opportunities for Value in IP Telephony (Brooktrout Technology)," Technical white paper.
• Lists advantages of IP Telephony services over those provided by the PSTN. Lists new products and services IP Telephony makes available: Internet call waiting, messaging, etc.
• These can be categorized along two dimensions: type (applications or dial-tone), user: (carrier or enterprise).
1. Carrier based dial tone: the carrier provides services that are indistinguishable from those provided by the PSTN. The carrier must have large scale so it can have termination points close to majority of call destinations.
2. Enterprise dialtone: the goal is to save toll charges by routing calls over internet. The system must support functions provided by the existing Private Branch eXchange (PBX).
3. Carrier based applications: these include internet call waiting, access to voice mail, and email access through a single web based/telephony interface. Billing and accounting must be supported, and the solutions provided can be smaller scale than those provided for dial-tone subscribers. The system must be flexible to support rapid development of new applications.
4. Enterprise based applications: these include call enabled web pages (the user can push a button on a web page to talk to a sales representative) and teleconferencing. These applications can be used to generate additional revenue for the enterprise. Applications that interface with the PSTN will require gateways.
• Implementation designs can be categorized as: open vs. embedded systems, standards-based vs. proprietary technologies.
1. Open vs. embedded: open systems often use Commercial Off The Shelf (COTS) Personal Computer (PC) chassis running Windows NT or UNIX with computer telephony boards. This allows developers to focus on application software and hardware development. Furthermore, the application developers are not reliant on a particular equipment vendor. However, embedded systems offer cost advantages, very high port counts, and are very reliable.
2. Standards based vs. proprietary: With standards comes interoperability. However, standards may be out-of-date, inefficient, or have unclear specifications. For example, H.323 may be inefficient for certain applications.
ACT Networks, "IP Telephony White Paper," Technical white paper.
• A Unified Access Architecture for IP telephony services is proposed. Several observations are made: (1) networks of the future will carry both voice and data traffic on the same net (2) wireless and wireline must converge into one net (3) no longer necessary to have a single service per network. Business opportunities include: (1) can reduce prices for high quality voice communications coupled with other multimedia services. (2) reduction in costs as enterprises route calls over internet or private intranets (3) IP based call centers, which can also handle video and combined data+voice. (4) cost of connect time can be reduced substantially: use silence suppression for calls on hold, on-hold music can be generated at packet network boundary rather than call center (5) virtual second lines can be provisioned on demand: can ring user's computer while user is connected to the Internet (6) can make advanced services available to any user via the traditional telephone set (7) improve levels of customer service.
Cisco Systems, "Architecture for Voice, Video and Integrated Data," Technical white paper.
• The Cisco AVVID (Architecture for Voice, Video and Integrated Data) is proposed, a QoS enabled IP infrastructure composed of switches/routers, applications such as call control, and clients such as fixed and wireless IP Telephones, H.323 videoconferencing equipment, and PCs. Cisco products to fulfill the manageability, reliability, and availability requirements are discussed.
• Design: IP will become the universal transport of the future. LANs need to support QoS for voice and video. PBX can be eliminated and replaced with IP Telephony over a converged network, but can fall back to the PSTN on failure. Unified messaging is achieved by allowing voice mail messages to be downloaded as WAV files when traveling. Contact center enhances legacy call-center infrastructure with IP voice, TDM voice, web, email, and fax integration. Uses open standards: H.323, SGCP, MGCP, SIP, TAPI, JTAPI, and PBX interoperability.
• Goals: rapid application development (applications can be written for the web, independent of operating system), manageable, highly available, and scalable. The concept of "data tone" is emerging: the data network needs to be as reliable as the voice network. The PSTN is a cost prohibitive system with limited scalability and flexibility, often requiring a forklift to expand capacity. AVVID, however, is a distributed system where network redundancy is achieved with a combination of hardware, software, and intelligent network design practices. AVVID can provide enhanced voice quality and lower total cost of ownership.
• Deployment: discusses how to implement AVVID in a branch office, campus network, and wide area network. Building a hierarchical WAN to allow provisioning to take place at the edge is recommended. WAN QoS enabling techniques include classification (DiffServ) and prioritization (optimized queuing, identifying voice traffic by port). Link-efficiency techniques include compression, silence suppression, and traffic shaping. In general, LANs should be overprovisioned but WANs should run at near max link capacity to achieve higher cost savings.
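The prioritization step above (identify voice traffic by port, then serve it first) can be sketched as a two-queue strict-priority scheduler. This is an illustration only: the packet representation, the class name, and the UDP port range used to recognize voice are assumptions, not settings taken from the white paper.

```python
from collections import deque

# Hypothetical UDP port range treated as voice (assumption).
VOICE_UDP_PORTS = range(16384, 32768)


def is_voice(packet):
    """Classify by transport protocol and destination port."""
    return packet["proto"] == "udp" and packet["dst_port"] in VOICE_UDP_PORTS


class StrictPriorityScheduler:
    """Voice packets always leave before any queued data packet."""

    def __init__(self):
        self.voice = deque()
        self.data = deque()

    def enqueue(self, packet):
        (self.voice if is_voice(packet) else self.data).append(packet)

    def dequeue(self):
        if self.voice:
            return self.voice.popleft()
        if self.data:
            return self.data.popleft()
        return None


if __name__ == "__main__":
    s = StrictPriorityScheduler()
    s.enqueue({"proto": "tcp", "dst_port": 21, "id": "ftp-1"})
    s.enqueue({"proto": "udp", "dst_port": 16500, "id": "rtp-1"})
    print(s.dequeue()["id"], s.dequeue()["id"])
```

Strict priority can starve data traffic if voice load is not limited, which is one reason such schemes are usually paired with admission control or policing.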
M. Korpi, V. Kumar, "Supplementary Services in the H.323 IP Telephony Network," in IEEE Communications Magazine, Vol. 37, no. 7, July 1999, pp. 118-125.
• Describes the H.323 architecture for supplementary services, the differences in deployment of these services between circuit switched and packet switched networks, and interworking of these services across hybrid networks. Also gives an overview of H.323. Lists common supplementary services such as call transfer and call waiting as defined in H.450. Lists requirements for a protocol to handle supplementary services and claims QSIG, a separate protocol from H.323, is a good protocol for this.
• Compares H.323 and ISDN: (1) in H.323 applications run at endpoints, while in ISDN (as in the PSTN) the intelligence resides in the network. (2) ISDN signaling has dependencies on other parts of network, making deployment and upgrading difficult, (3) in H.323 user pays for software and unlimited use, while the user must pay a high access charge for ISDN services to support network upgrades, (4) ISDN reduces complexity of provisioning, (5) service incompatibility issues may be significant in H.323, where clients exchange capabilities and execute only common services. Issues involved in designing a hybrid network may include: (1) designing a terminal adapter with H.323 services implemented within, (2) getting gateways to interact effectively: for example, the gateway must either reject calls for unsupported services or emulate them.
G. Huston, ISP Survival Guide: Strategies for Running a Competitive ISP, Wiley, 1999.

J. Walrand and P. Varaiya, High-Performance Communication Networks, Morgan Kaufmann Publishers, San Francisco, CA, USA, 1996.

M. Cannon, S. Donovan, "A Functional Description of a SIP PSTN Gateway," 1999.

P. Faltstrom and B. Larsson, "Where to terminate a phone call," Internet Draft, Internet Engineering Task Force, November 1998.

Protocols

J. Rosenberg, H. Schulzrinne, "Internet Telephony Gateway Location," IEEE INFOCOM, San Francisco, USA, March-April 1998.

• Proposes Brokered Multicast Advertisements (BMA) to serve as lightweight, scalable mechanism for locating Internet Telephony Gateways (ITGs). This solves the problem of locating appropriate gateway for IP host to call user on PSTN.
• Software algorithms may be used to choose an appropriate gateway based on the advertisements without involving the end user. For example, the client may prefer to call the ITG closest to the final PSTN callee to minimize cost, or closest to the caller to maximize QoS. The set of protocols and billing mechanisms for ITGs can also vary.
• Lists requirements for gateway location protocol: should avoid central registries, converge quickly yet not require large amounts of bandwidth, be independent of the client telephony application, be simple to implement, should be scalable.
• Lists problems with using BGP, DNS, or LDAP to provide such services. BMA puts brokers at different places in the network which use scalable wide area multicast to propagate ITG advertisements.
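The client-side choice among advertised gateways described above can be sketched as a filter followed by a policy-dependent minimum. The advertisement fields, policy names, and gateway names here are hypothetical; the paper's actual advertisement format is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Advertisement:
    gateway: str
    hops_from_caller: int      # rough proxy for QoS (assumption)
    pstn_cost_per_min: float   # rough proxy for closeness to the callee
    protocols: frozenset       # signaling protocols the gateway speaks


def choose_gateway(ads, supported, policy):
    """Pick a gateway without involving the end user.

    policy: "min_cost" prefers the cheapest PSTN leg,
            "max_qos"  prefers the gateway nearest the caller.
    """
    usable = [ad for ad in ads if ad.protocols & supported]
    if not usable:
        return None
    if policy == "min_cost":
        return min(usable, key=lambda ad: ad.pstn_cost_per_min)
    if policy == "max_qos":
        return min(usable, key=lambda ad: ad.hops_from_caller)
    raise ValueError("unknown policy: %r" % policy)


if __name__ == "__main__":
    ads = [
        Advertisement("gw-near-caller", 2, 0.10, frozenset({"SIP"})),
        Advertisement("gw-near-callee", 9, 0.02, frozenset({"SIP", "H.323"})),
    ]
    print(choose_gateway(ads, {"SIP"}, "min_cost").gateway)
```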
M. Baldi and F. Risso, "Comparing the Efficiency of IP and ATM Telephony," in the 2nd International IEEE Conference on ATM (ICATM'99), Colmar, France, June 1999.
• Explores the real-time efficiency (in terms of the volume of voice traffic with deterministically guaranteed quality) related to the amount of network resources used. IP and ATM are considered as packet switching technology for carrying compressed voice and both are compared to circuit switching.
• QoS guarantees require (1) packet scheduling algorithms to control buffer delay variations and (2) call admission control. Efficiency indexes: effective load (data rate at the application level), real load (raw link capacity used by user data), and apparent load (bandwidth reserved for the phone calls).
• Results: (1) IP over ATM can carry a higher number of calls than IP over SONET/SDH for long packetization delays. (2) a packet switched network can carry more calls than circuit switched one (3) there exists an optimal packet size which, by providing minimum apparent bandwidth for a call, minimizes the amount of bandwidth required to achieve a given delay bound for IP telephony (4) the number of hops traversed significantly decreases the volume of calls accepted on an IP network. However, this metric does not affect performance in ATM and circuit switched networks. (5) IP technology is better suited for scenarios where real-time traffic is a small part of the overall network traffic.
S. Wright, R. Onvural, "IP "Telephony" vs. ATM: What is There to Discuss?," in Proceedings of IEEE Conference on ATM (ICATM'98), Colmar, France, June 1998.
• Recommends either using ATM to carry voice or as underlying protocol in IP Telephony. Lists issues that need to be addressed to support telephony services in an integrated service environment over both ATM and IP infrastructures. IP networks are not likely to replace PSTN networks due to insufficient QoS support. Overviews how ATM and IP address issues like interoperability, network resource usage efficiency, signaling, routing, and security. Requirements include: reliability, availability, supplementary services, scalability, end-to-end interoperability, WAN resource usage efficiency, signaling, routing, security, and billing.
• There are two methods to deploy IP Telephony: (1) without QoS guarantees: the network can be over-provisioned, trading lower cost for service quality/scalability. Terminal complexity can be increased and adaptive systems used to compensate for network imperfections. Several protocol headers may have to be appended, adding redundant information. (2) with QoS guarantees: can use IntServ, which requires extra unused bandwidth, or DiffServ, which might not provide sufficient QoS for telephony. Hybrid schemes, where the core network provides suitable QoS and the access network does not, may also be used.
• ATM has several advantages over an all-IP network. First, ATM natively supports voice services and was designed from the start for these scenarios. It can do trunking with AAL2, where several calls share one cell. Also considers IP+ATM networks, where the underlying ATM network can deliver better quality and lower cost. Datagram header overhead can be reduced by compressing the IP header. Further, the integration of the ATM control plane with the PSTN control plane has been planned for some time.
G. Camarillo, "IP Telephony Gateways," Master's thesis, Royal Institute of Technology, 1998.
• Different protocols for establishment/release of connections are analyzed, including SIP and SS7. The compatibility between ISUP (part of SS7) and SIP is investigated.
• Two types of signaling are interexchange and subscriber loop. Signaling may also be categorized as: (1) Channel Associated Signaling, where signals are sent through the speech path as tones or pulses, and (2) Common Channel Signaling, where signals are sent out of band. ISDN uses DSS1 in the local loop and SS7 between exchanges.
• ISUP provides basic (two party call) and supplementary (call forwarding, multiparty call) services. In IP telephony, H.323 and SIP are used to set up a connection between two end systems. SIP is used to establish, modify and terminate multimedia sessions and is based on HTTP. The H.323 architecture consists of zones and gatekeepers, which provide address translation, admission control and bandwidth control.
• H.323 consists of: (1) RAS: for signaling between the end-host and the gatekeeper. It is used to perform registrations and to request bandwidth. (2) H.225: used to establish and terminate connections, (3) H.245: used to transmit control information during call and control logical channels between endpoints.
• Compares SIP vs. H.323. There is overlap, as both are standards for signaling and control for IP Telephony. SIP is claimed to be simpler and more adaptable. It can work with any codec registered with IANA, while H.323 can only work with ITU codecs. It is modular and can be used with H.323. It is more scalable, allows servers to be stateless, and does not require all information to be passed through a central control point. Although SIP does not address the problem of bandwidth control, it can use other protocols like RSVP for this. To interface SIP and ISUP, one needs a signaling module to translate signaling messages, a media gateway controller to control a media gateway, and a media gateway to transform a voice stream from one format to another. Possible architectures include: SIP Termination between a PSTN user and an Internet user, SIP Bridging between two PSTN phones with a call path that traverses the Internet, and SIP Bridging without a media gateway, where signaling goes over the Internet but the media stream goes over the PSTN.
E. Hall, D. Willis, "VoIP in the Enterprise," in Network Computing Magazine, Vol. 9, no. 18, October 1998.
• Talks about issues for implementing VoIP systems for business users. Says most VoIP products deployed aren't standards based, and using H.323 can lead to security problems. H.323 chooses a random port, so the network must listen and respond to any port above 1024. SIP also relies on IP specific technologies, such as DNS, but allows for use of proxy servers which eases firewall implementation concerns. The Simple Gateway Control Protocol (SGCP) off-loads much of the signaling intelligence from the end node to the network, so dumb terminals/handsets may be used.
• Deploying VoIP at a company headquarters involves a large scale VoIP deployment with demanding bandwidth and latency requirements. Different vendors use different codecs for VoIP at low bandwidth, causing interoperability concerns. Increasing the network speed to support VoIP might be more expensive than a PBX based system if there will be fewer than 100 users.
• VoIP for telecommuters can yield big cost savings, but there are several problems: users need enough bandwidth to protect voice quality, circuits need to be up if the user wants to receive a call, and users need a direct connection bypassing firewalls if H.323 is used. In a survey, telecommuter sites were the last locations businesses expected to deploy the service.
• VoIP at the branch office is often the best place to begin, as the overhead is small and little traffic will be generated. However, if there are many users in a remote office, there may be infrastructure problems. G.711 uses a full 64kbps channel, so to fit more calls over the T1 line, it is better to use compression based codecs such as G.723 or G.729.
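The T1 arithmetic behind the last point can be sketched roughly as follows. This is an illustration, not a calculation from the article: the 20 ms packetization interval and the 40-byte uncompressed IPv4/UDP/RTP header are assumptions, and link-layer overhead and silence suppression are ignored.

```python
T1_PAYLOAD_BPS = 24 * 64_000  # 24 DS0 channels of 64 kb/s each
HEADER_BYTES = 40             # assumed: IPv4 (20) + UDP (8) + RTP (12), no header compression


def per_call_bps(codec_bps, packet_ms):
    """One-way bandwidth of a single call, including IP/UDP/RTP headers."""
    payload_bytes = codec_bps * packet_ms / 1000 / 8
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second


def calls_per_t1(codec_bps, packet_ms):
    """Whole number of simultaneous calls that fit in the T1 payload."""
    return int(T1_PAYLOAD_BPS // per_call_bps(codec_bps, packet_ms))


if __name__ == "__main__":
    # Nominal codec rates: G.711 at 64 kb/s, G.729 at 8 kb/s.
    for name, rate in (("G.711", 64_000), ("G.729", 8_000)):
        print(name, calls_per_t1(rate, 20), "calls")
```

Under these assumptions G.711 yields 19 calls and G.729 yields 64, which shows how much of the gain from a low bit rate codec is consumed by per-packet headers.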
Agilent Technologies, "Troubleshooting VoIP Signaling," Technical white paper.
• Lists common problems encountered by those deploying and maintaining VoIP networks, then discusses how to troubleshoot these. Problems include: (1) Interoperability: there are various interpretations of H.323, H.323 has optional or unclear details, services provided by one vendor may not match those of another, codecs may be incompatible, port number allocation may mismatch, security requirements or QoS enforcement requirements may be violated. (2) Performance: poor performance during signaling can disrupt or delay call setup, quality of the call may be reduced, low bit rate codecs can cause packet loss, voice traffic can cause congestion of existing IP data traffic, codec choice can affect this load. Gives a series of tests to do, and gives explanations for what could go wrong at each step.

T. Doumas, "Next Generation Telephony: A Look at Session Initiation Protocol," Technical white paper.

• An overview of SIP. H.323 is claimed to be larger and more complex, to require a steeper learning curve, to have a high cost of implementation and a high connection setup latency, and to have difficulty achieving interoperability in heterogeneous networks. SIP is lightweight and avoids these problems. It is simple and clear, and borrows the format of HTTP, DNS, web-like scripting, and email-style addressing. SIP can operate peer-to-peer, or SIP servers in the network can act as a single access point for locating clients, mapping names to addresses, routing signaling messages between user agents, and redirecting requests.
• A SIP server may operate in two modes: (1) Proxy: acts like an HTTP proxy for signaling while RTP data flows directly. (2) Redirect: the server tells the caller the location of the callee, and the caller then signals and sends data directly. SIP addressing: users can use email-like addresses as the URL for a session.
• Compares SIP and H.323. Common problems uncovered when testing SIP based devices: the IP address configured for the SIP proxy server is incorrect; the IP address for the callee's URL is incorrect; proxies reorder or modify fields in the SIP header, which they must not do; with SIP over UDP, large messages may require more than one datagram, which increases the chance of loss and hurts performance and reliability; the DNS query fails; the sequence number (CSeq) is handled improperly; compact-style headers are not handled. At time of writing, SIP was about 1 year away from being a draft standard.
B. Aboba, "Dynamic attributes for the remote access dialin user service (RADIUS)," Internet Draft, IETF, November 1997, Work in progress.
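Two of the testing pitfalls above, CSeq handling and compact-style headers, can be illustrated with a small parser. This is a minimal sketch rather than a conforming SIP stack: the message, host names, and function names are hypothetical, only a subset of the compact forms is mapped, and header order is preserved because intermediaries are not supposed to reorder fields.

```python
# Subset of SIP compact header forms mapped to their long names.
COMPACT_FORMS = {
    "v": "Via", "f": "From", "t": "To", "i": "Call-ID",
    "m": "Contact", "l": "Content-Length", "c": "Content-Type",
}

# Hypothetical request using compact headers; SIP lines end in CRLF.
EXAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "v: SIP/2.0/UDP pc33.example.org",
    "f: sip:alice@example.org",
    "t: sip:bob@example.com",
    "i: a84b4c76e66710",
    "CSeq: 314159 INVITE",
    "l: 0",
]) + "\r\n\r\n"


def parse_sip_request(text):
    """Split a SIP request into (method, request_uri, headers)."""
    head = text.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, request_uri, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        name = name.strip()
        # Expand compact names; leave long names untouched.
        name = COMPACT_FORMS.get(name.lower(), name)
        # A header such as Via may repeat, so keep a list per name.
        headers.setdefault(name, []).append(value.strip())
    return method, request_uri, headers


def parse_cseq(value):
    """Return (sequence_number, method) from a CSeq header value."""
    number, method = value.split()
    return int(number), method


if __name__ == "__main__":
    method, uri, headers = parse_sip_request(EXAMPLE_INVITE)
    print(method, uri, headers["From"], parse_cseq(headers["CSeq"][0]))
```

The request URI uses the email-style addressing mentioned above, and the CSeq method is expected to match the request method, which is one simple consistency check a test tool could apply.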

C. Agapi, C. Chiu, T. Chong, H. Phillips, and B. Willingham, "Internet Telephony Gateway Location Service Protocol," Internet Draft, Internet Engineering Task Force, November 1998, Work in progress.

P. Calhoun and W. Bulley, "DIAMETER user authentication extension," Internet Draft, Internet Engineering Task Force, March 1999, Work in progress.

M. Day, J. Veizades, C. Perkins, and E. Guttman, "Service Location Protocol, Version 2," IETF, RFC 2608, April 1999.

L. C. Herve, H. Olivier, H. Christian, "Gstn/tiphon interworking using a country code," Technical report, November 1997.

D. Hampton, D. Oran, H. Salama, and D. Shah, "The IP Telephony Border Gateway Protocol Architecture," Internet Draft, Internet Engineering Task Force, February 1999, Work in progress.

J. Rosenberg and H. Schulzrinne, "A Framework for a Gateway Location Protocol," IETF, RFC 2871, June 1999.

J. Rosenberg and H. Schulzrinne, "The IETF Internet Telephony Architecture and Protocols," Technical Report, Columbia University, 1999.

H. Schulzrinne and J. Rosenberg, "The Session Initiation Protocol: Providing Advanced Telephony Services across the Internet," IEEE Communications Magazine, 3(4):144-160, October-December 1998.

Issues of Packet Audio over the Internet

Athina Markopoulou, Fouad Tobagi, Mansour Karam "Assessment of VoIP quality over Internet Backbones," INFOCOM 2002.

• A measurement study that attempts to determine the ability of ISPs to provide good VoIP service. Measurements were conducted over core links for a period of two days. Although many links provided sufficient quality, a large number of paths had high delay and jitter. Many links give good quality on average, but in rare cases experience large delays and loss rates. In particular: (1) 6 out of 7 providers experienced outage periods of 10-220 seconds 1-2 times per day (2) 40 out of 43 paths experienced bursts in loss from 10ms to 33sec (3) the number of out-of-order packets was small. A thorough model of voice quality, based on the E-Model, was developed to derive the MOS from link measurements.
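For orientation, the conversion from an E-Model rating R to an estimated MOS can be sketched as below. The cubic is the commonly cited mapping from ITU-T G.107; whether the authors used exactly this form is an assumption, and the clamp at 1.0 is added here only to guard the small dip below 1.0 that the cubic produces for very small R.

```python
def r_to_mos(r):
    """Map an E-Model rating R (0..100) to an estimated MOS (1.0..4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The cubic dips slightly below 1.0 for very small R; clamp it.
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (50, 60, 70, 80, 90):
        print(r, round(r_to_mos(r), 2))
```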

Ramon Caceres (ShieldIP, Inc.), Nick Duffield (AT&T Labs - Research), Timur Friedman (Univ. of Massachusetts Amherst) "Impromptu Measurement Infrastructures using RTP," INFOCOM 2002.

E. Altman, C. Barakat, V. Ramos, "Queuing analysis of simple FEC schemes for IP Telephony," in IEEE INFOCOM, Anchorage, USA, April 2001.

• The paper studies a simple FEC scheme in which for every packet n, some redundant information is added in some subsequent packet n + phi. They obtain some simple expressions for the audio quality as a function of the amount of redundancy. The analysis shows that this FEC scheme does not scale well with the number of redundant bits and quality will deteriorate for any amount of FEC and any offset phi. Audio tools like Rat [2] and FreePhone [1] use this scheme. The paper shows that this FEC scheme is not a viable solution because as we add more FEC, the audio quality decreases. This happens because adding FEC increases service time, and increases the probability of lost packets. They assume a bottleneck link in the network, so the number of lost packets increases regularly with amount of FEC.
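The recovery rule of this piggyback scheme can be sketched as follows. The sketch only shows which packets become playable for a given loss pattern and offset phi; it deliberately ignores the paper's central point, namely that the added redundancy raises the load and therefore the loss rate itself. The function name and the loss trace are hypothetical.

```python
def playable_with_fec(received, phi):
    """received[n] is True if packet n arrived.

    Packet n + phi carries a redundant copy of packet n, so packet n is
    playable if it arrived itself or if its carrier packet arrived.
    """
    total = len(received)
    playable = []
    for n in range(total):
        carrier = n + phi
        recovered = carrier < total and received[carrier]
        playable.append(received[n] or recovered)
    return playable


if __name__ == "__main__":
    # Hypothetical trace with one isolated loss and one two-packet burst.
    trace = [True, False, True, True, False, False, True, True]
    print(playable_with_fec(trace, 1))  # burst only partly repaired
    print(playable_with_fec(trace, 2))  # larger offset spans the burst
```

A larger offset helps against bursts but forces the receiver to wait phi packet intervals before it can repair a loss, which adds to the playout delay.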
M. Perkins, C. Dvorak, B. Lerich, J. Zebarth, "Speech Transmission Performance Planning in Hybrid IP/SCN networks," in IEEE Communications Magazine, Vol. 37, No. 7, July 1999, pp. 126-131.
• Discusses speech transmission performance requirements and the associated activities taking place in regional and international standards bodies for hybrid Switched Circuit Network (SCN)/IP networks. They review impairments to speech transmission that occur in SCNs and in hybrid IP/SCNs, and show that answers to many common questions about speech transmission planning in hybrid networks are already available in the standards literature.
• The SCN suffers from loss, noise, and echo. Adding digital technology can improve utilization and voice quality, but can increase delay. Hybrid networks introduce added complications: it is necessary to choose a speech codec, packetization scheme, and other speech signal processing techniques.
• Standards for speech transmission performance in hybrid networks are given, including guidance for technical issues in planning hybrid networks and general performance objectives for speech transmission. Discusses the E-Model, a tool for assessing the relative impact that transmission planning decisions have on speech performance. It outputs a scalar factor R that can be used to define speech quality category and user satisfaction based on several terms including echo, noise, delay, and codec distortion.
• Speech quality is affected by: (1) Speech coding: can use wireless codecs for this hybrid network. Performance is characterized by the subjective effect of speech impairments introduced by codec (EIF). (2) Transmission errors and packet loss caused by network congestion are manifested in the form of discarded packets or corrupted payload/speech. G.723.1 and some other codecs include error concealment techniques to minimize these factors. (3) Delay and echo. Echo control is required for all PSTN/IP user combinations. Delay over the hybrid network is greater than that in the SCN. (4) Speech compression, which may cause temporal/syllable clipping and noise contrast.
• A table is given rating popular codecs by their R values.
J. C. Bolot, S. Fosse-Parisis, D. Towsley, "Adaptive FEC-Based Error Control for Internet Telephony," Proceedings IEEE INFOCOM, New York, USA, March 1999.
• An FEC mechanism is developed that (i) optimizes a subjective measure of quality, (ii) incorporates the constraints of rate control and playout delay adjustment schemes, and (iii) adapts to varying loss conditions in the network. The authors implemented the mechanism in FreePhone and acquired results.
• The following questions are considered: (1) given that we transmit K copies of each voice packet and we have a delay constraint of T by which the last packet should be transmitted, how should we space the packets so as to maximize the probability that at least one packet is received? (2) Given that K copies are to be transmitted equally spaced in an interval of length T, what encoding rates should be used for each copy so as to maximize the quality of the transfer subject to a rate constraint?
• They use utility functions for subjective measure of audio quality as input to their adaptive algorithm. The algorithm provides a very good audio quality even over paths with high and varying loss rates.
M. Podolsky, C. Romer, and S. McCanne, "Simulation of FEC-Based Control for Packet Audio on the Internet," in Proceedings IEEE INFOCOM, San Francisco, USA, March-April 1998.
• A study of FEC for packet audio that characterizes the aggregate performance across all audio sources in the network.
• Key results: (1) There is an optimal amount of SFEC above which the audio quality does not improve (and may worsen). (2) There is an optimal division of an audio source's bit-rate between the amount spent on encoding the source and the amount spent on encoding SFEC. (3) SFEC is scalable, in the sense that as more audio sources join the network the technique remains beneficial.
• They give an equation for distortion of the received audio signal in terms of (1) probabilities of successful transmissions, (2) SFEC recoverable losses, (3) unrecoverable losses, (4) the encoding rates for the source and redundancy data, as well as (5) alpha (results are based on an alpha equal to 16).
T. Kostas, M. Borella, I. Sidhu, G. Schuster, J. Grabiec, J. Mahler, "Real-Time Voice Over Packet-Switched Networks," IEEE Network Magazine, Vol. 12, no. 1, pp. 18-27, Jan-Feb 1998.
• Discusses the feasibility and expected QoS of audio applications over IP networks such as the Internet. They examine possible architectures for VoIP and discuss measured Internet delay and loss characteristics.
• The telecom world has envisioned an integrated network using a large-scale ATM backbone that supports many levels of QoS. The IP world believes real time voice and video can multiplex with existing data traffic.
• Implementation issues include: (1) Endpoint requirements. Users must have fast connections and considerable processing power. Table 1 gives codec complexity/bitrate/delays. (2) Echo cancellation. (3) DTMF transmission. These tones may be passed through the network as encoded audio, or can use H.245 to send logical representation of the number. (4) Clock synchronization. A mismatch can cause under/overflow in receiver buffer. (5) Billing. (6) IP address/phone number mapping. In H.323 a Gatekeeper service can be used to solve this problem.
• Factors affecting QoS include: (1) Codecs. (2) Bandwidth. (3) Packet delays and losses. (4) Access delays in the OS and hardware. Furthermore, sound cards add 20-180ms of delay, modems add 20-40ms, and Gateways should add less than 20-40ms.
• There is an inherent trade-off between delay, bandwidth, and computation. They describe a set of delay and loss measurements on the internet. They provide hypothetical mapping of delay and loss to QoS where increasing the buffer size increases delay, decreasing it increases loss, and there is an optimal point to trade off between the two, as shown in Figure 16. Table 2 compares two codecs in the Gateway to Gateway scenario (telephone to gateway to internet to gateway to telephone) and PC-PC cases.
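The buffer-size trade-off in the last point can be illustrated with a fixed playout delay: any packet whose network delay exceeds the playout delay arrives too late and is effectively lost. The delay samples and function names below are hypothetical, not data from the paper.

```python
def late_loss_fraction(delays_ms, playout_delay_ms):
    """Fraction of packets that arrive after their playout deadline."""
    late = sum(1 for d in delays_ms if d > playout_delay_ms)
    return late / len(delays_ms)


def sweep(delays_ms, candidates_ms):
    """Late-loss fraction for each candidate playout delay."""
    return [(p, late_loss_fraction(delays_ms, p)) for p in candidates_ms]


if __name__ == "__main__":
    # Hypothetical one-way network delays in milliseconds.
    delays = [40, 42, 45, 50, 60, 41, 43, 120, 44, 48]
    for playout, loss in sweep(delays, [50, 100, 150]):
        print(f"playout {playout} ms -> late loss {loss:.0%}")
```

Raising the playout delay drives the late-loss fraction toward zero at the cost of added end-to-end delay, which is the trade-off the hypothetical delay/loss-to-QoS mapping captures.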
J. C. Bolot and A. Vega-Garcia, "The Case for FEC-Based Error Control for Packet Audio in the Internet" ACM Multimedia Systems, Seattle, USA, November 1997.
• Describes an open loop mechanism based on FEC. Applications adapt to the best effort service currently provided by the network. They repeatedly measured the audio loss process to find statistics about how many packets were dropped in a sequence.
• First they consider Automatic Repeat Request (ARQ), which retransmits packets not received at destination. They then show FEC can do better than ARQ because it provides reliability without increasing latency.
• A simple FEC scheme sends the XOR of n packets as a parity packet, which allows any single lost packet in the n-packet group to be reconstructed. Another scheme, used in the MICE project, maintains a low bit rate encoding of the previous packet.
• They associate a cost and a reward to evaluate the schemes (the cost is bandwidth used, the reward is the loss rate after reconstruction). They try making various combinations of PCM and LPC, and compare the reward per cost ratios of the different schemes. The more overhead for FEC they add the more reward but the higher the cost.
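The XOR parity scheme mentioned above can be sketched in a few lines. This assumes equal-length packets and exactly one loss per group; the function names are illustrative and not taken from the paper.

```python
def xor_parity(packets):
    """Byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the single missing packet (marked None) in a group."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet per group")
    survivors = [p for p in received if p is not None]
    # XOR of the survivors and the parity packet equals the lost packet.
    return xor_parity(survivors + [parity])


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_parity(group)
    print(recover_missing([b"abcd", None, b"ijkl"], parity))
```

The cost side of the trade-off is visible here: one extra packet per group of n, and the receiver must wait for the rest of the group before it can repair a loss.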

J. C. Bolot and A. Vega-Garcia, "Control Mechanisms for Packet Audio in the Internet," in Proceedings IEEE INFOCOM, San Francisco, USA, March 1996.

• Describes a set of control mechanisms to adapt the audio coding and decoding processes based on the characteristics of the channels, the goal being to maximize the quality of the audio delivered to the destinations. A jitter control mechanism and a combined error and rate control mechanism are considered. They use open loop error control mechanisms based on forward error correction for loss recovery. A loss recovery scheme is necessary if the number of lost audio packets is higher than that tolerated by the listener at the destination. FEC is better than ARQ, but its efficiency depends on the characteristics of the packet loss process. They add a low quality version of each packet to another packet.
• They adjust at the source both the send rate and the amount of redundant information to minimize the perceived loss at the destinations. Specifically, they: (1) Adjust the rate at which packets are sent into the network. Several audio codecs are used, as they couldn't get a codec that supports variable transmission rates. (2) Adjust the amount of redundancy information added in these packets. This is done to provide different levels of error correction. (3) Elicit feedback information about the loss rates measured at the destinations. A feedback mechanism based on packet loss rates is used. (4) Define a control mechanism which takes this feedback information to adjust the redundant information and send rate at the source.
J. C. Bolot and H. Crepin, "Analysis and Control of Audio Packet Loss over Packet-Switched Networks," in Proceedings of NOSSDAV, Durham, USA, April 1995.
• This paper shows using measurements over the Internet as well as analytic modeling that most audio losses are isolated when the network load is low and moderate. This suggests that open loop error control mechanisms based on FEC would be adequate to reconstruct most lost audio packets. They also characterize the packet loss process of audio streams sent over the Internet. Packet losses over time in unicast and multicast connections are measured. They develop a model to show that the impact of the Internet traffic on a Constant Bit Rate (CBR) stream of packets can be approximated by that of a batch of Bernoulli traffic. A chart of the average number of consecutive lost packets by Internet load is given. This number is very close to 1 up to a load of 0.8.
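The statistic plotted in that chart, the average number of consecutive lost packets, can be computed from a receive trace as sketched below. The trace and function names are hypothetical; a mean close to 1 indicates that losses are mostly isolated.

```python
def loss_run_lengths(received):
    """Lengths of each run of consecutive losses in a receive trace."""
    runs, current = [], 0
    for ok in received:
        if ok:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def mean_loss_run(received):
    """Average number of consecutive lost packets (0.0 if nothing was lost)."""
    runs = loss_run_lengths(received)
    return sum(runs) / len(runs) if runs else 0.0


if __name__ == "__main__":
    # 1 = received, 0 = lost (hypothetical trace).
    trace = [1, 0, 1, 1, 0, 0, 1, 0, 1]
    print(loss_run_lengths(trace), mean_loss_run(trace))
```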
A. Percy, "Understanding Latency in IP Telephony," Technical white paper.
• Discusses the effect of latency on human conversations, analyzing the system components that incur the latency, and methods of managing the latency or maintaining sufficient QoS. Latency in the PSTN is virtually always under 150ms.
• Effects of latency include small utterances getting delayed, which results in confusion. Latency under 200ms is acceptable; latency above 450ms is unacceptable.
• Latency arises in the gateway: (1) in the network interface where audio is framed for transport to the DSP (about 1 ms). (2) in the DSP, from speech compression and echo cancellation algorithms, (3) framing, as the DSP needs to wait for a complete frame to process. Larger frames increase both latency and efficiency. (4) processing time to run the DSP algorithms on the frame, (5) packet handling latency between the DSP and the WAN, (6) buffering before passing to the network software, (7) packetization in assembling coded voice into packets. The codec can pack more than one frame of data into a single packet to increase efficiency, at the cost of increased latency. The coder can also let voice frames from other channels "piggyback" on the current packet to increase efficiency. (8) jitter buffer, a buffer to compensate for jitter. Occurrences of data starvation are reduced, but latency increases with buffer size.
• Latency arises in the network: (1) media access: low speed connections add more latency, (2) routing latency: best-effort doesn't give priority for voice. RSVP gives good service for GW-GW calls but is not supported in the public network. (3) firewalls and proxy servers: packet filters give low latency, but stand alone firewalls and proxy servers incur higher latencies.
• How to manage latency: know the sources of latency, use routing equipment that can handle prioritization, ensure the network has sufficient bandwidth, route calls away from media you don't control (such as the public Internet), purchase network services with a good SLA, reduce packet overhead. It is noted that gateway functionality can be integrated into telephones or router equipment to decrease cost.
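Knowing the sources of latency amounts to keeping a budget. The sketch below sums per-component delays and classifies the total against the 200ms and 450ms thresholds quoted above. The component values are hypothetical placeholders, and the "marginal" label for the range in between is an assumption of this sketch, not a term from the paper.

```python
def one_way_latency(budget_ms):
    """Total one-way latency from a component-by-component budget."""
    return sum(budget_ms.values())


def classify_latency(total_ms):
    """Classify using the thresholds quoted above (200ms and 450ms)."""
    if total_ms < 200:
        return "acceptable"
    if total_ms > 450:
        return "unacceptable"
    return "marginal"  # assumed label for the range in between


if __name__ == "__main__":
    # Hypothetical per-component delays in milliseconds.
    budget = {
        "network interface": 1,
        "codec framing": 30,
        "DSP processing": 10,
        "packetization": 30,
        "jitter buffer": 60,
        "network transit": 50,
    }
    total = one_way_latency(budget)
    print(total, "ms:", classify_latency(total))
```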
S. Pracht, D. Hardman, "Voice Quality in Converging Telephony and IP Networks," Technical white paper.
• Discusses voice quality influencing factors and network impairments and their causes in a converged telephony and IP network, all from the perspective of the quality of the analog voice signal. Users have become accustomed to the PSTN and so Internet needs to be enhanced with mechanisms to ensure the QoS required to carry voice.
• Transmission conditions that threaten real time packetized voice data include: real-time bandwidth changes (as voice calls need constant and direct available bandwidth, and non-linear compression reduces voice quality), gateway processes such as echo cancellation, packet loss, delay (which can exacerbate echo), non-linear codecs (preserve perceptually important information but not waveform making traditional measurements less useful).
• Voice quality is made up of: (1) service quality (offered services) (2) sound quality (availability, reliability, delay, silence) (3) conversation quality (price, noise, fading, crosstalk, etc.). Voice quality is defined as sound and conversation quality. Primary factors include clarity, end to end delay, and echo.
• Observations: perception of any one aspect affects voice quality. Clarity and delay are orthogonal. Echo is dependent on delay and echo affects clarity. Clarity is affected by: loudspeaker and microphone, digital voice transmission on the PSTN, gateway components (such as the codec), IP network (which may drop packets), and the H.323 terminal. Echo exists in PSTN but is often not noticed because of small delays. Echo is caused by electrical mismatch between analog telephony devices and tail circuit. Acoustic echo is usually less noticeable. Echo cancellers are used to improve quality.
• Testing voice quality:
1. Signal to Noise Ratio (SNR) and Total Harmonic Distortion (THD) are linear methods that assume changes to waveform represent distortion. However, many codecs change the waveform but provide excellent quality.
2. A better technique might be to have a large number of human listeners evaluate the quality, as in the ITU-T Mean Opinion Score (MOS). However, this scheme is expensive and slow. Hence, more advanced algorithms were developed: Perceptual Speech Quality Measurement (PSQM) takes a clean voice signal and compares it to the transmitted version, using a weighting method that takes into account the physiology of the human ear and cognitive factors about what humans are likely to notice. PSQM+ is an enhanced version of PSQM. Perceptual Analysis Measurement System (PAMS) uses a different signal processing model than PSQM.
3. Delay is measured with: Acoustic Ping, which measures the audio spike travel time, or a more accurate method called MSL Normalized Cross Correlation.
4. Echo is measured with: Echo Return Loss (ERL), the amount by which echo is attenuated before it reaches the user's ear, and Perceived Annoyance Caused by Echo (PACE), an ITU-T standard measurement. Echo cancellers are compared by convergence time, cancellation depth, and double talk robustness.
5. Voice Activity Detectors (VADs) are measured by: Front End Clipping (FEC), Hold Over Time (HOT), Comfort Noise Generation (CNG).
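The cross-correlation approach to delay measurement in item 3 can be illustrated with a plain normalized cross-correlation over candidate lags. This is a generic sketch, not the MSL method named in the paper; the signals and function name are hypothetical.

```python
import math


def estimate_delay(reference, received, max_lag):
    """Return the lag (in samples) that maximizes normalized cross-correlation."""
    best_lag, best_score = 0, -1.0
    ref_energy = sum(a * a for a in reference)
    for lag in range(max_lag + 1):
        segment = received[lag:lag + len(reference)]
        if len(segment) < len(reference):
            break  # not enough received samples left at this lag
        dot = sum(a * b for a, b in zip(reference, segment))
        norm = math.sqrt(ref_energy * sum(b * b for b in segment))
        score = dot / norm if norm else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    reference = [0, 1, 3, -2, 4, -1, 0, 2]
    # Received copy: delayed by 5 samples and attenuated by half.
    received = [0] * 5 + [0.5 * x for x in reference] + [0] * 3
    lag = estimate_delay(reference, received, max_lag=8)
    print(lag, "samples")  # divide by the sample rate to get seconds
```

Normalizing by signal energy makes the estimate insensitive to attenuation along the path, which a simple spike-timing measurement such as Acoustic Ping does not account for.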
J. Anderson, "Methods for Measuring Perceptual Speech Quality," Technical white paper.
• Reviews traditional techniques for measuring clarity, their shortcomings, and new techniques that have been developed in recent years to measure clarity from the perspective of end users. These methods include: PSQM, Measuring Normalizing Blocks (MNB), PSQM+, PAMS, and PESQ.
• Speech clarity was improved in the PSTN via handset filtering and PCM encoding. Convergent networks have different problems so traditional metrics such as SNR, Bit Error Rate (BER), and THD are no longer adequate for measuring voice quality. These networks are non-LTI (the shape of the output waveform is dependent on delay), and traditional metrics do not adequately predict a person's perception of speech quality.
• Several methods are compared based upon what they measure.
1. MOS is based on opinion scales, and standard techniques to calculate it are subjective and inefficient.
2. PSQM compares output with input in frequency domain based on frequency and loudness sensitivities. It assumes the in and out signals are synchronized, and assumes no packet loss or clipping because level scaling is performed. The processing algorithm uses perceptual and cognitive modeling.
3. MNB is an alternative to PSQM. It measures transmission channel errors, codecs with bit rates less than 4kb/s, and vocoders' impact on speech clarity. Measures distance between perceptually transformed representations of input and output signals.
4. PSQM+: improves PSQM by handling distortion caused by packet loss and time clipping.
5. PAMS: uses a different perceptual model than PSQM, gives listening quality and listening effort scores.
6. PESQ: uses the best features of PAMS (robust time alignment techniques) and PSQM99 (accurate perceptual modeling). New methods are added including transfer function equalization and calculating distortion over time.
• These techniques are then compared. The performance of each scheme varies with the scenario. However, in general PSQM+ performs better than PSQM, which in turn performs better than MNB. PSQM+ works best with background noise, while PAMS works better when there is no background noise.
I. Sidhu, G. M. Schuster, J. Grabiec, T. J. Kostas, M. S. Borella, and J. Mahler, "Real-time Voice over Packet Switched Networks," IEEE Network Magazine, Vol. 12, No. 1, pp. 18-27, May/June 1998.

V. Paxson, S. Floyd, "Wide-area traffic: The failure of Poisson modeling" IEEE/ACM Transactions on Networking, Vol.3, no. 3, June 1995.

W. Leland, M. Taqqu, W. Willinger, and D. Wilson, "On the self similar nature of Ethernet traffic (extended version)," IEEE/ACM Transactions on Networking, Vol.2, No.1, Feb. 1994.

W. Willinger, M. Taqqu, R. Sherman, D. Wilson, "Self similarity through high variability: Statistical analysis of Ethernet LAN traffic at the source level," in IEEE/ACM Transactions on Networking, Vol.5, no.1, Feb. 1997.

D. Kuhn, "Sources of Failure in the Public Switched Telephone Network," in IEEE Computer, Vol. 30, no. 4, April 1997.

End System Design

G. Herlein, "The Linux Telephony Kernel API," in Linux Journal, no. 82, February 2001.

• Describes integration of the telephony device driver into the Linux kernel. Explains why they wanted to add a new API to support telephony cards: (1) support for sound cards isn't good enough for telephony, as they can't interface with the PSTN (2) applications need to access the compression codecs built into the card, otherwise you have to pay money to license them (3) sound cards can't support telephony applications, and can't interface with phones or support codecs. The data channel uses read() and write() calls, device control uses ioctl(), and select() and signal() are used for event handling. This API can support applications built to run with sound cards. Supports asynchronous event notification to avoid polling the card.
Microsoft Corp., "IP Telephony with TAPI 3.0," Technical white paper.
• TAPI is an API for Windows that enables IP Telephony by providing methods for making connections between two or more computers and accessing any media streams involved in the connection. It supports H.323, IP multicast, and integration with Active Directory. Uses SDP conference descriptors for resolving conference names to IP multicast directories. Has a security model: it associates each SDP conference descriptor with an Access Control List (ACL) that specifies who can create, delete, or view conferences and their announcements, and it prevents conference eavesdropping. QoS provisioning is done by (1) negotiating bandwidth capabilities with the network using RSVP, (2) packet scheduling (token bucket parameter and priority), (3) setting the delay, throughput, and reliability preferences in the IP TOS field. A generic enterprise layout is described, along with how the H.323 Telephony Service Provider (TSP) can use the Active Directory service to do name-to-IP translation. TAPI supports IP Multicast conferencing, traditional telephony, and DirectShow streaming. Use with NetMeeting, which supports T.120 conferencing and application sharing, is briefly discussed. Both TAPI and NetMeeting support IP Telephony and H.323 voice and video.
A. Tolstoi, "IP-telephony problems," Web site.
• Considers technical issues concerned with IP telephony over the wide area. A comparison between links of different speeds is given, showing (1) the per-packet latency probability distributions, and (2) the correlation of packet losses over time. Provides a high level overview of some issues involved in gateway architecture design. A gateway can use a PC with a sound card or a PC with a DSP card. Efficient architectures must coordinate the codec algorithm and hardware.

Implementation/Deployment

Network Computing Magazine, "Voice Over IP, The Way It Should Be," January 1999.

• A web page with recent news articles on IP Telephony. Gives an overview of different products and different companies' solutions (Cisco, Lucent, 3Com). There are surveys on the biggest concerns in moving to a converged network (reliability), top telecom priorities (lower costs), how the typical telecom network is structured (72% separate voice and data networks, 19% combined voice and data, 9% combined voice, data, and video), and which telecom apps are most critical to business strategy. Has reviews of gateways and voice/fax over data solutions. The page has several audio files recorded with different audio codecs over different loss rates. Some of the codecs appear to be more resilient to loss than others, and the performance of each codec is scored. Security issues: IP Telephony traffic is often unencrypted, and hence one can listen in on a call with a LAN analyzer. A sample audio file recorded from such an experiment is provided.

Cisco, "IP Telephony Solution Guide: Planning the IP Telephony Network," Technical white paper.

• A step-by-step guide to build an IP Telephony network. Cisco solutions are discussed throughout.
• Considerations for deploying the data network: (1) LAN/Campus environment: collect information about the topology, average/peak bandwidth, LAN QoS functionality, where servers and gateways will be located (2) WAN environment: decide on a topology (build using a hub and spoke model or multimeshed site model), investigate impact of WAN outage, available bandwidth and scalability on the existing network, QoS requirements for current network usage.
• Considerations for the telecom infrastructure: type and size of voice mail systems/PBXs/number of phones/fax requirements, how to route redundant/back-up paths, how to design/improve current cabling/power infrastructure.
• Availability planning should be done in order to meet SLA requirements. The guide defines service classes for the network core based on availability requirements (reliable networks - 99.5%, high availability networks - 99.99%, non-stop networks - 99.999%). Metrics include MTBF (mean time between failures) and MTTR (mean time to repair). Factors that contribute to network design availability include: modular design, QoS for low delay and jitter, capacity management, and user error (which contributes to 40% of availability issues).
• IP telephony is a major change for most organizations, so network planning is needed: (1) baseline existing network utilization, (2) determine VoIP traffic overhead, (3) determine minimum bandwidth requirements, (4) determine required changes, (5) validate baseline performance, (6) determine trunking capacity, (7) design network management in order to handle Faults, Configurations, Accounting, Performance, Security (FCAPS).
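To make the availability classes and the MTBF/MTTR metrics listed above concrete, the following Python sketch converts an availability target into allowed downtime per year and computes availability from MTBF and MTTR. It uses the standard steady-state formula MTBF / (MTBF + MTTR), which is an assumption here rather than something quoted from the Cisco guide; the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    """Minutes of downtime per year permitted by an availability target (0..1)."""
    return (1.0 - avail) * MINUTES_PER_YEAR


# The three service classes named in the guide.
SERVICE_CLASSES = {
    "reliable": 0.995,
    "high availability": 0.9999,
    "non-stop": 0.99999,
}
```

Under these assumptions, the 99.5% class allows roughly 44 hours of downtime per year, the 99.99% class under an hour, and the 99.999% class a little over five minutes.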
Cisco, "Cisco AVVID QoS Design Guide," Technical white paper.
• Overviews the Architecture for Voice, Video, and Integrated Data (AVVID). Issues in connecting IP phones to an existing data network are discussed, including packet classification and queuing for QoS support. Lists problems involved in designing a campus/branch office/WAN for good QoS. For each problem, gives a Cisco product that can solve the problem and tells a bit about how to set it up.
Cisco, "Cisco IP Telephony Network Design Guide," Technical white paper.
• Overviews the AVVID deployment model. Issues in deploying an IP Telephony network over an existing LAN, campus, and WAN data infrastructure and migrating voice traffic onto the data network are discussed.
Cable Datacom News, "Cable IP Telephony Primer," Technical white paper.
• Discusses issues in IP Telephony specifically related to cable networks.
• The technology for IP telephony on Hybrid Fiber-Coaxial (HFC) networks has been around for years, but significant operational and economic reasons have prevented MSOs (multiple service operators) from deploying it. With the advent of cable modems, there has been demand to use these high speed data networks to carry voice so that MSOs don't have to buy separate HFC telephony equipment. This will allow them to offer unique value added features (real-time provisioning of additional phone lines, integrated voice mail/email messaging).
• There are several problems that must be solved: (1) the Data over Cable Service Interface Specification (DOCSIS) must be enhanced to support the QoS requirements of IP Telephony, (2) operations support systems must be developed to support management and billing, (3) MSOs must develop interconnection standards for IP networks to share packet telephony traffic to avoid PSTN tolls. The PacketCable architecture, which was proposed to address these issues, specifies MGCP for call set up and management.
• DOCSIS: developed for cable modem equipment, doesn't provide QoS and latency controls to give toll quality IP voice services, these will be added into DOCSIS 1.1 in 2000 (update: DOCSIS 1.1 now provides QoS provisioning by using service flows to prioritize packets at the upstream router).
• IP Telephony support is expected to increase the cost of a cable modem by 20-30%, and adding a battery pack costs an additional $40-50. Provisioning and managing the devices once they are installed is difficult, so in the short term operators would deploy only a local-loop/ILEC bypass service, with the call then going onto the PSTN for long distance. Long term: make interconnection agreements with other MSOs to form a wide area IP Telephony network.
• Other services vendors plan to provide include integrated telecommuting and an interactive set top that could offer directory services, caller ID, and other features on the TV. Furthermore, broadband packet networks are expected to operate with vastly different economic assumptions than the PSTN, which will result in different pricing and packaging models.
G. Cook Jr., "Taking the hybrid road to IP telephony," in Communications Engineering and Design Magazine, December 2000.
• Describes a "hybrid" incremental approach to IP Telephony deployment in a cable network: run converged IP services from the end customer's premises to the headend and then to a traditional telephone switching center as a local-loop bypass, rather than going straight to an all-IP network.
• Cable operators are not currently using this technique, as many are also functioning as Competitive Local Exchange Carriers (CLECs) with telephone switches and operations support systems at central offices and digital carrier concentrators in their cable TV headends. The opportunity cost to replace the investment in the CLEC side of the business is too great. The hybrid solution proposed allows cable operators with circuit switched telephony equipment to begin offering converged IP services now without having to get rid of all their circuit switched equipment.
• A system architecture is shown which requires a Network Call Signaling Gateway (NCSG). System design is discussed and a comparison with the (1) totally IP based and (2) current CLEC based architectures is given.
• Several advantages are claimed: (1) it is vital that the edge of the network connecting to the end customer be made future-proof through a cost-effective systems architecture, (2) it is better for larger cable operators, (3) end users are insulated from obsolescence, (4) the alternatives deliver relatively equal subscriber value, but after taking into account operating and depreciation expenses the economics favor the hybrid approach, (5) it can give value added services.
Florida International University, "FIU: Architecture for Voice Video & Integrated Data," Web site.
• A proposal for moving FIU's voice traffic to a VoIP system using AVVID. Their current system was old and expensive to maintain because of the coordination efforts required to perform moves, adds, and system changes. Key objectives include new added services and a reduction in operational costs. The system was expected to operate for a minimum of 7 years. The proposal was listed as high risk, but the solution was chosen over the alternatives because (1) it was expected to have the lowest cost, and (2) it fit with the university's objectives. The scope of the project involves upgrading to a new switching/routing layer, increasing bandwidth, and upgrading physical media on the data network; over 5,000 voice sets must be replaced. Proposed changes to the layer 2 network (broadcast domains/VLAN design) and the layer 3 network (IP/Appletalk routing, subnet size) are discussed. 3,000 end users must be trained. Total cost is estimated at $6,300,000.
C. Bajorek, "ABC's of Transitioning to VoIP," in Computer Telephony Magazine, January 2001.
Advice is given to a company in the following situation: (1) it wants to design a VoIP network between six corporate sites, (2) it plans to buy the VoIP gateway option for its PBXs, (3) its overseas branches have a different PBX vendor, (4) it is concerned about speech quality issues, (5) it is unsure how to adequately plan the network, and (6) it has been warned about multi-hop calls. The author notes that many companies are in the process of converting to VoIP and that most VoIP products are now relatively mature, with well debugged cores, thereby giving better speech quality. Gives a list of issues to address before buying equipment:
• Analyze busy-hour loads at each site to determine how many ports of call handling equipment you will need. Keep in mind that the VoIP equipment will most likely be carrying long distance calls (unlike the cable network discussed above) when planning for bandwidth.
• Verify site VoIP interoperability: different PBX vendors might recommend different gateway manufacturers, hence it is best to choose a single vendor that can handle all sites. Even if all the vendors support H.323, interoperability is not guaranteed due to bugs or misinterpretations of the protocol specifications. Hence an extra verification step is required.
• Investigate voice compression issues. The codec choice is important for voice quality: greater compression means potentially lower quality, but less bandwidth consumption (which in turn can increase quality if there is congestion). A chart of MOS score with the delay and bandwidth required is shown.
• Manage delay effectively, as too much delay exacerbates echo problems. ITU recommendations for delay (> 400ms is poor, < 150ms is acceptable, in between is annoying) are given. End-to-end delay is composed of compression, serialization, and packetization delays (fewer frames per IP packet decreases packetization delay but increases bandwidth consumption). The network designer should add up all expected values for these delays and see if they total less than 150ms.
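The trade-off between packetization delay and bandwidth, and the delay budget check recommended above, can be sketched in Python as follows. This is an illustration under stated assumptions, not the article's own worksheet: the 40-byte header figure is the usual IPv4 + UDP + RTP total without link-layer overhead, the G.729 frame parameters (10 ms, 10 bytes) are supplied as an example, and all function names are hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 40  # 20 (IPv4) + 8 (UDP) + 12 (RTP); link layer not counted


def call_bandwidth_bps(frame_ms, frame_bytes, frames_per_packet):
    """One-way IP bandwidth of a call for a given codec frame size and packing."""
    packets_per_second = 1000.0 / (frame_ms * frames_per_packet)
    packet_bytes = frame_bytes * frames_per_packet + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 * packets_per_second


def packetization_delay_ms(frame_ms, frames_per_packet):
    """Time spent waiting to fill one packet with codec frames."""
    return frame_ms * frames_per_packet


def total_delay_ms(components):
    """Sum the expected one-way delay components (a dict of name -> milliseconds)."""
    return sum(components.values())


def classify_delay(total_ms):
    """Apply the ITU bands quoted in the article; boundary handling is an assumption."""
    if total_ms < 150:
        return "acceptable"
    if total_ms > 400:
        return "poor"
    return "annoying"
```

With G.729-like frames, packing two frames per packet gives 20 ms of packetization delay at 24 kbps, while one frame per packet halves the delay to 10 ms but raises the bandwidth to 40 kbps, which matches the direction of the trade-off described in the article.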
W. Matthews, L. Cottrell, R. Nitzan, "1-800-CALL-HEP," slides presented at Computing in High Energy and Nuclear Physics 2000 (CHEP'00), February 2000. See also the full text.
• Describes a VoIP test bed connecting several labs. Loss, jitter, and delay characteristics of voice traffic between LBNL and SLAC are reviewed and the effect of low, moderate, and high congestion on the link is quantified. The testbed connects LBNL, SLAC, ANL, and Sandia Labs. It is composed of dedicated ATM links from the ESnet backbone with 3.5 Mbps allocated per link. Each site has a telephone connected to a PBX, which in turn is connected to a router. Policing is done at the edge router and a Committed Access Rate (CAR) is applied to the VoIP packets. Weighted Fair Queuing (WFQ) is implemented at the router by giving priority to the VoIP traffic based on the CAR setting.
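Rate policing of the kind CAR performs at the edge router is typically built on a token bucket. The following Python sketch shows the basic conformance check; it is a generic illustration, not the router configuration used in the testbed, and the class name and parameters are hypothetical.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: packets conform while tokens are available."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = float(burst_bytes)
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time = 0.0

    def conforms(self, now, packet_bytes):
        """Return True if a packet arriving at time `now` (seconds) is within the rate."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # out of profile: a policer would drop or re-mark the packet
```

A policer differs from a shaper in that non-conforming packets are dropped or re-marked immediately rather than queued.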
L. Cottrell, "Internet VoIP Performance Measurements," Presented at the ESCC Meeting, San Diego, USA, April 1999.
• Slides from a talk about a VoIP testbed. Several charts show the performance of VoIP streams over the network. The data was acquired over a period of several years.
• Key results: (1) 100-byte pings show that RTT is steadily improving over time. BCR [jan99, feb98] found that with 10% loss users can get toll quality, and the ITU says <3% loss is good. The authors show results with better than 2.5% loss, with loss rates decreasing over the years (although the codec these claims refer to is not discussed here). (2) The distribution of RTT obeys a power law. (3) The network imposes jitter, the amount of which increases with load. (4) WFQ doesn't appear to help performance, and neither does setting the IP TOS field.
• Summary: some parts of internet are good enough to provide good voice quality and this potential is growing rapidly. However, reliability is still far from that provided by the PSTN.
K. Thompson, G. Miller, R. Wilder, "Wide-Area Internet Traffic Patterns and Characteristics," IEEE Network Magazine, Vol. 11, no. 6, pp. 10-23, November/December 1997.
• Presents observations on patterns and characteristics of wide area internet traffic, and reports measurements from two OC-3 trunks in MCI's commercial internet backbone over two time intervals (24 hours and 7 days) in the presence of up to 240,000 flows. Characteristics of the traffic are shown in terms of packet sizes, flow duration, volume, and percentage composition by protocol and application. Usage patterns seen over the two time scales are shown.
• Composition statistics: 21% of traffic is WWW, 14% FTP, 8% NNTP. UDP traffic makes up less than 5% of the total bytes sent, and is between 5% and 15% of all packets. RealPlayer traffic composes a negligible percentage of the flows, and comprises 0.5% to 2.5% of the packets and bytes sent. Each RealPlayer flow transfers 20 kilobytes on average.
• Key results: (1) the average packet size varies over the course of the day, (2) on the north-south domestic link, the bit rate in the direction carrying the higher load varies inversely with the bit rate in the other direction. Further, the average packet size in each direction also varies inversely. (3) RealPlayer traffic: more activity during business hours.
N. F. Maxemchuk, S. Lo, "Measurement and Interpretation of Voice Traffic on the Internet," in Proceedings of ICC, 1997.
• Describes a set of measurements on intrastate, cross country, and international internet connections. These are used to determine relationships between the (1) delay inserted to combat jitter, (2) strategy used to restore lost packets, and (3) "quality" of the voice connection. With VoIP, coders can evolve more quickly, because only the endpoints must agree to change a coder, rather than changing the whole network. Defines channel quality as the Minimum Loss Free Interval (MLFI): the fraction of the time that the signal is received without distortion for intervals of time that are "long enough" to convey useful speech segments. Interstate connections are usually very good; international connections are bimodal - either good or very bad. A packet that traverses a longer distance is more likely to encounter an overloaded router with a large queuing delay. Cross country calls can be improved significantly by changing the buffer size and the number of packets restored.
• Key observations: (1) Most people currently using the internet for voice are using it backwards: it's best suited for carrying voice short distances (as proposed for local-loop bypass). The argument for local bypass includes saving money and transmitting fewer bits. (2) The internet does not provide acceptable quality by telephone standards, but people are still willing to make calls because the service is almost free.
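One way to read the channel quality definition above is as the fraction of a call covered by loss-free runs that last at least some minimum duration. The Python sketch below computes that fraction from a per-packet loss trace. This is an interpretation for illustration only; the paper's exact procedure is not reproduced here, and the function name, the trace format, and the threshold are assumptions.

```python
def loss_free_fraction(received, packet_ms, min_interval_ms):
    """Fraction of time spent in loss-free runs of at least min_interval_ms.

    received: sequence of booleans, one per packet slot (True = arrived in time).
    packet_ms: duration of audio carried by one packet.
    """
    if not received:
        return 0.0
    usable_slots = 0
    run = 0
    # A trailing False acts as a sentinel that closes the final run.
    for arrived in list(received) + [False]:
        if arrived:
            run += 1
        else:
            if run * packet_ms >= min_interval_ms:
                usable_slots += run
            run = 0
    return usable_slots / len(received)
```

Raising the playout delay or restoring more lost packets turns some `False` slots into `True`, which lengthens the runs and raises this fraction; that is the relationship the measurements explore.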
J. C. Bolot, "Characterizing End-to-End Packet Delay and Loss in the Internet," in Proceedings of SIGCOMM, San Francisco, 1993.
• Performed active measurements between end hosts by measuring the RTTs for UDP packets sent. Key observations: (1) Probe packets tended to cluster together in time (probe compression). This has a significant effect on streaming media applications, as the codec works best when it receives packets at regular intervals; probe compression can affect voice quality by increasing jitter. (2) Losses are random when probe traffic uses a small portion of the available bandwidth (hence it is difficult to estimate voice quality from ping measurements).

Pricing

J. Altmann (Hewlett-Packard Labs), Huw Oliver (Hewlett-Packard Labs), Hans Daanen (Hewlett-Packard Labs), Alfonso Sanchez-Beato Suarez (Hewlett-Packard Labs), "How to market-manage a QoS network," INFOCOM 2002.

Tamer Basar (University of Illinois at Urbana-Champaign), R. Srikant (University of Illinois at Urbana-Champaign), "Revenue-maximizing pricing and capacity expansion in a many-users regime," INFOCOM 2002.

Bob Briscoe, Mike Rizzo, Jerome Tassel, Kostas Damianakis and Nicolai Guba (BT Laboratories) "Lightweight Policing and Charging for Packet Networks," OpenArch 2000.

J. Altmann, K. Chu, "A Proposal for a Flexible Service Plan that is Attractive to Users and Internet Service Providers," in IEEE INFOCOM, Anchorage, USA, April 2001.

• Gives a way for ISPs to provide different QoS levels without radically changing their current pricing model. Combines flat-rate and usage-based pricing: users receive a basic service, but are given the choice of higher quality whenever they demand it. Strictly flat pricing is not desirable, as light users subsidize heavy users. Based on results from the INDEX project, they show that subjects take advantage of on-demand access to higher bandwidths when such access is available. They try a few different pricing options (users can buy out a week at a certain price, charge by minute, charge by byte). Users use 10 times more bandwidth when they buy out a week, 3 times more when they have unlimited access. However, subjects value flat rate pricing, because they're willing to pay more to buy out bandwidth for a week. Subjects liked to have a flat rate option but didn't care too much about what bandwidth that flat rate provided, as they could always pay more to get more bandwidth. Users also like to have a more flexible non-bought-out version to choose from.
X. Wang, H. Schulzrinne, "Pricing Network Resources for Adaptive Applications in a Differentiated Services Network," in IEEE INFOCOM, Anchorage, USA, April 2001.
• The authors propose a scheme to combine congestion sensitivity and QoS sensitivity into a single pricing scheme. The application may select different service types to achieve different levels of QoS. A congestion sensitive component is then added to this price. Uses the Resource Negotiation and Pricing (RNAP) protocol and architecture for prices and services. Constructs the user utility function based on an analytical model. Time of day, dependence on service class, access-rate dependent charges (AC), and volume dependent charges (V) are considered. The total charge of a session is given in terms of the number of intervals of the session, the bandwidth, the usage charge, the holding price, the number of bytes transmitted, and the congestion price. It appears that the method to empirically determine the best congestion price (p_c) for a network is understandably not given. Two topologies are used in their simulations: (1) two trees connected by a single link, and (2) several trees interconnected by a ring. Pareto and exponential on/off traffic sources are used. They have three user types, each with an elasticity factor (each user has a utility curve for how much they are willing to pay for a given QoS). Effects of traffic burstiness and traffic load, load balance between classes, and effect of admission control are evaluated. Significant price fluctuations occur throughout the run. It seems that a service class can be treated like an Internet Telephony Gateway (ITG), where more distant gateways offer a poorer QoS. Further, it seems this could be done for wireless devices - a cell phone could pay more for different service qualities (full screen video vs. regular 64kbps voice channel), and CS pricing could be implemented in the network.

M. Caesar, S. Balaraman, D. Ghosal, "A Comparative Study of Pricing Strategies for IP Telephony," IEEE Globecom 2000, Global Internet Symposium (San Francisco, USA) - Nov. 29, 2000.

• The paper presents a comparison of several pricing strategies: flat (FL), congestion sensitive (CS), QoS sensitive (QoSS), and a hybrid scheme that is sensitive to both congestion and QoS (CSQoSS). The QoS sensitive pricing mechanism takes into account the fact that the quality of received audio degrades as the number of hops traversed by the audio packets increases.
• The study is based on a two class user model: type I users pay any price for the best QoS (inelastic) and type II users request the best QoS at a cost that is less than some maximum price they are willing to pay (elastic).
• Experimental results show: (1) QoSS has the lowest blocking probability and also the lowest service distance for type I calls of any of the schemes. However, it does this by forcing the type II calls away from the home gateway and hence gives a lower QoS to those calls. (2) CS adapts to the current load at a gateway and hence is good at providing a low service distance to type II calls. However, it does this by increasing the blocking probability of a type II call. (3) CSQoSS is quite effective and incorporates the best elements of both schemes: it has a very low blocking probability while retaining a low service distance for type II calls. However, at very high loads the correlation between price and distance breaks down resulting in an increase in the service distance of type I calls.

A. Ganesh, K. Laevens, "Congestion pricing and user adaptation," in IEEE INFOCOM, Anchorage, USA, April 2001.

• Proposes a scheme in which the router writes the "social cost" (a reflection of router congestion) on the packet as it passes through. Users then shift demand according to this cost, thereby shifting the burden of rate-allocation to the end-system. Users are modeled as attempting to maximize a utility function that encapsulates their valuation of bandwidth. The congestion price used is a function of the aggregate data arriving on the link in that slot. Users don't know the social cost set by the network until the packet acknowledgement arrives from the remote host. Hence delay can be significant and the time for the price to converge can be large. Local (how much bandwidth user uses) vs. global stability (price charged) is evaluated. Simulation results show that transmission rates and prices converge over the long term.
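The feedback loop described above, in which the network posts a congestion price and users adapt their rates to it, can be sketched with a small Python simulation. The model below is an assumption for illustration and is not the paper's: each user has a logarithmic utility w*log(x), so the rate that maximizes utility minus charge is x = w / price, and the link nudges its price up or down in proportion to excess demand. Function and parameter names are hypothetical.

```python
def simulate_congestion_pricing(weights, capacity, step=0.1,
                                initial_price=1.0, slots=200):
    """Iterate price updates and user responses; return the final price and rates."""
    price = initial_price
    rates = []
    for _slot in range(slots):
        # Users react to the price they last saw (one slot of feedback delay):
        # maximizing w*log(x) - price*x gives x = w / price.
        rates = [w / price for w in weights]
        # The link raises the price when demand exceeds capacity, lowers it otherwise.
        excess = sum(rates) - capacity
        price = max(1e-9, price + step * excess)
    return price, rates
```

With these assumptions the price settles where total demand equals capacity, at sum(weights) / capacity. A larger `step` or a longer feedback delay can make the iteration oscillate, which is the stability question the paper examines.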
P. Marbach, "Pricing Differentiated Services Networks: Bursty Traffic," in IEEE INFOCOM, Anchorage, USA, April 2001.
• Uses game theory to study pricing in DiffServ networks. Shows there exists an equilibrium for the game and that pricing can be used to provide relative QoS guarantees. A DiffServ network with I priority levels is considered. Users are charged per submitted packet (although packets may be dropped) based on the service class they use. The user is modeled by a utility function. The elasticity of users' utility functions and the burstiness of user traffic are varied. QoS sensitive traffic is also considered, where users measure QoS as a function not only of throughput but also of the fraction of packets that get lost.
M. Falkner, M. Devetsikiotis, I. Lambadaris, "An Overview of Pricing Concepts for Broadband IP Networks," in IEEE Communications Surveys, Vol. 3, no. 2, April 2000.
• A general overview of pricing techniques applicable to the current best-effort packet-based internet. Evaluation criteria include network, economic, and social efficiency. Charges are categorized into four classes: fixed access, usage, congestion, and QoS charges.
• Pricing schemes can be designed for best-effort networks or for networks that give QoS guarantees. Schemes that work on short time frames can respond quickly to congestion at the expense of short-term price fluctuations. Compliance with existing networking technologies must be achieved, and the computational overheads of billing and accounting should be considered. They talk about network, economic, and social efficiency (proportional fairness: a resource allocation is considered fair if it is in proportion to the charge).
• Several key pricing techniques are discussed:
• (1) Flat pricing: advantages: simple, convenient, leads to social fairness. Disadvantages: users are deterred from being adaptive, leads to congestion and insufficient investment in network resources, providers deterred from provisioning QoS, decreases economic efficiency.
• (2) Paris-Metro Pricing: bandwidth is divided up into separate channels, each with a different price. Higher priced channels will probably experience less demand and hence lower utilization. Advantages: simple to implement, improved economic efficiency. Disadvantages: network needs to keep track of user's channel choice for billing/accounting, doesn't support individual QoS guarantees, may cause instability (during times of congestion, many users may switch over to the higher priced channel).
• (3) Priority Pricing: a priority field is placed in each packet header, low priority traffic is dropped or delayed. Disadvantages: does not provide individual QoS guarantees, might decrease social fairness. Advantages: users can get better relative positioning by paying more, increased economic efficiency.
• (4) Smart Market: each user transmits a bid, network drops bids below the congestion price. Advantages: encourages network and economic efficiency. Disadvantages: requires changes to most protocols, hurts social efficiency, does not give services guarantees.
• (5) Edge pricing: allows charge to take place at edge of network, allows for receiver charging. Advantages: compatible with ATM and RSVP, encourages users to multicast. Disadvantages: poor economic efficiency. Expected capacity pricing: user specifies expected capacity use. Advantages: compatible with ATM and RSVP, user's bill can be calculated at edge, socially fair. Disadvantages: must do traffic policing.
• (6) Responsive Pricing: dynamically change price based on congestion, adaptive users then decrease usage. Advantages: network utilization improved, compatible with ATM ABR, socially fair. Disadvantages: may be unstable, time frame must be short to quickly react to congestion.
• (7) Effective Bandwidth pricing: price chosen according to mean value submitted by user, but if the user goes over that value, charge by a line that's tangent to an "effective bandwidth curve". The user is charged a premium for not declaring the true mean rate that he sends at. Advantages: socially fair. Disadvantages: requires user to predict future usage.
• (8) Proportional fairness pricing: resource allocation is fair if it is in proportion to the user's willingness to pay. Advantages: guarantees economic efficiency. Disadvantages: may be socially unfair.
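Of the techniques listed above, the Smart Market is the most algorithmic: each packet carries a bid, and only bids that clear the congestion price are served. The Python sketch below illustrates one slot of such an auction. Charging admitted users the highest rejected bid (a second-price style rule) is an assumption of this sketch; the summary above only states that bids below the congestion price are dropped. The function name is hypothetical.

```python
def smart_market(bids, capacity):
    """Admit the highest bids up to capacity and compute the clearing price.

    bids: dict mapping user -> bid for one packet in this slot.
    capacity: number of packets the link can carry in this slot.
    Returns (list of admitted users, price charged to each admitted user).
    """
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    admitted = [user for user, _bid in ranked[:capacity]]
    # Without congestion every packet is served and the congestion price is zero.
    price = ranked[capacity][1] if len(ranked) > capacity else 0.0
    return admitted, price
```

For example, with bids of 5, 3, 2, and 1 and room for two packets, the two highest bidders are served and each pays 2, the bid of the first packet that did not fit.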
L. DaSilva, "Pricing for QoS-Enabled Networks: A Survey," in IEEE Communications Surveys, Vol. 3, no. 2, April 2000.
• Overview of pricing techniques for networks with QoS support. The relationship between prices and traffic management functions (congestion control, resource provisioning and call admission control) for internet, ATM networks, and generic QoS enabled networks is discussed. The authors claim that although prices are traditionally set by a corporation's marketing department, engineering should help decide. Engineering issues affected by pricing include: congestion control, call admission control, resource management, and billing. User preferences can be modeled by inelastic, elastic, and partially elastic utility functions. The customer surplus (the difference between what the user is willing to pay and what they are charged) is a measure of network performance. Advantages and disadvantages of usage based pricing are discussed. Summaries of ATM pricing proposals are given, including: (1) network rents bandwidth and buffers to users, (2) smart market pricing, (3) charge for call setup, (4) VBR and ABR charges. Issues to be solved for pricing schemes include: scalability (e.g. with regards to billing), hierarchy (e.g. how to charge from customer to ISP or from ISP to NSP), and the impact of QoS support on performance. In particular, metrics for QoS must be resolved and must be made clear to the end user. Finally, users have grown accustomed to flat rate services and are unlikely to easily accept a deviation from this type of market.
X. Wang, H. Schulzrinne, "Performance Study of Congestion Price based Adaptive Service," in NOSSDAV, Chapel Hill, USA, June 2000.
• Congestion sensitive and static pricing with adaptive users are simulated. Shows that congestion sensitivity in pricing takes advantage of user adaptability. Uses the Resource Negotiation and Pricing protocol (RNAP) to enable the user to select from services with different QoS properties and renegotiate contracted services. The tradeoff between blocking and raising congestion prices is investigated. The following economic performance metrics are used: average and total user benefit, price stability, user adjustment, user charge, network revenue. Results for bottleneck utilization, blocking probability (it appears to me that calls rejected due to admission control are not counted as blocks), total network revenue, total user benefit, avg. user benefit, variation of system price, and avg. user demand are shown. Different target congestion control thresholds are used (it seems no analytical result is given to determine the "optimal" threshold).
N. Semret, R. Liao, A. Campbell, A. Lazar, "Pricing, Provisioning and Peering: Dynamic Markets for Differentiated Internet Services and Implications for Network Interconnections," in IEEE Journal on Selected Areas of Communications, 2001.

B. Stiller, G. Fankhauser, G. Joller, P. Reichl, N. Weiler, "Open Charging and QoS Interfaces for IP Telephony," in INET'99, San Jose, USA, June 1999.

• Gives an overview of Open Charging Interfaces for IP Telephony (OCIT), an experimental platform for standards-based IP phones. These phones are enhanced with QoS support, which is based on IntServ, and with support for implementing usage-based charging. RSVP is extended to include pricing/payment objects. Two pricing models are considered: Smart Market and Effective Bandwidth pricing. The signaling system works as follows: IP phones speak H.323 to a gatekeeper for admission. Information is then signaled back to the end systems, which in turn pass it to the local RSVP API. A user interface is proposed which allows the user to choose the destination address and quality, and to review the billing information that is returned. The program asks the user questions to try to improve quality, or can use previously stored user profiles.
A. Odlyzko, "Paris Metro Pricing for the Internet," in Proceedings of the ACM Conference on Electronic Commerce, pp. 140-147, Denver, USA, November 1999.
• Paris Metro Pricing (PMP) is proposed. PMP partitions bandwidth into logically separate channels, each with a different price. Users will tend to congregate in the lower priced channels, causing more congestion there, and hence higher paying users will usually get better service. PMP provides congestion control essentially for free and requires only minor changes to network infrastructure. It may be used with DiffServ, and may be more useful at the edges of the network. The number of channels should be small. One can use priority queuing, but allowing lower priority packets to be blocked indefinitely violates the fairness criterion, hence it is better to use a weighted round robin technique.
• Although PMP is simple, there are several problems: (1) QoS is not guaranteed. However, LANs usually don't guarantee it either and users are happy with quality there, and it's better to multiplex data. The implementation is usually based on best-effort like Ethernet anyway, and can send voice over the high priced channel and video over the low priced channel. (2) It is difficult to interface with a network that doesn't implement PMP. One way to do this is to forward all incoming traffic onto the lowest priced link. (3) It is not clear how to split the revenue generated between network providers. However, solutions that have been proposed for other schemes can be applied here. (4) How frequently would capacities and prices in PMP vary? How to define peak hours for a global network? (5) Would it survive in a competitive market? The scheme would work well in a monopolistic framework. (6) How to set prices and capacities of channels? The provider may be able to use customer surveys and time of day variations in traffic patterns. (7) It is not clear how to reconcile the users' desire for flat pricing with the economic efficiency of usage sensitive pricing. (8) Finally, there may be stability problems. This can be dealt with by artificially lowering QoS on lower priced connections to increase demand for the higher priced connections. The implementation is easy to introduce: the provider can use IPv4's priority field and the cost on the lowest cost channel can be set to zero. However, it is necessary to install hardware and software to count packets.
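The scheduling point made above, that weighted round robin is preferable to strict priority queuing because no channel can be starved, can be sketched in Python as follows. The channel names and weights are hypothetical, and the sketch counts whole packets per round rather than bytes.

```python
from collections import deque


def weighted_round_robin(queues, weights, rounds):
    """Serve each channel up to its weight in packets per round.

    queues: dict mapping channel name -> deque of queued packets.
    weights: dict mapping channel name -> packets served per round.
    Returns the packets in the order they were sent.
    """
    sent = []
    for _round in range(rounds):
        for channel, weight in weights.items():
            queue = queues[channel]
            for _turn in range(weight):
                if not queue:
                    break  # an idle channel gives up the rest of its turn
                sent.append(queue.popleft())
    return sent
```

Even when the higher priced channel always has traffic waiting, the lower priced channel still receives its share of every round, which strict priority queuing would not guarantee.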
L. McKnight, "Internet Telephony: Costs, Pricing, and Policy," in Proceedings of the Twenty-Fifth Annual Telecommunications Policy Research Conference, 1997.
• A cost model for an ISP offering an Internet telephony service is presented. The authors (1) note that a moderate increase in the use of internet telephony can double the costs of an ISP, and (2) seem to assume the ISP will not offer Internet telephony as a separately priced service.
• The cost to the ISP is divided up into (1) capital equipment cost (allocated to the subscriber type that uses it), (2) transport costs (the customer gets a discount for dollar commitment), (3) customer service (number of minutes of support times cost per minute), (4) operations (network operations/maintenance, facilities, billing), (5) other (marketing, administrative). The authors note that the bandwidth used per dial-up user is 5kbps for web browsing, but is 3 times that for Internet telephony.
• They develop a cost model that shows (1) pricing for internet services in US is currently efficient and competitive, (2) no user type subsidizes the others (unlike the PSTN, where business users subsidize home users) (3) with the introduction of Internet telephony to an ISP's service offerings, revenues increase slightly while costs increase by almost 50%. (4) if Internet telephony catches on, an ISP that operates its network most efficiently will have a competitive advantage (5) Internet telephony causes transport costs to increase by 75% for analog dial in subscribers. Transport costs increase for other user types too but by a smaller margin.
• Yield management, which can be used to maximize revenues, is a technique in which different classes of service are defined, and only high priority classes are reserved. In addition, access charges may be regulated (after this paper was written, the FCC raised caps on SLC and PICC charges).  The European community doesn't regulate Internet telephony because it doesn't meet certain necessary criteria (see table). Some countries tried to ban Internet telephony. It seems the main point they're making is that this will increase the load on the ISPs network and thereby decrease profit (as certain peer-to-peer applications are doing today). This may be an argument for separate, value-based pricing of Internet telephony services.
L. Murphy, J. Murphy, "Feedback and Pricing in ATM networks," ATM Networks: Performance Modeling and Evaluation, Vol. 2, Chapman and Hall, 1996, p. 197-212.
• Proposes a distributed iterative pricing algorithm for ATM networks. A dynamic feedback signal about the current utilization of network resources is used to control usage over and above the congestion control and admission control techniques used in ATM.
• Two main questions are investigated in terms of ATM ABR traffic, namely, (1) how should congestion be defined and (2) how should resources be allocated? Metrics include network and economic efficiency.
• The authors point out there is nothing inherently monetary in applying pricing principles to communication networks. In the proposed scheme, feedback is provided to the user in terms of price. Users must be able to respond to dynamic prices. Users submit benefit functions, which provide more information to the network than a bid. Arguments against usage based pricing in networks are discussed. The system performance with elastic and inelastic users is compared. Simulations showed an increase of 15% in economic benefit and decrease of 70% in amount of cells lost with the new scheme.
J. Perloff, Microeconomics, Addison-Wesley Publishing Company, 1999.
• Introductory economics textbook.
R. Gibbens, R. Mason, and R. Steinberg, "Internet Service Classes Under Competition," in IEEE Journal on Selected Areas in Communications, Vol. 18, no. 12, December 2000.
• Discusses the effect of competition on ISPs offering multiple service classes under a PMP type pricing infrastructure. First, they find that in any equilibrium outcome, ISPs will only offer a single service class. Second, with two competing ISPs that each offer a single service class, if one of them subdivides its network to offer multiple service classes, both ISPs will suffer lower revenue. Third, an equilibrium may fail to exist when both networks subdivide and choose their own respective prices. The net effect drives ISPs to offer a small quality range. Fourth, if one network subdivides and the other does not, the undivided network's price settles at the mean of the subdivided network's two prices.
N. Semret, R. R.-F. Liao, A. T. Campbell, and A. A. Lazar, "Pricing, Provisioning and Peering: Dynamic Markets for Differentiated Internet Services and Implications for Network Interconnections," in IEEE Journal on Selected Areas in Communications, Vol. 18, no. 12, December 2000.
• Presents an auction based approach to allocating bandwidth in a DiffServ network. The authors compare a sender-pay approach with the current receiver-pay model. In addition, the authors investigate the feasibility of maintaining stable and consistent SLAs among service providers operating DiffServ network components. The parties involved are RBSs, which wholesale bandwidth; SBBs, which retail this bandwidth; and end users, who purchase it. They note that if the RBS and multiple SBBs on the same network are not owned by the same entity, a non-cooperative game formulation is the best way to model the problem. Also, auctioning is the pricing approach with the minimum information requirement.
• They investigate three types of SLA: expected capacity SLA (on average, the user will get the capacity she pays for), worst case capacity SLA (each user always gets the amount of bandwidth she pays for), and local SLA (in a single network).
• Results: (1) not all configurations lead to convergent and stable allocations (2) some stable operating points lead to zero allocation for some brokers, meaning certain classes of service are not offered at all by the network (3) the higher class service is more expensive, but has a smaller bottleneck, qualities which balance each other out (4) the large network is less expensive than the smaller ones (5) the high quality network has a slightly higher share in the high capacity network (6) all SBBs remain profitable over the long run (7) even though the demand for one service affects the amount of capacity available for another, the stability of each class is independent of the others'.
D. Raz, Y. Shavitt, "Optimal Partition of QoS Requirements with Discrete Cost Functions," in IEEE Journal on Selected Areas in Communications, Vol. 18, no. 12, December 2000.
• Discusses how to do QoS routing with discrete cost functions. QoS routing is a technology that may be used to enable QoS support over the internet for IP Telephony, an application with specific QoS constraints. In QoS routing we want to find a minimal cost path in the network that can satisfy some cost function. This cost function might be a function of bandwidth, delay, etc. and is used to define the utility of different links in the network (and is different from the pricing concept of cost).
• Results: this paper uses discrete cost functions to better model QoS routing in the internet. A polynomial-time approximation scheme for the problem of QoS routing with discrete cost functions is given.
R. J. Edell and Pravin Varaiya, "Providing Internet Access: What we learn from the INDEX Trial," Index Project Report 99-010W, University of California, Berkeley, April 27, 1999.

S. Shenker, "Fundamental Design Issues for the Future Internet," in IEEE Journal on Selected Areas in Communications, Vol. 13, no. 7, September 1995.

Routing

A. Dubrovsky, M. Gerla, S. Lee, D. Cavendish, "Internet QoS Routing with IP Telephony and TCP traffic," in Proceedings of ICC, New Orleans, June 2000.

• Proposes a new QoS routing method to enhance support of IP telephony. Significant delay and throughput improvement is achieved with the new scheme because congestion points can be avoided.
• The technique consists of the following steps: (1) end hosts receive OSPF topology information with QoS information about each link, (2) they pick a path to minimize a set of QoS constraints, (3) the call is rejected if a path can't be found. The end nodes choose the path through the network and perform Call Admission Control (CAC).
• Experiments are performed with (1) only IP telephony traffic and (2) a mixture of IP telephony traffic and FTP traffic. They find that more calls are accepted with minhop with CAC than with MC Bellman-Ford, as MC Bellman-Ford uses alternate path routing to provide additional capacity to calls. In addition, better delays and fewer losses are realized with QoS routing. However, QoS routing doesn't help much for uniform traffic patterns if both use CAC.
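The three steps above (learn link state, pick a feasible path, reject the call otherwise) can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a breadth-first min-hop search over links with enough free bandwidth, not the paper's MC Bellman-Ford algorithm, and the topology, capacities, and function names are hypothetical.

```python
from collections import deque

def find_path(links, src, dst, demand):
    """Min-hop path that uses only links with at least `demand` free capacity.

    links: dict mapping directed link (a, b) -> free bandwidth in kbps
    Returns a list of nodes from src to dst, or None if no feasible path exists.
    """
    parents = {src: None}
    frontier = deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for (a, b), free in links.items():
            if a == node and b not in parents and free >= demand:
                parents[b] = node
                frontier.append(b)
    return None

def admit_call(links, src, dst, demand):
    """Call admission control: reserve bandwidth on a feasible path or reject."""
    path = find_path(links, src, dst, demand)
    if path is None:
        return None  # call rejected
    for a, b in zip(path, path[1:]):
        links[(a, b)] -= demand
    return path

# Hypothetical topology with two routes from A to D (free capacity in kbps).
links = {("A", "B"): 64, ("B", "D"): 64, ("A", "C"): 128, ("C", "D"): 128}
first = admit_call(links, "A", "D", 64)    # takes A-B-D and fills it
second = admit_call(links, "A", "D", 64)   # A-B-D is full, so A-C-D is used
third = admit_call(links, "A", "D", 128)   # no path has 128 kbps free: rejected
```

The second call shows how routing on link state lets a call avoid a congested path instead of being degraded or dropped.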
V. Paxson, "End-to-End Routing Behavior in the Internet" in Proceedings of SIGCOMM, 1996.

Regulation and Standards Development

K. Asatani, "IP and Telecommunication Integration: De Jure and De Facto Standards Have Entered a New Era," in IEEE Communications Magazine, Vol. 37, no. 6, pp. 140-147, June 1999.

• An overview of the key organizations designing standards for IP telephony. IP is the main focus of the IETF, but the recent convergence with telecom networks is leading traditionally PSTN-oriented groups such as ITU-T, ATM-F, DAVIC, and TIPHON to focus on IP related technologies. The article describes the structure of IP related organizations, reviews IP related activities of other organizations, and gives an overview of the collaborative relationships among them. Definitions: De Jure = according to law; by right. De Facto = in reality or fact; actually.
• The IETF is part of the IAB, which in turn is a part of ISOC. The IETF publishes RFCs. The IESG manages the standards track, the procedure by which RFCs gain standard status: an RFC moves from Proposed Standard (PS) to Draft Standard (DS), and then from DS to Standard. Working groups write drafts, which turn into RFCs with rough consensus. The ITU-T has established a liaison with the IETF and feels that the IP project is of highest priority.
• The ATM Forum's development work is carried out by several working groups. The forum deals with a cross section of IP and ATM issues.
• DAVIC focuses on systems and applications development based on MPEG-2. DAVIC specifications are mostly available as standards and specifications from the ITU-T, ISO/IEC, JTC1, and IETF. However, DAVIC will prepare an original specification if one doesn't exist.
• The TIPHON group works solely on IP telephony standards. It is composed of several working groups (WG) on IP telephony, including requirements, architecture, signaling, QoS, Wireless. TIPHON develops standards to support interactions between the IP network and the PSTN (SCN), in particular, the following scenarios are considered: IP to SCN, SCN to IP, SCN to IP to SCN, IP to SCN to IP.
• The Multimedia Switching Forum (MSF) provides standards for a multiservice switch based on the ATM platform.
• The Optical Internetworking Forum (OIF) develops specs for optical internetworking.
Dialogic, "Regulation," Technical white paper.
• A short history of regulation in the IP telephony market. Each time an Inter eXchange Carrier (IXC) terminates a call at a LEC, it must pay the LEC 3 cents per minute. The American Carriers Telecommunications Association (ACTA) petitioned the FCC to force IP Telephony Service providers to pay the same fees and to stop the sale of IP telephony software. The FCC decided to classify ISPs as "end users" and hence not subject to regulation. Hence IP telephony can be used to bypass tariffs for long distance access (although economics tend to favor using IP telephony to bypass the local loop, as mentioned in "Cable IP Telephony Primer").
Cisco Systems, "Architecture for Voice, Video and Integrated Data," Technical white paper.

K. Thiagarajan, "IP telephony must be embraced, not rejected," in The Hindu Business Line, December 2000.
• Appears to be an interview. India maintained a ban on IP telephony (which was recently lifted). The loss of settlement revenues to India was $37 million. The author believes that the ban should be lifted, as they might get some settlement revenues and make calls cheaper. It is claimed that incumbent carriers can survive if they make attractive packages, or can wholesale bandwidth.

Conferences

Internet Telephony Workshop, "Internet Telephony Workshop 2001 Technical Program," Web site.

IEEE, "Journal on Selected Areas in Communications," Web site, December 2000.

IEEE, "Globecom," Web site.

Nossdav, "Nossdav: Network and Operating System Support for Digital Audio and Video," Web site.

IWQoS, "IWQoS: International Workshop on QoS," Web site.

ICC, "IEEE International Conference on Communications," Web site.

Web sites

Randy Katz, "Distributed Service Architectures in Converged Networks," Web site for class at UC Berkeley.

IETF, "IETF IPTEL Working Group Homepage," Web site.

• Several documents and presentations on IP Telephony, including drafts for telephony routing over IP, call processing language, etc. Particularly relevant are the articles on TRIP for Gateways, Internet Intelligent Networks, and the Gateway Location Protocol (GLP).
Henning Schulzrinne et al., "IP Telephony Online Resources Page," Web site.

Henning Schulzrinne, "Internet Technical Resources: Internet Telephony," Web site.

Massachusetts Institute of Technology, "MIT Internet and Telecoms Convergence Consortium," Web site.

• Papers and presentations on IP telephony issues.

International Telecommunication Union, "Regulatory Issues Relating to IP Telephony," Web site.

• Web page with regulatory issues. Has regulatory analyses, economic analyses, regulatory surveys (these cover both IP telephony and the PSTN for many different countries), other links.
Organization for Economic Co-operation and Development, "OECD Communications Outlook 1999," Web site, 1999.
• A variety of statistics on the PSTN and IP telephony technologies. There is some useful free data regarding call statistics in the site.
T. Park, C. H. Lee, "Voice Over Internet Protocol (VoIP) - Current Technology Comparison of H.323 and SIP : References" Web site, 2001.

Internet Week Magazine, "IP Telephony source page," Web site.

• A web page with recent news articles on IP Telephony. Geared towards businesses interested in deploying IP Telephony solutions.
Cable Datacom News, "Cable IP Telephony," Web site.
• Web site for more information about cable IP telephony: list of companies developing DOCSIS cable modems with VoIP capabilities, broadband telephony interface products, VoIP gateway vendors.
International Engineering Consortium, "Web ProForums," Web site.
• Tutorials include: access gateways, deployment of telecommunications networks, the coming of true convergence, desktop streaming media production, gatekeeper, H.323, HFC telephony, Internet telephony, Internet model for control of converged networks, local exchange softswitch, network computer telephony integration, real time billing for IP services, SS7, SS7 gateway, telephony billing, unified messaging, voice quality in converging telephony and internet, voice telephony over asynch transfer mode, voice data consolidation.
Roxen Community, "IP Telephony: New drafts," Web site.

Gecko Research and Publishing, "IP xStream," Web site.

Standards

R. Braden, L. Zhang, S. Berson, S. Herzog, S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification," IETF, RFC 2205, September 1997.

Audio-Video Transport Working Group, H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications," IETF, RFC 1889, January 1996.

"Recommendation H.310 - Broadband audiovisual communication systems and terminals," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, November 1996.

"Recommendation H.320 - Narrow-band visual telephone systems and terminal equipment," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, March 1996.

• Description of H.320, covers video and audio codecs, how to establish a "visual" telephone call, delay insertion (lip-sync).

"Recommendation H.321 - Adaption of H.320 visual telephone terminals to B-ISDN environments," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, March 1996.

"Recommendation H.322 - Visual telephone systems and terminal equipment for local area networks which provide a guaranteed quality of service," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, March 1996.

"Recommendation H.323 - Visual telephone systems and equipment for local area networks which provide a non-guaranteed quality of service," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, November 1996.

"Recommendation H.324 - Terminal for low bit rate Multimedia Communication," International Telecommunications Union - Telecommunications standardization sector (ITU-T), Geneva, Switzerland, March 1996.

• Lists functional elements, including codecs, control protocols, and data protocols. Gives functional requirements for the elements regarding delay and formats that should be supported. Gives protocol for the control channel: involving capabilities exchange and the logical channel setup/teardown. Discusses the audio channel: covers delay compensation (delay the audio signal to force lip-sync), maximum delay jitter (by transferring PDUs at regular intervals), and data encryption.
Henning Schulzrinne, et al., "Session Initiation Protocol," Web site.

Henning Schulzrinne, et al., "RTP web page," Web site.

E. Crawley, R. Nair, B. Rajagopalan, H. Sandick, "A Framework for QoS-based Routing in the Internet," IETF, RFC 2386, August 1998.

J. Rosenberg, H. Schulzrinne, "A Framework for Telephony Routing over IP," IETF, RFC 2871, June 2000.

• TRIP is a protocol to support the discovery and exchange of IP telephony gateway routing tables between providers.
• Considers architectures in which TRIP might be used: (1) a clearinghouse for gateway providers to exchange information about gateways, (2) confederations, which share information about gateways in a full mesh, without using a central clearinghouse (3) wholesalers, who sell services to smaller sized providers.
• The architecture consists of (1) Location Servers (LSs), which learn about gateways in their domain and exchange gateway information with other LSs. End users can query LSs by giving detailed information about their requirements. The LS has its own policy regarding how end user preferences are handled. (2) IT Administrative domains, each of which contains one or more gateways, one or more LSs, and a number of end users. (3) Gateways, which connect IP networks to another type of network. (4) End users. Security issues are discussed, including mechanisms to authenticate peer LSs, message integrity, and encryption.

C. Agapi, C. Chiu, T. Chong, H. Phillips, B. Willingham, "Internet Telephony Gateway Location Service Protocol," IETF, INTERNET DRAFT, November 1998.
• Based on the paper "Internet Telephony Gateway Location". Discusses GLP, an interdomain protocol used to distribute call routing tables between Internet telephony providers.

S. Black, D. Black, M. Carlson, E. Davies, Z. Wang, W. Weiss, "An Architecture for Differentiated Service," IETF, RFC 2475, December 1998.

• Defines an architecture for implementing scalable differentiated services (DiffServ) on the internet.
M. Arango, D. Dugan, I. Elliott, C. Huitema, S. Pickett, "Media Gateway Control Protocol (MGCP)," IETF, RFC 2705, October 1999.

M. Handley, H. Schulzrinne, E. Schooler, J. Rosenberg, "SIP: Session Initiation Protocol," IETF, RFC 2543, March 1999.

R. Braden, D. Clark, S. Shenker, "Integrated Services in the Internet Architecture: an Overview," IETF, RFC 1633, June 1994.

• Discusses a method to provide Integrated Services (IntServ) in the Internet. IntServ requires resource reservation and admission control to provide QoS for flows without changing the internet service model.
• Traffic control is implemented by a packet scheduler (which manages forwarding of packets), a classifier (maps packets into service classes by marking them), and admission control (decides whether new flows will be admitted into the system).
• Predictive service is proposed to give a reliable delay bound to applications that need it. ASAP service is proposed for applications with elastic requirements. Proposes that the less important packets in a flow be marked as preemptable. A Reservation Model is proposed to allow an application to negotiate for a QoS level. The network might grant a lower QoS rather than refusing the request. An "offered" flowspec is propagated along the multicast distribution tree; each router along the path records these values, adjusts them to reflect available capacity, and generates "requested" flowspecs back to the sender.
C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley, J. Bolot, A. Vega-Garcia, S. Fosse-Parisis, "RTP Payload for Redundant Audio Data," IETF, RFC 2198, September 1997.
• Describes a payload format for encoding redundant audio data. Can be used with lossy packet networks such as the Internet Mbone.
• Requirements: (1) packets have a primary encoding and one or more redundant encodings to allow the receiver to reconstruct lost packets (2) different types of encodings could be used for redundant encodings, so each should carry an encoding type identifier (3) encodings may be of variable size, so each encoded block needs a length indicator.
• A new payload type is defined. SDP is then used to bind a dynamic payload type to a particular codec, sample rate, and number of channels.
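The requirements above (a type identifier and a length indicator per block) can be illustrated with a small packing sketch. The field widths used here (a 1-bit flag, a 7-bit block payload type, a 14-bit timestamp offset, and a 10-bit block length for each redundant block, followed by a 1-octet header for the primary block) reflect my reading of the RFC and should be checked against it; the function name, payload type values, and byte strings are hypothetical.

```python
import struct

def pack_redundant_payload(primary, redundant):
    """Build a redundant-audio payload body (RTP fixed header not included).

    primary:   (payload_type, data) for the primary encoding
    redundant: list of (payload_type, timestamp_offset, data) tuples
    Block headers come first, then the block data in the same order,
    with the primary block last.
    """
    headers = b""
    bodies = b""
    for pt, ts_offset, data in redundant:
        if not (0 <= pt < 128 and 0 <= ts_offset < 2**14 and len(data) < 2**10):
            raise ValueError("field does not fit in its header bits")
        # F=1 (another header follows) | 7-bit type | 14-bit offset | 10-bit length
        word = (1 << 31) | (pt << 24) | (ts_offset << 10) | len(data)
        headers += struct.pack("!I", word)
        bodies += data
    pt, data = primary
    if not 0 <= pt < 128:
        raise ValueError("payload type does not fit in 7 bits")
    # Final header: F=0 plus the 7-bit type; its length is implied by the packet size.
    headers += struct.pack("!B", pt)
    return headers + bodies + data

# Illustrative values: one redundant block sent 160 timestamp units late.
packet = pack_redundant_payload(primary=(5, b"\xaa\xbb\xcc"),
                                redundant=[(7, 160, b"\x01\x02")])
```

A receiver that loses the packet carrying a block as its primary encoding can recover a lower quality copy of it from the redundant block in a later packet.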
T. Russell, Signalling System 7, McGraw-Hill Series on Computer Communications, 1998.

W. Richard Stevens, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley Publishing Company, 1994.

Software

S. McCanne, S. Floyd, et al., "ns -- Network Simulator," Software.

Real Networks, Real Player for streaming audio, Software.

Microsoft Corp., Netmeeting, Software.

W. Richard Stevens, Unix Network Programming, 2nd Edition, Prentice Hall, 1998.

S. Fosse-Parisis, J. Bolot, FreePhone, Software.

UCL Multimedia, Robust Audio Tool (RAT), Software.

LBNL, "vat - LBNL Audio Conferencing Tool," Software. See also "vic - Video Conferencing Tool".

Project JXTA, "VoP2P -- Voice over JXTA," Software. Provides a decentralized peer-to-peer based directory service for IP Telephony. See also a chat-conference and the discussion mailing list.

Erlang.com "Erlang B Calculator," Software. Can be used to calculate how many voice ports you need for a desired blocking probability if you know the Busy Hour Traffic (BHT). The "Lines to VoIP bandwidth calculator" can then be used to determine how much bandwidth to provision for IP carriage.
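The calculation such a tool performs can be sketched with the standard Erlang B recursion, B(0) = 1 and B(n) = A·B(n-1) / (n + A·B(n-1)), where A is the offered traffic in erlangs. This is a minimal sketch and not the site's implementation; the function names are my own.

```python
def erlang_b(traffic_erlangs, lines):
    """Blocking probability for `lines` circuits offered `traffic_erlangs` of load."""
    blocking = 1.0  # with zero lines every call is blocked
    for n in range(1, lines + 1):
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking

def lines_needed(traffic_erlangs, target_blocking):
    """Smallest number of lines whose blocking probability meets the target."""
    if not 0 < target_blocking < 1:
        raise ValueError("target_blocking must be strictly between 0 and 1")
    lines = 0
    blocking = 1.0
    # Extend the same recursion one line at a time until the target is met.
    while blocking > target_blocking:
        lines += 1
        blocking = traffic_erlangs * blocking / (lines + traffic_erlangs * blocking)
    return lines

# Two erlangs of busy hour traffic offered to two lines blocks 40% of calls.
two_line_blocking = erlang_b(2.0, 2)
```

The recursive form avoids the large factorials in the closed-form expression, so it stays numerically stable for large line counts.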