Cisco TelePresence



Rather than starting off with a set of technologies and then figuring out what it could build, Cisco took the opposite approach. Cisco decided that the time was right for telepresence, but rather than integrating off-the-shelf components, Cisco decided to build the system from the ground up. Taking a blank sheet of paper, a small group of engineers and executives at Cisco with many years of experience in video conferencing, IP Telephony, video and audio codec technology, and networking gathered together to define a set of requirements that would become the tenets used to design and build the Cisco TelePresence solution.


The first guiding principle of Cisco TelePresence was that the experience was paramount. It demanded an experience that was so lifelike and realistic that users would literally forget that they weren’t actually in the same room together. Cisco was so captivated by this idea, it created the mantra, “It’s All About the Experience,” and posted this mantra all over the hallways of the building it worked in. If the technology could not be made to deliver this level of quality at a reasonable price and bandwidth rate, then forget it, it wouldn’t build it.


The second guiding principle was that the system had to be so incredibly easy to schedule and use that literally anyone could do it. It had to completely do away with all the complicated user interfaces, remote controls, and dialing schemes of traditional video conferencing, and it could not require a help desk technician or a technically savvy user to set up and create a meeting. In fact, Cisco made a conscious effort to avoid adding the many features, buttons, and nerd-knobs that video conferencing systems have, and focused on simplicity so dramatically that it reduced the entire user experience of initiating a Cisco TelePresence meeting down to a single button, which it coined “One Button to Push.”


The third guiding principle was that the solution had to be utterly reliable. It had to work every single time, time after time after time. Only then would users trust it enough to actually use it; in fact, they would come to rely upon it, which in turn would drive high usage levels and deliver a true return on investment.


These three guiding principles, Quality, Simplicity, and Reliability, became the foundation on which all other design requirements would be based. From there, the team embarked on a journey to build, from the ground up, an experience that would deliver those attributes.


Although a great deal of the solution was designed and built from the ground up, including the 1080p multistream codec technology, audio subsystem, cameras, displays, and even the furniture, Cisco also had the benefit of leveraging much of the technology and standards it had used for IP Telephony, which at the time was redefined as Unified Communications. For example, Cisco TelePresence leveraged and reused the Cisco Session Initiation Protocol (SIP) stack and Cisco CallManager (now known as Unified Communications Manager) as its call control platform. It reused the 802.3af Power over Ethernet technology used by Cisco IP Phones to power its cameras. It reused the 802.1Q/p Automatic VLAN and quality of service (QoS) framework for attaching to the access layer of the LAN. It reused the same Cisco Media Convergence Server MCS-7800 series server platforms and Cisco Linux Voice Operating System used to run many of the voice server applications such as Cisco CallManager, Cisco Unity, and many others. It even reused the Cisco 7900 Series IP Phones, which serve as the user interface to the Cisco TelePresence system. By taking this approach, Cisco provided a solution that in many ways behaved just like an IP Phone on the network, allowing customers who had already invested in Cisco Unified Communications to see Cisco TelePresence as “just another type of endpoint” on that existing platform. Furthermore, by taking this approach, Cisco TelePresence was built upon an already proven platform, allowing Cisco TelePresence to achieve something that no other product in the history of Cisco had ever done: release 1.0 was rock-solid stable, right out of the starting gate.


The other thing that differentiated Cisco TelePresence from other ventures in the history of Cisco was that with TelePresence, Cisco was not content to just offer the endpoints and let someone else provide the backend components, or just offer the network infrastructure and let other vendors provide the endpoints and backend components. Cisco decided that for Cisco TelePresence to be successful, it had to provide a complete, end-to-end solution: endpoints, multipoint, scheduling and management, call control, and network infrastructure. Furthermore, Cisco did not want to just sell the hardware and software and leave it up to the customers to figure out how to deploy it and manage it successfully. The product offering would need to be backed by a suite of Planning, Design, and Implementation (PDI) services, as well as day-2 support and monitoring services.


Cisco also knew that the only way to prove to the market that telepresence was truly a new category of technology that could finally deliver on the promise of increasing productivity and reducing travel costs was to immediately deploy large numbers of TelePresence systems throughout Cisco, demonstrating that it could be done and that the Return on Investment (ROI) model was valid. Cisco took a bold step, slashing travel budgets globally and deploying more than 200 TelePresence systems in Cisco offices worldwide within the first 18 months of its first TelePresence shipment. This not only catapulted Cisco into the market leadership position in the number of TelePresence systems installed, but also made Cisco the largest user of telepresence in the world. At the time this book was written, Cisco had more than 350 production TelePresence systems installed internally, with more than 65,000 employees using them day in and day out. The average weekly utilization rate for these systems is more than 46 percent, with more than 4000 meetings conducted per month, an estimated savings of $174,000,000 in travel cost, and a total savings of 95,000 metric tons of carbon emissions to date. Furthermore, it is estimated that these numbers will dramatically increase with the deployment of personal TelePresence systems.


Cisco’s aggressive launch into the telepresence market caused a huge groundswell around telepresence. Cisco had always been viewed as an infrastructure company, and for the first time, Cisco was viewed as a video company. Many people questioned whether Cisco could make this move into high-end video communications and compete with existing video vendors, but Cisco has proven it can make this transition with a market-leading telepresence solution. The Cisco entrance into the telepresence market has prompted existing video conferencing vendors to develop telepresence solutions instead of continuing to focus strictly on high-definition video conferencing. Only time will tell where telepresence leads us, but it is off to an interesting beginning.

Evolution of Video Communications

Video conferencing has been around for more than four decades. In 1964, at the World’s Fair in New York, AT&T provided the world a preview of the first video conferencing endpoint, the AT&T Picturephone, illustrated in Figure 1. Six years later, AT&T released the Picturephone to consumers in downtown Pittsburgh, PA. Although the Picturephone ultimately failed to gain mass adoption because of its high price tag and the fact that users in 1970 weren’t ready for video phones in their homes, it exposed the world to the possibility of video communications, which ultimately sparked interest in and development of the private video conferencing systems that debuted throughout the 1970s.




Figure 1: AT&T Picturephone


In 1982, Compression Labs introduced the first commercial group video conferencing system, the CLI T1, enabling video communications over leased-line T1 circuits at 1.544 Mbps. Even with its high price of ~$250,000 and $1000 per hour line costs, the CLI T1 once again sparked the interest in video communications, bringing additional vendors into the market. In 1986, PictureTel introduced its first video conferencing system with a price of ~$80,000 and $100 per hour line cost, dramatically reducing the price of the system and its operational cost. The rapid cost reductions, market adoption, and overall visibility accelerated the development of video standards and new product development.


Throughout the late 1980s and early 1990s, work on new video standards continued. In 1990, two standards emerged that would provide a basis for interoperability between various vendors’ video conferencing systems. H.320 provided a standard for running multimedia (audio/video/data) over ISDN networks, whereas H.261 provided a standard for video coding at low bit rates (40 kbps to 2 Mbps). With the release of these two standards, video conferencing as most of us know it today was born. Throughout the mid-1990s, H.320-based video conferencing systems were introduced to the market by a number of vendors. These standards enabled vendors to provide endpoints with lower cost of ownership, utilizing a public ISDN network, and providing interoperability with other vendors’ H.320 systems. The fact that a public network was now available for video conferencing meant the sky was the limit. Users could dial one another over a public network for the first time, providing the first opportunity for mass adoption.


In the mid-1990s, new protocols were released for video conferencing over analog telephone lines (H.324) and data conferencing (T.120). Along with these new protocols came enhancements to video coding with the introduction of the H.263 standard, which provided more efficient video coding and higher resolutions. About the same time, Internet Protocol (IP) networks (and specifically, Ethernet LANs) were starting to take hold. In 1996, the H.323 standard was released, which defined protocols used for providing multimedia over packet-based networks. With the release of H.323, the market saw a slew of low cost, IP-based, desktop video conferencing endpoints, such as Microsoft NetMeeting, PictureTel LiveLan, Intel ProShare, and several others. At the same time, traditional video conferencing vendors started to introduce H.323 support into their room and group systems.


It wasn’t until the late 1990s and early 2000s that H.323 video conferencing started to pick up steam. Low-cost desktop video conferencing endpoints never took off like everyone expected, but the mid- to low-end group systems started to take hold. In 1999, Polycom introduced the ViewStation, as shown in Figure 2, offering a “set top box” style video conferencing unit that revolutionized the video conferencing market. Unlike its large, complex, and expensive predecessors, the ViewStation was compact, simple to set up, much easier to use, and much less expensive. Immediately, a large portion of the market started to shift from large expensive systems to the smaller lower-cost systems in hopes of outfitting more rooms and expanding the reach of video conferencing.




Figure 2: Polycom ViewStation

During the early and mid-2000s, the video conferencing market moved ahead slowly, never really fulfilling the expectations of analysts or vendors. Even with the availability of lower-cost, easier-to-deploy endpoints, video conferencing couldn’t break the trend. Every year seemed to be the year video would break out, but it never seemed to happen. Vendors made great strides in lowering the cost of systems and improving overall video quality, but the systems couldn’t seem to gain mass adoption. Even in companies like Cisco that deployed hundreds of these lower-cost systems, utilization remained low, in many cases below 10 percent. Users seemed intimidated by the custom touch panels or the remote controls used to initiate calls and control the systems. Users often complained about wasting half the meeting trying to get the video call connected due to the complicated remote control or custom touch screen interfaces. In many cases, companies deployed video systems from different vendors, further complicating the life of users by introducing different remote controls for each vendor system. Custom touch panels are a great alternative to remote controls, allowing complete control of the entire video conferencing room including lighting, audio, and full control of the system. However, the more devices the touch panels controlled, the more complicated the interface became, making it difficult for the average user to navigate. Even when users could get calls connected, the overall experience often provided little value-add to the meeting. Poorly designed rooms and small images of multiple people around large tables made it difficult to read body language and facial expressions. These two factors, complicated interfaces and a poor meeting experience, played a large role in the low utilization of most video conferencing deployments.


During this period, Cisco was heavily involved in pushing H.323 video conferencing. Cisco entered into an original equipment manufacturer (OEM) agreement with RADVision, providing the first Cisco H.323 video conferencing solution. These products included H.323 Multipoint Control Units (MCU) and H.320/H.323 gateways, along with an H.323 Gatekeeper and Proxy that ran within the Cisco Internetwork Operating System (IOS) on various Cisco router platforms. Cisco produced H.323 video conferencing deployment guides and assisted numerous customers in building out large-scale H.323 video conferencing networks. Despite its efforts, video conferencing continued to experience low user adoption rates. Cisco had introduced IP Telephony to the market in 1999 and was enjoying excellent market penetration in that arena. It believed that the answer to making video conferencing ubiquitous was to make it “as easy to use as a phone call.” In 2004, Cisco introduced Video Telephony to the market, allowing customers to use their Cisco IP Phones as the user interface to make and receive video calls, simply by dialing the phone number of another user’s IP Phone. Intuitive telephony-like features were also included, such as putting the call on hold, transferring the call, and conferencing in a third participant. Hundreds of thousands of Cisco Video Telephony endpoints were deployed in the market, but despite its extreme ease-of-use and telephony-like user experience, usage rates for Video Telephony were only slightly better than those of existing video conferencing systems.


Also in the mid-2000s, the H.264 standard was created, providing even more efficient encoding, higher quality video at low bit rates, and support for high-definition video (720p and 1080p) at higher bit rates (~2 Mbps and above). At the time this book was written, video conferencing vendors such as Polycom, Tandberg, LifeSize, and others had begun offering high-definition-capable endpoints, hoping that the improved image quality would breathe life back into the video conferencing market. High-definition video conferencing endpoints are only now being deployed and used in good numbers, so only time will tell the fate of these next-generation video conferencing endpoints. At the same time, Microsoft, Cisco, and others began a renewed effort to push collaboration applications to the desktop, with applications such as Microsoft Office Communicator and Cisco Unified Personal Communicator.


As early as 2000, telepresence systems started to show up on the market, providing a more holistic approach to creating a virtual meeting experience than existing video conferencing systems. Rather than focus on providing a low-cost video conferencing experience, they focused on providing a high-quality, immersive experience, so people felt as if they were actually in the same room together. Using multiple screens and cameras, they divided the meeting room in half, positioning the screens, cameras, tables, and chairs in such a way as to mimic the feeling that everyone in the meeting was sitting at the same table. These early telepresence pioneers were small, privately owned companies serving a relatively small niche market.


In 2004, Hewlett-Packard was the first large, multinational vendor to bring an immersive telepresence system, immediate credibility, and increased focus to the telepresence market. The Halo Telepresence system offered a white glove service and targeted the executive meeting rooms. Rooms within a room were built to provide the proper environment, and a new dedicated Halo Exchange Video Network (HEVN) was introduced, providing a fully managed telepresence service. However, even with the backing of a major technology company, telepresence was still challenged with limited adoption and slow growth rates.


Early telepresence systems were targeted at the executive ranks with an expensive high-touch model. Due to this executive level approach, vendors offered turnkey solutions and required a fully managed service over a dedicated network. This approach seemed like a good idea at the time because there were similar video conferencing solutions with the same model. However, the high system costs coupled with the high recurring fee for the managed service severely limited their deployment. Even though many customers’ networks were not ready for telepresence at the time, customers realized that providing separate networks for specific applications was not the right path in the long term.


In late 2006, Cisco entered the market with its first telepresence system focused on providing a truly immersive experience, deployable over existing IP networks and usable by anyone within a company. Cisco leveraged its vast networking knowledge to design its TelePresence system to run over converged IP networks. At the same time, a grassroots effort was underway that would provide design guidance for service providers looking to offer Cisco TelePresence as a hosted or managed service. Additionally, this design guidance allowed service providers to offer an Inter-Company solution for Cisco TelePresence. This work also allowed Cisco TelePresence to extend past companies’ intranetwork boundaries for the first time, further broadening the power of Cisco TelePresence. Cisco TelePresence Inter-Company offerings continue to expand, providing even more momentum to Cisco TelePresence. As previously discussed, early telepresence systems were supported on overlay networks and managed by providers, which severely limited the proliferation of telepresence systems. Providing telepresence over converged IP networks requires systems that provide standard management tools, security, and a well-defined network architecture. Figure 3 shows the first Cisco TelePresence system: the CTS-3000.




Figure 3: Cisco TelePresence CTS-3000

What Is Telepresence?

Telepresence as a concept has been around for many years and can be applied to a large number of applications. These range from virtual dining-room applications, in which people are made to feel that they are sharing a meal at the same table together, to mystical “beam me up, Scotty” scenarios, such as projecting a presenter onto a stage using holographic projection technologies so the presenter appears to be standing on stage in front of the audience. Any immersive application that makes one person feel as though another person is physically present in their environment with them can be called telepresence. Wikipedia.org defines telepresence as “a set of technologies which allow a person to feel as if they were present, to give the appearance that they were present, or to have an effect, at a location other than their true location.” Although these types of applications are “cool,” their usefulness has so far been isolated to niche markets, one-off events, or futuristic research studies.

However, there is one application in particular in which telepresence has found a viable market: the business meeting. In today’s global economic climate, companies are hungry for technologies that enable them to communicate with their customers, partners, and employees more frequently and more effectively. They want to speed their decision-making processes, allow geographically separated groups to collaborate more effectively on projects, increase intimacy with their customers, and lower their costs of doing business. Business travel is at an all-time high, yet traveling is expensive, time-consuming, and takes a toll on people’s bodies and personal lives.

However, the market is skeptical of video technologies that promise to deliver these benefits. For years the video conferencing industry has promised that it would replace the need for face-to-face meetings and lower travel costs, but for the vast majority of companies that have deployed it, video conferencing has for the most part failed to deliver on those promises. Video conferencing has struggled for years with complicated user interfaces, lack of integrated scheduling, and in many cases poor video quality. These issues have directly impacted overall utilization rates and caused major skepticism about the true value of video’s use as a communications tool.

Many people refer to telepresence as high-end video conferencing, or the “next generation” of video conferencing. Many people categorize any video conferencing system that provides high-definition video and wideband audio as telepresence, but in reality telepresence is its own unique video technology. Telepresence is much more than just high-definition video and wideband audio. Providing a true telepresence experience requires attention to details overlooked in most video conferencing environments. Later in this chapter, video conferencing and telepresence are compared to highlight the differences between the two technologies.