
Video Conferencing

This tutorial introduces video conferencing over the Internet: the bandwidth and delay problems involved, and the compression technologies and products used to overcome them.

 

1. Introduction

The World Wide Web is no longer silent, and it is no longer standing still. New technologies have brought sound, video, and animation in place of text-only HTML pages and still graphics. By itself that is nothing remarkable, every CD-ROM does it; the difficulty begins when we demand that the communication be streaming.

Streaming, in the context of communication, means that the audio/video data is transmitted while it is being created and converted at the receiving point into continuous video and sound. One can add the demand of "real time", i.e., that the delay between the physical input at one side (sound from a microphone or a picture from a video camera) and the physical output at the other side of the communication line be small enough to allow normal interaction, in which both sides are unaware of the delay.

2.1. Video Bandwidth needs

Streaming technologies are designed to overcome the fundamental problem facing multimedia elements distributed over the Web: limited bandwidth. While your 28.8-kbps modem or 128-kbps ISDN connection may seem screamingly fast, it pales in comparison even to an ancient single-spin CD-ROM drive, which can transfer 150 KB of data each second.

Where most of us think in bytes per second, communication lines are measured in bits, which come eight to a byte. Your 28.8-kbps modem has a throughput of about 3.6 KB per second, roughly 1/40 the speed of that ancient CD-ROM drive.

As an example, let's take the simple case of transferring sound from one computer to another through a modem. To sample the voice, we connect a microphone to a standard sound card using a single audio channel, with a sample rate of 8 kHz at 8 bits per sample. The binary stream is then passed to a V.34 modem, which under optimal conditions can transmit up to 28,800 bps. Since the sound card generates 64,000 bps, compression is needed, and most modems can perform it. The receiving side has to reverse the process: decompress the data and feed the sound card with a continuous stream.

CD-quality audio is sampled 44,100 times per second at 16 bits per channel. In our case, with a single channel, the bandwidth we need is 705,600 bits per second.

The first "compression" is simply to sample at 8 kHz instead of 44.1 kHz, and at 8 bits instead of 16. That reduces the bandwidth to about 1/11 of the original, and the result is lower sound quality, which shows that we have used a lossy compression technique.
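The arithmetic above is easy to check. The sketch below (the stereo figure is an added assumption; everything else comes from the numbers in the text) reproduces the bit rates and the resulting compression factors:

```python
# Bandwidth arithmetic for the audio example above (figures in bits per second).

def audio_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_mono   = audio_bitrate(44_100, 16)       # 705,600 bps, the CD-quality figure above
cd_stereo = audio_bitrate(44_100, 16, 2)    # 1,411,200 bps, assumed two-channel case
phone     = audio_bitrate(8_000, 8)         # 64,000 bps: 8 kHz, 8-bit, one channel
modem     = 28_800                          # V.34 modem under optimal conditions

print(f"Downsampling alone: {cd_mono / phone:.2f}:1")                  # ~11:1
print(f"CD mono over a 28.8 kbps modem:   {cd_mono / modem:.1f}:1")    # ~24.5:1
print(f"CD stereo over a 28.8 kbps modem: {cd_stereo / modem:.1f}:1")  # ~49:1
print(f"Modem throughput: ~{modem / 8 / 1000:.1f} KB per second")      # ~3.6 KB/s
```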

When trying to transfer live video, the bandwidth problem becomes much more critical. To transfer voice through a modem we needed a compression factor of between 25 and 50; to transfer a video picture captured by a low-resolution video camera, we need a compression ratio of between 2,500 and 5,000. Note that the resolution of the eye is at least 100 times higher than the resolution of the camera, and the eye's viewing angle is at least twice as large as the camera's. Hence, it seems that in the near future no compression will achieve this target without fundamental changes in communication bandwidth.
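For video the same kind of arithmetic gives far larger numbers. The camera parameters in the sketch below (a 352 by 288 frame, 24-bit colour, 30 frames per second) are illustrative assumptions, chosen only to show how a raw compression requirement in the thousands arises:

```python
# Raw bit rate of an uncompressed low-resolution video stream, and the
# compression ratio needed to squeeze it through common connections.
# The camera parameters are illustrative assumptions, not figures for a
# specific product.

width, height  = 352, 288     # assumed capture resolution
bits_per_pixel = 24           # assumed 8 bits each for R, G, B
frames_per_sec = 30

raw_bps = width * height * bits_per_pixel * frames_per_sec
print(f"Raw video: {raw_bps / 1_000_000:.1f} Mbps")                    # ~73 Mbps

for name, link_bps in [("28.8 kbps modem", 28_800), ("128 kbps ISDN", 128_000)]:
    print(f"{name}: needs ~{raw_bps / link_bps:,.0f}:1 compression")   # ~2,534:1 and ~570:1
```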

Increasing the bandwidth is possible in several ways, all of them expensive:

Replacing the analog phone lines with ISDN lines, which provide 128 Kbps of bandwidth, more than 4 times the fastest modem. Large organizations can rent digital leased lines running from 256 Kbps up to 34 Mbps.

In the near future, cable modems connected to the cable-TV network will exploit the high bandwidth of coaxial cable. Further ahead, fiber-optic cables, which in the lab already achieve bandwidths of billions of bits per second, will be used; in theory we are still far from exploiting their potential.

2.2. Internet Delays

The Internet will not replace the telephone system. There are a number of basic differences between the two networks that result in very different and distinctive performance characteristics. The most obvious difference is that the telephone system is based on circuit switching of analog signals, whereas the Internet is based on digital packet switching. The phone system's main strength is its ability to transmit real-time continuous speech, but it is poorly suited to transmitting data. The Internet's main strength is its ability to transmit asynchronous data anywhere in the world, but it is unreliable when the data must be delivered in real time.

Unfortunately, the Internet is notoriously unpredictable when it comes to transmission performance. Heavy traffic load and internal transmission problems can cause delays that are beyond anyone's control, resulting in disrupted speech and video reproduction at the destination computer. Unlike digital cellular or radio phones, however, there is no loss of data: disrupted speech in an Internet transmission is purely a gap in the reproduction of the data stream. As the Internet grows and expands in overall bandwidth, this problem should become steadily less evident.

3. Solutions & Technologies in use

3.1. Video compression - introduction

The compression of the video signal is done in a number of stages, each of which loses some visual information. In the first stage, the video camera captures only a small part of the viewing area, at low resolution. In the second stage, the analog signal coming from the camera is converted to a digital one at about 18 Mbps. In the third stage, the three-dimensional RGB colour representation is converted to a two-component representation in which the absolute intensity is separated from the direction of the colour vector, and the resolution of the picture is reduced further, to only 320 by 240 pixels. At this stage the bit rate is cut to below 4 Mbps in a "brutal" way, before the smart compression begins.
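A rough illustration of that intensity/colour separation and resolution reduction is sketched below. The BT.601 luma weights and the block-averaging downsample are assumptions made for the sketch; actual capture hardware may use a different colour space and filter:

```python
import numpy as np

def rgb_to_luma_chroma(frame_rgb):
    """Split an RGB frame into luma (intensity) and two colour-difference channels.

    The BT.601 weights are an illustrative choice, not a claim about any product.
    """
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b   # absolute intensity
    cb = b - y                                # colour difference (blue)
    cr = r - y                                # colour difference (red)
    return y, cb, cr

def downsample(channel, factor=2):
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    h, w = channel.shape
    h, w = h - h % factor, w - w % factor
    blocks = channel[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Example: shrink an assumed 640x480 capture to 320x240 before the "smart" compression.
frame = np.random.randint(0, 256, (480, 640, 3)).astype(float)
y, cb, cr = rgb_to_luma_chroma(frame)
print(downsample(y).shape)   # (240, 320)
```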

3.2. Compression techniques

The real challenge is to compress the picture by a factor of 240 up to more than 1,000 without reducing it to an unrecognizable abstraction. A compression factor of 240:1 is enough for transmission over a double ISDN line, and a factor of 1,067:1 is required for transmission over a 28.8 Kbps modem. This enormous compression ratio can be achieved in a couple of ways:

Looking at a sequence of video frames as three-dimensional information: two dimensions, horizontal and vertical, with time as the third. Compression in the time dimension is different from compressing a static 2D image, because over time only the changes from frame to frame need to be transmitted. Starting from a base frame, the following frames can be reconstructed by transferring the small amount of data that describes the differences between two consecutive frames, until a new base frame has to be transferred.
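A minimal sketch of this frame-differencing idea (grayscale frames and a plain pixel difference are simplifying assumptions; practical codecs use motion-compensated differences):

```python
import numpy as np

def encode_sequence(frames):
    """Encode a list of grayscale frames as one base frame plus per-frame deltas."""
    base = frames[0]
    deltas = [frames[i] - frames[i - 1] for i in range(1, len(frames))]
    return base, deltas

def decode_sequence(base, deltas):
    """Rebuild the original frames from the base frame and the deltas."""
    frames = [base]
    for d in deltas:
        frames.append(frames[-1] + d)
    return frames

frames = [np.random.randint(0, 256, (240, 320)).astype(np.int16) for _ in range(5)]
base, deltas = encode_sequence(frames)
restored = decode_sequence(base, deltas)
assert all(np.array_equal(a, b) for a, b in zip(frames, restored))
# Deltas between similar consecutive frames are mostly near zero, so they compress
# far better than full frames; a new base frame is sent when the scene changes.
```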

3.2.1. MPEG compression

Another technique is to interpolate: to guess how the picture evolves from the beginning of a sequence to its end. MPEG compression is based on this method and on three kinds of frames, called the I-frame (intra frame), the P-frame (predicted frame), and the B-frame (bi-directional interpolated frame).

The 2D JPEG compression of a static image is used to compress the base frame, the I-frame. The JPEG algorithm is based on a spectral analysis of the frame, concentrating on the major frequency components of the image; the required image quality sets the compression ratio. Adjusting the compression in the time dimension sets the number of interpolated frames (B-frames) and predicted frames (P-frames) that are stored between one base frame and the next.
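A small sketch of such a frame schedule follows. The particular group-of-pictures layout generated (one I-frame followed by alternating B- and P-frames) is just one common arrangement, used here for illustration:

```python
# Build an illustrative MPEG-style frame schedule: each group of pictures (GOP)
# starts with an I-frame and fills the gap with P- and B-frames.

def gop_pattern(gop_size=12, p_spacing=3):
    """Return the frame types for one group of pictures, e.g. 'IBBPBBPBBPBB'."""
    pattern = []
    for i in range(gop_size):
        if i == 0:
            pattern.append("I")   # intra frame: a full JPEG-style picture
        elif i % p_spacing == 0:
            pattern.append("P")   # predicted from earlier frames
        else:
            pattern.append("B")   # interpolated from both directions
    return "".join(pattern)

print(gop_pattern())        # IBBPBBPBBPBB
print(gop_pattern(6, 2))    # IBPBPB
```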

3.2.2. Wavelet compression

Different applications handle each kind of content with different degrees of success. Some preserve the static details better (but produce unacceptable jumps between frames), while others maintain continuous motion (but only the coarse details remain recognizable). Today's compression technology allows transferring continuous video at 15 frames per second and a resolution of 320 by 240 pixels over a 256 Kbps digital line. To achieve similar quality over standard telephone lines, a further 10:1 compression ratio is required. One of the technologies that might deliver this compression ratio is wavelet compression.

This technology is used by the Israeli product VDOLive, which has been chosen by some important Web sites as the basis for their video services; it seems that this product and this technology are going to lead the market. One of the most attractive aspects of wavelet compression is its ability to degrade gracefully: the same compression creates different "layers" of detail, of different quality, so the quality can be improved by combining more and more layers into the frame when the bandwidth allows it.
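A toy sketch of this layered idea, using one level of the Haar wavelet transform on a single scan line (the Haar transform and the two-layer split are illustrative simplifications, not VDOLive's actual algorithm):

```python
import numpy as np

def haar_1d(signal):
    """One level of the Haar wavelet transform: coarse averages plus fine details."""
    pairs = signal.reshape(-1, 2)
    coarse = pairs.mean(axis=1)                  # low-frequency "base layer"
    detail = (pairs[:, 0] - pairs[:, 1]) / 2     # high-frequency "enhancement layer"
    return coarse, detail

def reconstruct(coarse, detail=None):
    """Rebuild the signal; omitting the detail layer gives a lower-quality version."""
    if detail is None:
        detail = np.zeros_like(coarse)
    out = np.empty(coarse.size * 2)
    out[0::2] = coarse + detail
    out[1::2] = coarse - detail
    return out

line = np.array([10., 12., 50., 52., 30., 30., 8., 6.])
coarse, detail = haar_1d(line)
print(reconstruct(coarse))          # base layer only: a blockier approximation
print(reconstruct(coarse, detail))  # both layers: exact reconstruction
```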

4. Video Encoding and Decoding

4.1. Embedding Multimedia into the Web page

After understanding the difficulties of transferring audio/video files over the Internet, the limitations that must be considered become clearer, especially when creating the video files. A reduced viewing area with a low-detail background should be chosen; the objects appearing in the frames, and the camera itself, should avoid fast movements; and fancy stereo effects should also be avoided.

The conversion of the analog signals to digital form is done by the audio card and the video capture card. Nowadays every standard audio card can provide the performance needed for "audio over the Internet", and video capture likewise does not require professional equipment.

It is worth recalling that, as a medium, video is much more demanding than audio both technically and artistically. You can achieve surprisingly good audio results in a quiet room with a simple microphone and a Sound Blaster. By contrast, video demands proper acting, lighting, staging, and professional-level equipment to achieve good quality, even when you have the luxury of double-spin CD-ROM playback. With video made for 28.8-Kbps, 64-Kbps, or even 128-Kbps connections, high production standards are absolutely critical.

Before starting to create a site that supports video, Web authors must set minimum artistic and technical quality levels by evaluating what the video is meant to deliver to the end user. They must also check whether the desired quality is obtainable at the connection speed typically used by the target audience.

The audio and video files should now be inserted into the HTML pages, which serve as a graphical environment for the video window. The binary file is stored separately from the page and identified by a MIME (Multipurpose Internet Mail Extensions) type, the standard mechanism for content that is not written in HTML. Like other hypertext links, it is linked to the page via a small "dummy" metafile that is referenced from the HTML and points to the location of the actual media file.

4.2. The Player

A media player must be installed on the client computer, and it must be compatible with the specific file format and compression algorithm.

Players are most commonly integrated into the browser as add-on applications. Like other add-ons, players are registered with the browser and are loaded when the browser detects an incoming audio/video file in the appropriate format. Once loaded, the player functions as a standalone application. Players offer a range of features, which may include volume control, fast-forward, stop, and resume.

As mentioned above, the media file is linked to the HTML page by a "dummy" metafile. When the corresponding link is selected, the dummy file is fetched from the Web server and triggers a request to read the binary file from the media server. Through this chain, the audio and video information is transferred from server to server, and then across the net to the client host. The operation takes a few seconds, but after this initial waiting period the continuous audio and video are presented on the HTML page.
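A hypothetical sketch of that chain is shown below. The metafile format (a single line containing the media location) and the URL are assumptions made for illustration; each real product defines its own metafile format and hand-off to the player:

```python
# Hypothetical sketch of the "dummy" metafile chain described above.
# The one-line metafile format and the URL are illustrative assumptions only.
from urllib.request import urlopen

def resolve_media_url(metafile_url):
    """Fetch the small metafile from the Web server and read the media location."""
    with urlopen(metafile_url) as resp:
        return resp.read().decode().strip()   # e.g. an address on the media server

def play(metafile_url):
    media_url = resolve_media_url(metafile_url)
    print(f"Handing {media_url} to the registered player add-on...")
    # The player then opens its own connection to the media server and starts
    # decoding the stream as data arrives, instead of waiting for a full download.

# play("http://www.example.com/video/clip.meta")   # hypothetical URL
```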

The description above may make the required technology seem far from trivial; nevertheless, there are products that make it possible.

5. Available products

Two products currently support streaming video technology: Stream Works, from Xing Technology Corp., and VDOLive, from VDOnet Corp.

VDOLive uses a proprietary algorithm based on wavelet compression, while Xing uses standard MPEG compression; this main difference between the products affects the overall quality of the displayed video, both frame quality and refresh rate. Stream Works, which uses a derivative of MPEG-1, puts the emphasis on the quality of the static picture: over an extremely low-bandwidth connection, Xing sacrifices the frame rate but retains the basic quality of each picture. VDOLive uses the opposite strategy: when bandwidth constricts, fewer parts of each frame are transmitted, degrading video quality but preserving motion and minimizing audio breaks.

However, there are several other alternatives that might provide the required quality. For example, Shockwave lets you integrate Macromedia Director multimedia animations into Web pages. Playback occurs seamlessly inside the Navigator browser, so Shockwave elements appear as an integral part of the page. Shockwave is not a streaming technology, so the entire file must be downloaded before playback begins, but the presentation is then not limited by the bandwidth of the Internet connection.

5.1. Stream Works

5.1.1. Stream Works features:

Transmission protocol: UDP

Support for multipoint broadcasting

Input file formats: composite analog video

Scalable bandwidth: 28.8 Kbps to 256 Kbps

Video compression algorithm: MPEG-1

Audio compression algorithm: MPEG-1, MPEG-2, or LBR (low bit rate)

Minimum suggested connection speed: 9,600 bps

Resolution: 96 by 96 up to 352 by 240 pixels

Frames per second: 24 or 30

5.1.2. How it works

Xing uses an MPEG-1 derivative as its compression algorithm, which builds the video from a hierarchy of frames. Each group is composed of an intra frame (I-frame), which contains all the main details (the high-quality information) of a static picture, and "gap frames", which create the motion effect by filling the gaps between the I-frames. Two kinds of "gap frames" are used to fill in the missing details between two images (a process called interpolation): the predicted frame (P-frame), which is based on past images, and the bi-directional interpolated frame (B-frame), which is based on the first and last images in the sequence. These different frame types are used to maintain image quality at lower data rates.

Stream Works drops B- and P-frames, transmitting only the higher-quality I-frames. This scheme provides virtually unlimited control over the number and quality of frames delivered over the system each second. For example, at a configuration of 320 by 240 video resolution with an I-frame rate of 1 per second, over a 28.8-Kbps connection the screen may update only once every 4 or 5 seconds, but the frame quality remains consistently high. Whenever the system drops frames it drops all B- and P-frames, which are the frames most affected by motion under the MPEG compression system. In effect, that means motion has little impact on the video at lower bandwidths. This slow-frame-rate approach ensures that no one gets ugly pictures from the Internet connection.
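A back-of-the-envelope check of that example (the compressed I-frame size is an assumed, JPEG-like figure chosen to be consistent with the "once every 4 or 5 seconds" behaviour described above, not a published Xing number):

```python
# Why a 320x240 I-frame-only stream updates only every few seconds over a modem.
# The compressed I-frame size below is an illustrative assumption.

modem_bps = 28_800
i_frame_bytes = 16_000            # assumed size of one compressed 320x240 I-frame

seconds_per_frame = i_frame_bytes * 8 / modem_bps
print(f"One I-frame every ~{seconds_per_frame:.1f} seconds")   # ~4.4 s
```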

5.2. VDOLive

5.2.1. VDOLive features:

Transmission protocol: TCP/IP, UDP

No support for multipoint broadcasting

Input file formats: .AVI

Scalable bandwidth: 14.4 Kbps to 256 Kbps

Compression algorithm: wavelet

Minimum suggested connection speed: 14,400 bps

Maximum resolution: 240 by 180 pixels

Frames per second: 15

5.2.2. How it works

VDOLive uses the wavelet algorithm, which offers high quality at low bandwidths together with scalability. Conventional compression techniques are based on a spectrum analysis (such as the Fourier transform) of the image; compression is achieved by ignoring the high-frequency components, which represent small local changes in the image, and choosing a compression factor sets the frequency range that will be compressed or even discarded. The wavelet algorithm does not involve a time-to-frequency transformation; instead, it divides each video frame into multiple layers, each of which provides additional detail and image quality. The compression factor sets the number of layers transmitted to the user. When sufficient bandwidth exists, all layers are transmitted to the viewer and quality is optimized. When bandwidth constricts, fewer layers are transmitted, degrading video quality but preserving motion and minimizing audio breaks. This approach allows VDOnet to create a single high-bandwidth file that can serve different kinds of connection speeds.
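A small sketch of that layer-selection idea (the per-layer bit costs are invented, illustrative numbers, not VDOLive's real figures):

```python
# Choose how many detail layers of a frame fit into the available bandwidth.
# The layer sizes below are made up for illustration only.

def layers_to_send(layer_bits, available_bps, frames_per_sec):
    """Return how many cumulative layers fit into the per-frame bit budget."""
    budget = available_bps / frames_per_sec
    used, count = 0, 0
    for bits in layer_bits:          # layer 0 is the coarse base, then finer detail
        if used + bits > budget:
            break
        used += bits
        count += 1
    return count

layer_bits = [1_500, 2_000, 3_000, 6_000]   # assumed bits per layer of one frame

for name, bps in [("28.8 kbps modem", 28_800), ("128 kbps ISDN", 128_000), ("256 kbps line", 256_000)]:
    n = layers_to_send(layer_bits, bps, frames_per_sec=15)
    print(f"{name}: send {n} of {len(layer_bits)} layers per frame")
```

With these assumed numbers the modem carries only the base layer, the ISDN line adds most of the detail, and the 256 Kbps line carries every layer, which is exactly the graceful degradation described above.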

6. Future Direction

As the Internet grows in global size and bandwidth, and as computer technology increases in speed and drops in price, Internet Video Conferencing will become increasingly more feasible. Ultra-efficient CODEC software, full duplex sound boards, and fast Internet connections will soon bring low-cost, CD-quality telephony and high quality video to all Internet users. The next generation of Internet Video Conferencing will also include a number of interesting features such as voice mail and on-the-fly data encryption. Soon distance will no longer be a factor in video and voice-based communications, replaced by a very reasonable flat monthly rate - much less than the average long distance bill right now. We are at the very beginning of a revolution in communications, and Internet Video is the ultimate first step.