The growth of the Internet and wireless networks has spurred a wide range of digital video applications. Network video, such as digital video broadcasting, streaming, or surveillance, involves various networks and diverse clients. The networks may have varying bandwidths, loss rates, and best-effort or Quality of Service (QoS) capabilities. Video transmission over heterogeneous networks suffers from delay, congestion, losses, and errors. Video clients also tend to differ in system resources and display resolutions. Often, these conditions are unknown in advance. In such an environment, reliable delivery of video over error-prone networks requires better error resilience and flexible rate control. These requirements complicate the usual codec design tradeoff between bit rate and quality. The design of robust and scalable video codecs is therefore desirable and is key to the success of these applications. Robust video coding plays an important role in limiting error propagation and improving visual quality when errors occur. It addresses the issue of error concealment by designing proper coding structures and maintaining acceptable redundancy while minimizing complexity. Scalable video coding encodes a video sequence in such a way that multiple levels of quality can be obtained, depending on which parts of the video bit stream are available. It has the potential for high flexibility and error resilience because of its separable bit stream structure. Conventional layered video coding provides nested scalability among different parts of the coded bit stream. In layered coding, the layers must be decoded in a fixed order for quality to improve. The emerging multiple description coding provides parallel scalability, which enables decoding using whatever information is available. In-depth research has revealed that multiple description coding has great potential in best-effort networks, while layered coding performs better when QoS is available.
When tight delay constraints are imposed, as in most real-time applications such as video conferencing and video streaming, multiple description coding becomes the best choice. The price of this flexibility is additional coder complexity and redundancy in the coded content, the extent of which depends on the actual algorithm. The work in this dissertation aims to develop new approaches to robust and scalable video coding that meet the requirements of reliable delivery of video to diverse clients over heterogeneous networks. In the study of robust video coding technologies, we propose an error-resilient video coding algorithm using reference diversity. To increase the robustness of multiple description video streams, a modified multiple state video coding scheme is introduced for better error concealment results. Significant quality improvement is observed with the enhanced algorithm, and the complexity of error concealment is also reduced, which can benefit power-constrained clients. A hybrid scalable video coding algorithm with both layered and parallel scalability is proposed based on multiple state video coding. Quality scalability is achieved by layered coding, while temporal scalability is implemented using the multiple description principle. Experimental evidence indicates that it achieves better error concealment in networks with high error rates.
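
The contrast between nested (layered) and parallel (multiple description) scalability, and the temporal splitting behind multiple state video coding, can be illustrated with a minimal sketch. It is not the algorithm developed in this dissertation. It assumes two descriptions formed from the even- and odd-indexed frames, represents each frame as a single number standing in for a pixel array, and conceals a lost description by averaging neighbouring frames. All function names are hypothetical.

```python
def usable_layers(received):
    """Layered coding: layer k helps only if layers 0..k-1 all arrived."""
    count = 0
    for ok in received:
        if not ok:
            break  # everything above the first missing layer is unusable
        count += 1
    return count


def usable_descriptions(received):
    """Multiple description coding: every received description contributes."""
    return sum(1 for ok in received if ok)


def split_descriptions(frames):
    """Split a sequence into two descriptions: even- and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def conceal_odd(even_frames, total):
    """Rebuild `total` frames when only the even description was received.

    Each missing odd frame is estimated as the average of its two even
    neighbours, or as a copy of the previous frame at the end of the sequence.
    """
    out = []
    for i in range(total):
        if i % 2 == 0:
            out.append(even_frames[i // 2])
            continue
        prev = even_frames[i // 2]
        next_index = i // 2 + 1
        if next_index < len(even_frames):
            out.append((prev + even_frames[next_index]) / 2)
        else:
            out.append(prev)
    return out


if __name__ == "__main__":
    # Same loss pattern, different outcome: the second unit is lost.
    pattern = [True, False, True]
    print(usable_layers(pattern), usable_descriptions(pattern))

    frames = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    even, odd = split_descriptions(frames)
    print(conceal_odd(even, len(frames)))
```

With the same loss pattern, the layered decoder can use only the units below the first loss, whereas the multiple description decoder uses every unit that arrives. This is the sense in which the scalability is nested in the first case and parallel in the second.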