``Many interesting applications have arisen whose computational demands for scaling and timeliness stress even our current supercomputers.'' These applications arise from a variety of sources, including physics, engineering, biology, sociology, and national defense, to name just a few, and many of them have characteristics that make them ill-suited to conventional high-performance architectures. As conventional parallel architectures reach the limits of their scalability, revolutionary new architectures are needed to achieve the performance demanded by these increasingly large and complex applications. This dissertation evaluates one such architecture, a lightweight multithreaded architecture, and explores how multithreaded applications can be organized to take advantage of the architecture's key features.

The lightweight multithreaded architecture is composed of a network of lightweight processors and memories. The lightweight processors use large numbers of lightweight threads of execution to tolerate latencies that conventional architectures cannot avoid. When a given thread of execution is stalled waiting on a long-latency event, such as a memory access, the processor can switch to a different thread rather than sitting idle. If enough threads are available, the processor remains fully utilized, effectively hiding the latency of the memory access. Direct architectural support for lightweight threads with minimal state and low-overhead, memory-based synchronization enables threads to be started, stopped, suspended, and synchronized in only a few cycles. This allows applications to exploit fine-grained parallelism with large numbers of dynamic threads and intense synchronization that would restrict performance on traditional parallel architectures.

This dissertation addresses two main questions:
\begin{enumerate}
\item How should the architecture be configured to be effective with applications?
\item How should the applications be organized to effectively use the architecture?
\end{enumerate}
To answer these questions, the experiments and analysis focus on understanding how key factors at the architectural and application levels interact to affect performance. The dissertation uses an organized approach to evaluation that targets the modeling, simulation, and evaluation tools to the issue under investigation, avoiding the need to simulate and build the entire machine. It then integrates these results to propose an architectural configuration and an application programming model. This work represents the first attempt to code and compare real applications using the architecture's unique features.
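
As a rough illustration of the latency-hiding argument above, consider a simplified model that is an assumption of this sketch rather than a result of the dissertation: each thread executes for $R$ cycles between long-latency events, each such event takes $L$ cycles, and switching between threads costs nothing. With $N$ threads per processor, the processor utilization $U$ is then approximately
\[
U \approx \min\left(1, \frac{N R}{R + L}\right),
\]
so the processor stays fully utilized once $N \ge 1 + L/R$. For example, the hypothetical values $R = 10$ and $L = 100$ would call for at least $11$ threads per processor. The symbols $R$, $L$, $N$, and $U$ are introduced here only for illustration.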