Nanomagnet Logic (NML) is an emerging technology being studied as a possible replacement or supplementary device for Complimentary Metal-Oxide-Semiconductor (CMOS) Field-Effect Transistors (FET) by the year 2020. NML devices offer numerous potential advantages including: low energy operation, steady state non-volatility, radiation hardness and a clear path to fabrication and integration with CMOS. However, maintaining both low-energy operation and non-volatility while scaling from the device to the architectural level is non-trivial as (i) nearest neighbor interactions within NML circuits complicate the modeling of ensemble nanomagnet behavior and (ii) the energy intensive clock structures required for re-evaluation and NML's relatively high latency challenge its ability to offer system-level performance wins against other emerging nanotechnologies. Thus, further research efforts are required to model more complex circuits while also identifying circuit design techniques that balance low-energy operation with steady state non-volatility. In addition, further work is needed to design and model low-power on-chip clocks while simultaneously identifying application spaces where NML systems (including clock overhead) offer sufficient energy savings to merit their inclusion in future processors. This dissertation presents research advancing the understanding and modeling of NML at all levels including devices, circuits, and line clock structures while also benchmarking NML against both scaled CMOS and tunneling FETs (TFET) devices. This is accomplished through the development of design tools and methodologies for (i) quantifying both energy and stability in NML circuits and (ii) evaluating line-clocked NML system performance. The application of these newly developed tools improves the understanding of ideal design criteria (i.e., magnet size, clock wire geometry, etc.) for NML architectures. Finally, the system-level performance evaluation tool offers the ability to project what advancements are required for NML to realize performance improvements over scaled-CMOS hardware equivalents at the functional unit and/or application-level.