Applying deep learning methods to solve high-dimensional and nonlinear differential equations (DEs) has attracted considerable attention recently. One goal of using machine learning for systems of differential equations is to train a surrogate model that incorporates prior physics information and generates stable, accurate predictions. However, training such models for high-dimensional or nonlinear multi-scale ODE or PDE systems with limited labeled data is a grand challenge, and the proper design of the neural network architecture is still poorly understood.

This dissertation explores data-driven methods for modeling and predicting systems governed by differential equations. To address the training difficulties that a regular physics-informed neural network (PINN) encounters when learning nonlinear switch systems with imbalanced scales, we propose a novel PINN-based neural network model. We explore and incorporate batch statistics in physics-constrained loss functions. Numerical results are demonstrated on three examples using semi-supervised and supervised learning algorithms with a small batch of the signal dataset.

For learning systems of PDEs, this work proposes a sequence-to-sequence supervised learning model named Neural-PDE. Unlike conventional machine learning approaches for learning PDEs, such as CNNs and MLPs, which require a large number of parameters to achieve model precision, the Neural-PDE uses an RNN-based structure that shares parameters across all time steps. The Neural-PDE thus considerably reduces computational complexity and leads to a fast learning algorithm. We showcase the predictive power of the Neural-PDE by applying it to problems ranging from 1D PDEs to a multi-scale complex fluid system.

Motivated by these methodologies for learning systems of differential equations, we develop a machine learning framework for learning the dynamics of time-dependent oceanic variables across multiple detection sensors.
The prediction accurately replicates complex signals and provides comparable performance to state-of-the-art benchmarks.