Using Newtonian PINNs

Traditional Seq2Seq LSTM models have long been the workhorse for trajectory forecasting. They excel at learning temporal dependencies in AIS data, and with attention mechanisms they can capture complex nonlinear patterns over long histories. However, they remain purely statistical. This means the model can generate plausible-looking trajectories from a data perspective but with no guarantee that those predictions respect the underlying physics of vessel motion. In practice, this often manifests as sharp turns, unrealistic accelerations, or trajectories that deviate significantly when the model faces sparse or noisy data.

The NPINN-based approach directly addresses these shortcomings. By embedding smoothness and kinematic penalties into training, it enforces constraints on velocity and acceleration while still benefiting from the representational power of deep sequence models. Instead of simply fitting residuals between past and future positions, NPINN ensures that predictions evolve in ways consistent with how vessels actually move in the physical world. This leads to more reliable extrapolation, especially in data-scarce regions or unusual navigation scenarios.
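
To make this concrete, the short sketch below (illustrative only; the actual losses used on this page appear in the training code further down) shows how velocity and acceleration can be approximated from a predicted position sequence by finite differences, which is what the smoothness and kinematic penalties act on.

    import torch

    # positions: a [batch, T, 2] tensor of predicted (x, y) points
    positions = torch.randn(4, 10, 2)

    velocity = positions[:, 1:, :] - positions[:, :-1, :]        # first differences ~ velocity
    acceleration = velocity[:, 1:, :] - velocity[:, :-1, :]      # second differences ~ acceleration

    # Penalizing these magnitudes discourages sharp turns and unrealistic accelerations.
    kinematic_penalty = (velocity ** 2).mean() + (acceleration ** 2).mean()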

Preprocessing

The first step in building a trajectory learning pipeline is preprocessing AIS tracks into a model-friendly format. Raw AIS messages are noisy, irregularly sampled, and inconsistent across vessels, so we need to enforce structure before feeding them into neural networks. The function below does several things in sequence:

  1. Data cleaning – removes spurious pings based on unrealistic speeds, encodes great-circle distances, and interpolates trajectories at fixed 5-minute intervals.

  2. Track filtering – groups data by vessel (MMSI) and keeps only sufficiently long tracks to ensure stable training samples.

  3. Feature extraction – converts lat/lon into projected coordinates (x, y), adds speed over ground (sog), and represents course over ground (cog) as sine/cosine to avoid angular discontinuities.

  4. Delta computation – calculates dx and dy between consecutive timestamps, capturing local motion dynamics.

  5. Scaling – applies RobustScaler to normalize features and deltas while being resilient to outliers (common in AIS data).

The result is a clean, scaled DataFrame where each row represents a vessel state at a timestamp, enriched with both absolute position features and relative motion features.
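
As a quick illustration of why course over ground is encoded as sine/cosine (step 3 above): headings of 1° and 359° describe nearly the same direction, and the encoding keeps them numerically close, which raw degrees do not. This check is illustrative and not part of the pipeline.

    import numpy as np

    for cog_deg in (1.0, 359.0):
        cog_rad = np.radians(cog_deg)
        # Both headings map to cos ≈ +1.0 and sin ≈ ±0.017, so they stay close.
        print(f"cog={cog_deg:5.1f}°  sin={np.sin(cog_rad):+.3f}  cos={np.cos(cog_rad):+.3f}")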

We first query the AIS database for the training and testing periods and geographic bounds, producing generators of raw vessel tracks. These raw tracks are then preprocessed using preprocess_aisdb_tracks, which cleans the data, computes relative motion (dx, dy), scales features, and outputs a ready-to-use DataFrame. Training data fits new scalers, while test data is transformed using the same scalers to ensure consistency.

The create_sequences function transforms the preprocessed track data into supervised sequences suitable for model training. For each vessel, it slides a fixed-size window over the time series, building input sequences of past absolute features (x, y, dx, dy, cog_sin, cog_cos, sog_scaled) and target sequences of future residual movements (dx, dy). Using this, the dataset is split into training, validation, and test sets, with each set containing sequences ready for direct input into a trajectory prediction model.
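
For orientation, here is the window arithmetic on a made-up track length, using the function's defaults of input_size=80 and output_size=2:

    # Illustrative only: number of supervised windows produced from one track
    track_len, input_size, output_size, step = 200, 80, 2, 1
    n_windows = len(range(0, track_len - input_size - output_size + 1, step))
    print(n_windows)   # 119 windows, each X of shape (80, 7) and Y of shape (2, 2)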

Save the Dataset

This block saves all processed data and supporting objects needed for training and evaluation. The preprocessed input and target sequences for training, validation, and testing are serialized using PyTorch (datasets_npin.pt). The fitted scalers for features, speed, and residuals are saved with joblib to ensure consistent scaling during inference. Finally, the projection parameters used to convert geographic coordinates to UTM are stored in JSON, allowing consistent coordinate transformations later.

Model

Seq2SeqLSTM is the sequence-to-sequence backbone used in the NPINN-based trajectory prediction framework. Within NPINN, the encoder LSTM processes past vessel observations (positions, speed, and course) to produce a hidden representation that captures motion dynamics. The decoder LSTMCell predicts future residuals in x and y, with an attention mechanism that selectively focuses on relevant past information at each step. Predicted residuals are added to the last observed position to reconstruct absolute trajectories.

This setup enables NPINN to generate smooth, physically consistent multi-step vessel trajectories, leveraging both historical motion patterns and learned dynamics constraints.
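
A small numeric illustration (made-up values) of how per-step residuals accumulate into absolute positions during decoding: each predicted (dx, dy) is added to the previously reconstructed point, starting from the last observed position.

    import torch

    last_obs = torch.tensor([1000.0, 2000.0])        # last observed (x, y), in meters
    residuals = torch.tensor([[12.0, -5.0],
                              [11.0, -4.0]])         # predicted per-step (dx, dy)

    absolute = torch.cumsum(residuals, dim=0) + last_obs
    print(absolute)   # tensor([[1012., 1995.], [1023., 1991.]])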

Model Training

This training loop implements NPINN-based trajectory learning, combining data fidelity with physics-inspired smoothness constraints. The weighted_coord_loss enforces accurate prediction of future x, y positions, while xy_npinn_smoothness_loss encourages smooth velocity and acceleration profiles, reflecting realistic vessel motion.

By integrating these two objectives, NPINN learns trajectories that are both close to observed data and physically plausible, with the smoothness weight gradually decaying during training to balance learning accuracy with dynamic consistency. Validation is performed each epoch to ensure generalization, and the best-performing model is saved. This approach differentiates NPINN from standard Seq2Seq training by explicitly incorporating motion dynamics into the loss, rather than relying purely on sequence prediction.
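
A minimal sketch of how the two terms are combined with a decaying smoothness weight; the schedule below mirrors the one used in the training loop shown later on this page:

    smooth_w_init = 1e-3

    for epoch in (0, 10, 20, 30, 40):
        # Decays linearly over roughly the first 30 epochs, then is floored at 10%.
        smooth_weight = smooth_w_init * max(0.1, 1.0 - epoch / 30.0)
        print(f"epoch {epoch:2d}: smooth_weight = {smooth_weight:.2e}")

    # Per batch: loss = loss_data + smooth_weight * loss_smooth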

Training

This training setup fixes the random seed for reproducibility (torch, numpy, random) and enables deterministic CuDNN behavior. It initializes a Seq2SeqLSTM NPINN model to predict future vessel trajectory residuals from past sequences of absolute features. The global min/max of the x, y coordinates are computed from the training set to normalize NPINN’s smoothness loss. The model is trained on GPU if available using Adam, combining a data loss on the predicted x, y positions with a physics-inspired smoothness penalty to enforce realistic, physically plausible trajectories.
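
The global min/max are used to map x and y into [-1, 1] inside the smoothness loss so the penalty does not depend on the coordinate scale; a short illustration with made-up bounds:

    import torch

    coord_min = torch.tensor([0.0, 0.0])
    coord_max = torch.tensor([5000.0, 8000.0])       # illustrative coordinate extents
    xy = torch.tensor([[2500.0, 8000.0]])

    xy_norm = (xy - coord_min) / (coord_max - coord_min + 1e-8)
    xy_norm = 2 * (xy_norm - 0.5)                    # maps into [-1, 1]
    print(xy_norm)                                   # ≈ tensor([[0., 1.]])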

Evaluate

This snippet sets up the environment for inference or evaluation of the NPINN Seq2Seq model:

  • Chooses GPU if available.

  • Loads preprocessed datasets (train_X/Y, val_X/Y, test_X/Y).

  • Loads the saved RobustScalers for features, SOG, and deltas to match preprocessing during training.

  • Loads projection parameters to convert lon/lat to projected coordinates consistently.

  • Rebuilds the same Seq2SeqLSTM NPINN model used during training and loads the best saved weights.

  • Puts the model in evaluation mode, ready for predicting future vessel trajectories.

Essentially, this is the full recovery pipeline for NPINN inference, ensuring consistency with training preprocessing, scaling, and projection.

This code provides essential postprocessing and geometric utilities for working with NPINN outputs and AIS trajectory data. The inverse_dxdy_np function is designed to convert scaled residuals (dx, dy) back into real-world units (meters) using a previously fitted scaler. It handles both 1D and 2D inputs, making it suitable for batch or single-step predictions. This is particularly useful for interpreting NPINN predictions in absolute physical units rather than in the normalized or scaled feature space, allowing for meaningful evaluation of the model’s accuracy in real-world terms. Using this, the code also computes the standard deviation of the residuals across the training dataset, providing a quantitative measure of typical displacement magnitudes along the x and y axes.
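
The inversion works because a fitted RobustScaler exposes its centering and scaling terms, so the original units are recovered as scaled * scale_ + center_; a small self-contained check on illustrative residuals:

    import numpy as np
    from sklearn.preprocessing import RobustScaler

    dxdy = np.array([[10.0, -5.0], [20.0, 2.0], [15.0, 0.0]])   # residuals in meters
    scaler = RobustScaler().fit(dxdy)

    scaled = scaler.transform(dxdy)
    recovered = scaled * scaler.scale_ + scaler.center_          # back to meters
    print(np.allclose(recovered, dxdy))                          # True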

The code also includes geometry-related helpers to analyze trajectories in geospatial terms. The haversine function calculates the geodesic distance between longitude/latitude points in meters using the haversine formula, with safeguards for numerical stability and invalid inputs. Building on this, the trajectory_length function computes the total length of a vessel’s trajectory, summing distances between consecutive points while handling incomplete or non-finite data gracefully. Together, these utilities allow NPINN outputs to be mapped back to real-world coordinates, facilitate evaluation of trajectory smoothness and accuracy, and provide interpretable metrics for model validation and downstream analysis.
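
As a quick sanity check of the haversine computation, one degree of latitude along a meridian should come out close to 111 km:

    import numpy as np

    R = 6371000.0                                   # Earth radius in meters
    lat1, lat2 = np.radians(0.0), np.radians(1.0)
    a = np.sin((lat2 - lat1) / 2.0) ** 2            # dlon = 0 along a meridian
    print(2 * R * np.arcsin(np.sqrt(a)))            # ≈ 111195 m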

Evaluate Function

This function evaluate_with_errors is designed to evaluate NPINN trajectory predictions in a geospatial context and optionally visualize them. It takes a trained model, a test DataLoader, coordinate projection, scalers, and device information. For each batch, it reconstructs the predicted trajectories from residuals (dx, dy), inverts the scaling back to meters, and converts them to absolute positions starting from the last observed point. Different decoding modes (cumsum, independent, stdonly) allow flexibility in how residuals are integrated into absolute trajectories, and it handles cases where the first residual is effectively a duplicate of the last input.
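
The difference between the cumsum and independent decoding modes is easiest to see on a toy residual sequence (illustrative numbers): cumsum integrates residuals step by step, while independent adds each residual to the same last observed point.

    import numpy as np

    last_obs = np.array([100.0, 200.0])
    resid_m = np.array([[10.0, 0.0], [10.0, 0.0], [10.0, 0.0]])   # meters per step

    cumsum_xy = np.cumsum(resid_m, axis=0) + last_obs   # [[110, 200], [120, 200], [130, 200]]
    indep_xy = resid_m + last_obs                        # [[110, 200], [110, 200], [110, 200]]
    print(cumsum_xy[-1], indep_xy[-1])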

The evaluation computes per-timestep errors in meters using the haversine formula and tracks differences in trajectory lengths. All errors are summarized with mean and median statistics across the prediction horizon. When plot_map=True, the function generates separate maps for each trajectory, overlaying the true (green) and predicted (red dashed) paths, giving a clear visual inspection of the model’s performance. This approach is directly aligned with NPINN, as it evaluates predictions in physical units and emphasizes smooth, physically plausible trajectory reconstructions.

Results

Trajectory Plots

  1. Trajectory 2

    • True length: 521.39 m

    • Predicted length: 508.66 m

    • Difference: 12.73 m

    The predicted path is roughly parallel to the true path but slightly offset. The predicted trajectory underestimates the total path length by ~2.4%, which is a small but noticeable error. The smoothness of the red dashed line indicates that the model is generating physically plausible, consistent trajectories.

  2. Trajectory 5

    • True length: 188.76 m

    • Predicted length: 206.01 m

    • Difference: 17.25 m

    Here, the predicted trajectory slightly overestimates the total distance (~9%), again with a smooth but slightly offset path relative to the true trajectory. The model captures the overall direction but has some scaling error in step lengths or residuals.

Summary Statistics

  • t=0 mean error: 45.03 m. The first prediction step already has a ~45 m average discrepancy from the true position, which is common since the model starts accumulating error immediately after the last observed point.

  • t=1 mean error: 80.80 m. Error grows with the horizon, reflecting the cumulative effect of residual inaccuracies.

  • Mean over horizon: 62.92 m | Median: 61.72 m. On average, predictions are within ~60–63 m of the true trajectory at any given time step. The median being close to the mean suggests a fairly symmetric error distribution without extreme outliers.

  • Mean trajectory length difference: 11.82 m | Median: 12.73 m. Overall, the predicted trajectories’ total lengths are very close to the true lengths, typically within ~12 m, which is less than 3% relative error for most trajectories.

    Overall, this is a solid performance for maritime AIS trajectory prediction, especially given the scale of trajectories (hundreds of meters).

Interpretation

  • The model captures trajectory trends well but shows small offsets in absolute positions.

  • Errors grow with horizon, which is typical for sequence prediction models using residuals.

  • Smoothness is maintained (no erratic jumps), indicating that the NPINN smoothness regularization is effective.

Code

The complete code for the pipeline described above follows, in the same order as the sections: preprocessing, sequence construction, saving and reloading the dataset, the Seq2SeqLSTM model, the NPINN training loop, and the evaluation utilities.

    from collections import defaultdict
    from datetime import timedelta

    import aisdb
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import RobustScaler
    
    def preprocess_aisdb_tracks(tracks_gen, proj,
                                sog_scaler=None,
                                feature_scaler=None,
                                delta_scaler=None,
                                fit_scaler=False):
        # --- AISdb cleaning ---
        tracks_gen = aisdb.remove_pings_wrt_speed(tracks_gen, 0.1)
        tracks_gen = aisdb.encode_greatcircledistance(
            tracks_gen,
            distance_threshold=50000,
            minscore=1e-5,
            speed_threshold=50
        )
        tracks_gen = aisdb.interp_time(tracks_gen, step=timedelta(minutes=5))
    
        # --- collect tracks ---
        tracks = list(tracks_gen)
    
        # --- group by MMSI ---
        tracks_by_mmsi = defaultdict(list)
        for track in tracks:
            tracks_by_mmsi[track['mmsi']].append(track)
    
        # --- keep only long-enough tracks ---
        valid_tracks = []
        for mmsi, mmsi_tracks in tracks_by_mmsi.items():
            if all(len(t['time']) >= 100 for t in mmsi_tracks):
                valid_tracks.extend(mmsi_tracks)
    
        # --- flatten into dataframe ---
        rows = []
        for track in valid_tracks:
            mmsi = track['mmsi']
            sog = track.get('sog', [np.nan]*len(track['time']))
            cog = track.get('cog', [np.nan]*len(track['time']))
    
            for i in range(len(track['time'])):
                x, y = proj(track['lon'][i], track['lat'][i])
                cog_rad = np.radians(cog[i]) if cog[i] is not None else np.nan
                rows.append({
                    'mmsi': mmsi,
                    'x': x,
                    'y': y,
                    'sog': sog[i],
                    'cog_sin': np.sin(cog_rad) if not np.isnan(cog_rad) else np.nan,
                    'cog_cos': np.cos(cog_rad) if not np.isnan(cog_rad) else np.nan,
                    'timestamp': pd.to_datetime(track['time'][i], errors='coerce')
                })
    
        df = pd.DataFrame(rows)
    
        # --- clean NaNs ---
        df = df.replace([np.inf, -np.inf], np.nan)
        df = df.dropna(subset=['x', 'y', 'sog', 'cog_sin', 'cog_cos'])
    
        # --- compute deltas per MMSI ---
        df = df.sort_values(["mmsi", "timestamp"])
        df["dx"] = df.groupby("mmsi")["x"].diff().fillna(0)
        df["dy"] = df.groupby("mmsi")["y"].diff().fillna(0)
    
        # --- scale features ---
        feature_cols = ['x', 'y', 'sog', 'cog_sin', 'cog_cos']
        delta_cols   = ['dx', 'dy']
    
        if fit_scaler:
            # sog
            sog_scaler = RobustScaler()
            df['sog_scaled'] = sog_scaler.fit_transform(df[['sog']])
    
            # absolute features
            feature_scaler = RobustScaler()
            df[feature_cols] = feature_scaler.fit_transform(df[feature_cols])
    
            # deltas
            delta_scaler = RobustScaler()
            df[delta_cols] = delta_scaler.fit_transform(df[delta_cols])
        else:
            df['sog_scaled'] = sog_scaler.transform(df[['sog']])
            df[feature_cols] = feature_scaler.transform(df[feature_cols])
            df[delta_cols]   = delta_scaler.transform(df[delta_cols])
    
        return df, sog_scaler, feature_scaler, delta_scaler
    train_qry = aisdb.DBQuery(dbconn=dbconn, callback=in_timerange,
                              start=START_DATE, end=END_DATE,
                              xmin=XMIN, xmax=XMAX, ymin=YMIN, ymax=YMAX)
    train_gen = TrackGen(train_qry.gen_qry(verbose=True), decimate=False)
    
    test_qry = aisdb.DBQuery(dbconn=dbconn, callback=in_timerange,
                             start=TEST_START_DATE, end=TEST_END_DATE,
                             xmin=XMIN, xmax=XMAX, ymin=YMIN, ymax=YMAX)
    test_gen = TrackGen(test_qry.gen_qry(verbose=True), decimate=False)
    
    # --- Preprocess ---
    train_df, sog_scaler, feature_scaler, delta_scaler = preprocess_aisdb_tracks(
        train_gen, proj, fit_scaler=True
    )
    
    test_df, _, _, _ = preprocess_aisdb_tracks(
        test_gen,
        proj,
        sog_scaler=sog_scaler,
        feature_scaler=feature_scaler,
        delta_scaler=delta_scaler,
        fit_scaler=False
    )
    import torch
    from sklearn.model_selection import train_test_split

    def create_sequences(df, features, input_size=80, output_size=2, step=1):
        """
        Build sequences:
          X: past window of absolute features (x, y, dx, dy, cog_sin, cog_cos, sog_scaled)
          Y: future residuals (dx, dy)
        """
        X_list, Y_list = [], []
        for mmsi in df['mmsi'].unique():
            sub = df[df['mmsi'] == mmsi].sort_values('timestamp').copy()
    
            # build numpy arrays
            feat_arr = sub[features].to_numpy()
            dxdy_arr = sub[['dx', 'dy']].to_numpy()  # residuals already scaled
    
            for i in range(0, len(sub) - input_size - output_size + 1, step):
                # input sequence is absolute features
                X_list.append(feat_arr[i : i + input_size])
    
                # output sequence is residuals immediately after
                Y_list.append(dxdy_arr[i + input_size : i + input_size + output_size])
    
        return torch.tensor(X_list, dtype=torch.float32), torch.tensor(Y_list, dtype=torch.float32)
    
    
    features = ['x', 'y', 'dx', 'dy', 'cog_sin', 'cog_cos', 'sog_scaled']
    
    mmsis = train_df['mmsi'].unique()
    train_mmsi, val_mmsi = train_test_split(mmsis, test_size=0.2, random_state=42, shuffle=True)
    
    train_X, train_Y = create_sequences(train_df[train_df['mmsi'].isin(train_mmsi)], features)
    val_X, val_Y     = create_sequences(train_df[train_df['mmsi'].isin(val_mmsi)], features)
    test_X, test_Y   = create_sequences(test_df, features)
    import torch
    import joblib
    import json
    
    # --- save datasets ---
    torch.save({
        'train_X': train_X, 'train_Y': train_Y,
        'val_X': val_X, 'val_Y': val_Y,
        'test_X': test_X, 'test_Y': test_Y
    }, 'datasets_npin.pt')
    
    # --- save scalers ---
    joblib.dump(feature_scaler, "npinn_feature_scaler.pkl")
    joblib.dump(sog_scaler, "npinn_sog_scaler.pkl")
    joblib.dump(delta_scaler, "npinn_delta_scaler.pkl")   # NEW
    
    
    # --- save projection parameters ---
    proj_params = {'proj': 'utm', 'zone': 20, 'ellps': 'WGS84'}
    with open("npinn_proj_params.json", "w") as f:
        json.dump(proj_params, f)
    
    
    data = torch.load('datasets_npin.pt')
    import torch
    import joblib
    import json
    import pyproj
    from torch.utils.data import DataLoader, TensorDataset
    # scalers
    feature_scaler = joblib.load("npinn_feature_scaler.pkl")
    sog_scaler = joblib.load("npinn_sog_scaler.pkl")
    delta_scaler   = joblib.load("npinn_delta_scaler.pkl")   # NEW
    
    # projection
    with open("npinn_proj_params.json", "r") as f:
        proj_params = json.load(f)
    proj = pyproj.Proj(**proj_params)
    
    train_ds = TensorDataset(data['train_X'], data['train_Y'])
    val_ds = TensorDataset(data['val_X'], data['val_Y'])
    test_ds = TensorDataset(data['test_X'], data['test_Y'])
    
    batch_size = 64
    train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    val_dl = DataLoader(val_ds, batch_size=batch_size, shuffle=False)
    test_dl = DataLoader(test_ds, batch_size=batch_size)
    import random

    from torch import nn

    class Seq2SeqLSTM(nn.Module):
        def __init__(self, input_size, hidden_size, input_steps, output_steps):
            super().__init__()
            self.input_steps = input_steps
            self.output_steps = output_steps
    
            # Encoder
            self.encoder = nn.LSTM(input_size, hidden_size, num_layers=2, dropout=0.3, batch_first=True)
    
            # Decoder
            self.decoder = nn.LSTMCell(input_size, hidden_size)
            self.attn = nn.Linear(hidden_size * 2, input_steps)
            self.attn_combine = nn.Linear(hidden_size + input_size, input_size)
    
            # Output only x,y residuals (added to last observed pos)
            self.output_layer = nn.Sequential(
                nn.Linear(hidden_size, hidden_size // 2),
                nn.ReLU(),
                nn.Linear(hidden_size // 2, 2)
            )
    
        def forward(self, x, target_seq=None, teacher_forcing_ratio=0.5):
            batch_size = x.size(0)
            encoder_outputs, (h, c) = self.encoder(x)
            h, c = h[-1], c[-1]
    
            last_obs = x[:, -1, :2]   # last observed absolute x,y
            decoder_input = x[:, -1, :]  # full feature vector
    
            outputs = []
            for t in range(self.output_steps):
                attn_weights = torch.softmax(self.attn(torch.cat((h, c), dim=1)), dim=1)
                context = torch.bmm(attn_weights.unsqueeze(1), encoder_outputs).squeeze(1)
    
                dec_in = torch.cat((decoder_input, context), dim=1)
                dec_in = self.attn_combine(dec_in)
    
                h, c = self.decoder(dec_in, (h, c))
                residual_xy = self.output_layer(h)
    
                # accumulate into absolute xy
                out_xy = residual_xy + last_obs
                outputs.append(out_xy.unsqueeze(1))
    
                # teacher forcing
                if self.training and target_seq is not None and t < target_seq.size(1) and random.random() < teacher_forcing_ratio:
                    decoder_input = torch.cat([target_seq[:, t, :2], decoder_input[:, 2:]], dim=1)
                    last_obs = target_seq[:, t, :2]
                else:
                    decoder_input = torch.cat([out_xy, decoder_input[:, 2:]], dim=1)
                    last_obs = out_xy
    
            return torch.cat(outputs, dim=1)  # (batch, output_steps, 2)
    import torch.nn.functional as F

    def weighted_coord_loss(pred, target, coord_weight=5.0, reduction='mean'):
        # Smooth L1 (Huber) loss on the predicted coordinates; note that
        # coord_weight is kept in the signature but not applied here.
        return F.smooth_l1_loss(pred, target, reduction=reduction)
    
    
    def xy_npinn_smoothness_loss(seq_full, coord_min=None, coord_max=None):
        """
        NPINN-inspired smoothness penalty on xy coordinates
        seq_full: [B, T, 2]
        """
        xy = seq_full[..., :2]
    
        if coord_min is not None and coord_max is not None:
            xy_norm = (xy - coord_min) / (coord_max - coord_min + 1e-8)
            xy_norm = 2 * (xy_norm - 0.5)  # [-1,1]
        else:
            xy_norm = xy
    
        v = xy_norm[:, 1:, :] - xy_norm[:, :-1, :]
        a = v[:, 1:, :] - v[:, :-1, :]
        return (v**2).mean() * 0.05 + (a**2).mean() * 0.5
    
    def train_model(model, loader, val_dl, optimizer, device, epochs=50,
                    smooth_w_init=1e-3, coord_min=None, coord_max=None):
    
        best_loss = float('inf')
        best_state = None
    
        for epoch in range(epochs):
            model.train()
            total_loss = total_data_loss = total_smooth_loss = 0.0
    
            for batch_x, batch_y in loader:
                batch_x = batch_x.to(device)  # [B, T_in, F]
                batch_y = batch_y.to(device)  # [B, T_out, 2] absolute x,y
    
                optimizer.zero_grad()
                pred_xy = model(batch_x, target_seq=batch_y, teacher_forcing_ratio=0.5)
    
                # Data loss: directly on absolute x,y
                loss_data = weighted_coord_loss(pred_xy, batch_y)
    
                # Smoothness loss: encourage smooth xy trajectories
                y_start = batch_x[:, :, :2]
                full_seq = torch.cat([y_start, pred_xy], dim=1)  # observed + predicted
                loss_smooth = xy_npinn_smoothness_loss(full_seq, coord_min, coord_max)
    
                smooth_weight = smooth_w_init * max(0.1, 1.0 - epoch / 30.0)
                loss = loss_data + smooth_weight * loss_smooth
    
                loss.backward()
                optimizer.step()
    
                total_loss += loss.item()
                total_data_loss += loss_data.item()
                total_smooth_loss += loss_smooth.item()
    
            avg_loss = total_loss / len(loader)
            print(f"Epoch {epoch+1} | Total: {avg_loss:.6f} | Data: {total_data_loss/len(loader):.6f} | Smooth: {total_smooth_loss/len(loader):.6f}")
    
            # Validation
            model.eval()
            val_loss = 0.0
            with torch.no_grad():
                for xb, yb in val_dl:
                    xb, yb = xb.to(device), yb.to(device)
                    pred_xy = model(xb, target_seq=yb, teacher_forcing_ratio=0.0)
    
                    data_loss = weighted_coord_loss(pred_xy, yb)
                    full_seq = torch.cat([xb[..., :2], pred_xy], dim=1)
                    loss_smooth = xy_npinn_smoothness_loss(full_seq, coord_min, coord_max)
    
                    val_loss += (data_loss + smooth_weight * loss_smooth).item()
    
            val_loss /= len(val_dl)
            print(f"           Val Loss: {val_loss:.6f}")
    
            if val_loss < best_loss:
                best_loss = val_loss
                best_state = model.state_dict()
    
        if best_state is not None:
            torch.save(best_state, "best_model_NPINN.pth")
            print("Best model saved")
    import torch
    import numpy as np
    import random
    from torch import nn
    
    SEED = 42
    torch.manual_seed(SEED)
    np.random.seed(SEED)
    random.seed(SEED)
    torch.cuda.manual_seed_all(SEED)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    train_X = data['train_X']
    input_steps = 80
    output_steps = 2
    
    # Collapse batch and time dims to compute global min/max for each feature
    flat_train_X = train_X.view(-1, train_X.shape[-1])  # shape: [N*T, F]
    
    # Only the x/y ranges are needed to normalize the smoothness loss.
    # (Columns follow the feature order: x, y, dx, dy, cog_sin, cog_cos, sog_scaled.)
    x_min, x_max = flat_train_X[:, 0].min().item(), flat_train_X[:, 0].max().item()
    y_min, y_max = flat_train_X[:, 1].min().item(), flat_train_X[:, 1].max().item()
    
    coord_min = torch.tensor([x_min, y_min], device=device)
    coord_max = torch.tensor([x_max, y_max], device=device)
    
    # Model setup
    input_size = 7 
    hidden_size = 64
    
    model = Seq2SeqLSTM(input_size, hidden_size, input_steps, output_steps).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
    train_model(model, train_dl, val_dl, optimizer, device,
                coord_min=coord_min, coord_max=coord_max)
    import torch
    import joblib
    import json
    import pyproj
    import numpy as np
    from torch.utils.data import DataLoader, TensorDataset
    import cartopy.crs as ccrs
    import cartopy.feature as cfeature
    
    # --- device ---
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    # recover arrays (assumes you already loaded `data`)
    train_X, train_Y = data['train_X'], data['train_Y']
    val_X, val_Y     = data['val_X'], data['val_Y']
    test_X, test_Y   = data['test_X'], data['test_Y']
    
    # --- load scalers (must match what you saved during preprocessing) ---
    feature_scaler = joblib.load("npinn_feature_scaler.pkl")
    sog_scaler = joblib.load("npinn_sog_scaler.pkl")
    delta_scaler   = joblib.load("npinn_delta_scaler.pkl")    # for dx,dy
    # if you named them differently, change the filenames above
    
    # --- load projection params and build proj ---
    with open("proj_params.json", "r") as f:
        proj_params = json.load(f)
    proj = pyproj.Proj(**proj_params)
    
    # --- rebuild & load model ---
    input_size   = train_X.shape[2]
    input_steps  = train_X.shape[1]
    output_steps = train_Y.shape[1]
    
    hidden_size = 64
    num_layers  = 2
    
    best_model = Seq2SeqLSTM(
        input_size=input_size,
        hidden_size=hidden_size,
        input_steps=input_steps,
        output_steps=output_steps,
    ).to(device)
    
    best_model.load_state_dict(torch.load("best_model_NPINN.pth", map_location=device))
    best_model.eval()
    
    # ---------------- helper inverse/scaling utilities ----------------
    def inverse_dxdy_np(dxdy_scaled, scaler):
        """
        Invert scaled residuals (dx, dy) back to meters.
        dxdy_scaled: (..., 2) scaled
        scaler: RobustScaler/StandardScaler/MinMaxScaler fitted on residuals
        """
        dxdy_scaled = np.asarray(dxdy_scaled, dtype=float)
        if dxdy_scaled.ndim == 1:
            dxdy_scaled = dxdy_scaled[None, :]
    
        n_samples = dxdy_scaled.shape[0]
        n_features = scaler.scale_.shape[0] if hasattr(scaler, "scale_") else scaler.center_.shape[0]
    
        full_scaled = np.zeros((n_samples, n_features))
        full_scaled[:, :2] = dxdy_scaled
    
        if hasattr(scaler, "mean_"):
            center = scaler.mean_
            scale = scaler.scale_
        elif hasattr(scaler, "center_"):
            center = scaler.center_
            scale = scaler.scale_
        else:
            raise ValueError(f"Scaler type {type(scaler)} not supported")
    
        full = full_scaled * scale + center
        return full[:, :2] if dxdy_scaled.shape[0] > 1 else full[0, :2]
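
    # NOTE: `inverse_xy_only_np` is used by the evaluation code below but is not
    # defined elsewhere on this page; the sketch here assumes the feature scaler
    # was fitted on ['x', 'y', 'sog', 'cog_sin', 'cog_cos'], i.e. x and y occupy
    # the first two columns (as in the preprocessing step above).
    def inverse_xy_only_np(xy_scaled, scaler):
        """Invert only the scaled absolute x, y columns back to meters."""
        xy_scaled = np.asarray(xy_scaled, dtype=float)
        single = xy_scaled.ndim == 1
        if single:
            xy_scaled = xy_scaled[None, :]

        center = scaler.center_ if hasattr(scaler, "center_") else scaler.mean_
        scale = scaler.scale_
        xy = xy_scaled * scale[:2] + center[:2]
        return xy[0] if single else xy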
    
    # ---------------- compute residual std (meters) correctly ----------------
    # train_Y contains scaled residuals (dx,dy) per your preprocessing.
    all_resids_scaled = train_Y.reshape(-1, 2)                      # [sum_T, 2]
    all_resids_m = inverse_dxdy_np(all_resids_scaled, delta_scaler) # meters
    residual_std = np.std(all_resids_m, axis=0)
    print("Computed residual_std (meters):", residual_std)
    
    
    # ---------------- geometry helpers ----------------
    def haversine(lon1, lat1, lon2, lat2):
        """Distance (m) between lon/lat points using haversine formula; handles arrays."""
        R = 6371000.0
        lon1 = np.asarray(lon1, dtype=float)
        lat1 = np.asarray(lat1, dtype=float)
        lon2 = np.asarray(lon2, dtype=float)
        lat2 = np.asarray(lat2, dtype=float)
        # if any entry is non-finite, result will be nan — we'll guard upstream
        lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
        dlon = lon2 - lon1
        dlat = lat2 - lat1
        a = np.sin(dlat/2.0)**2 + np.cos(lat1)*np.cos(lat2)*np.sin(dlon/2.0)**2
        # numerical stability: clip inside sqrt
        a = np.clip(a, 0.0, 1.0)
        return 2 * R * np.arcsin(np.sqrt(a))
    
    def trajectory_length(lons, lats):
        lons = np.asarray(lons, dtype=float)
        lats = np.asarray(lats, dtype=float)
        if lons.size < 2:
            return 0.0
        # guard non-finite
        if not (np.isfinite(lons).all() and np.isfinite(lats).all()):
            return float("nan")
        return np.sum(haversine(lons[:-1], lats[:-1], lons[1:], lats[1:]))
    def evaluate_with_errors(
        model,
        test_dl,
        proj,
        feature_scaler,
        delta_scaler,
        device,
        num_batches=None,   # None = use full dataset
        dup_tol: float = 1e-4,
        outputs_are_residual_xy: bool = True,
        residual_decode_mode: str = "cumsum",   # "cumsum", "independent", "stdonly"
        residual_std: np.ndarray = None,
        plot_map: bool = True   # <--- PLOT ALL TRAJECTORIES
    ):
        """
        Evaluate model trajectory predictions and report errors in meters.
        Optionally plot all trajectories on a map.
        """
        model.eval()
        errors_all = []
        length_diffs = []
        bad_count = 0
    
        # store all trajectories
        all_real = []
        all_pred = []
    
        with torch.no_grad():
            batches = 0
            for xb, yb in test_dl:
                xb, yb = xb.to(device), yb.to(device)
                pred = model(xb, teacher_forcing_ratio=0.0)  # [B, T_out, 2]
    
                # first sample of the batch
                input_seq = xb[0].cpu().numpy()
                real_seq  = yb[0].cpu().numpy()
                pred_seq  = pred[0].cpu().numpy()
    
                # Extract dx, dy residuals
                pred_resid_s = pred_seq[:, :2]
                real_resid_s = real_seq[:, :2]
    
                # Invert residuals to meters
                pred_resid_m = inverse_dxdy_np(pred_resid_s, delta_scaler)
                real_resid_m = inverse_dxdy_np(real_resid_s, delta_scaler)
    
                # Use last observed absolute position as starting point (meters)
                last_obs_xy_m = inverse_xy_only_np(input_seq[-1, :2], feature_scaler)
    
                # Reconstruct absolute positions
                if residual_decode_mode == "cumsum":
                    pred_xy_m = np.cumsum(pred_resid_m, axis=0) + last_obs_xy_m
                    real_xy_m = np.cumsum(real_resid_m, axis=0) + last_obs_xy_m
                elif residual_decode_mode == "independent":
                    pred_xy_m = pred_resid_m + last_obs_xy_m
                    real_xy_m = real_resid_m + last_obs_xy_m
                elif residual_decode_mode == "stdonly":
                    if residual_std is None:
                        raise ValueError("residual_std must be provided for 'stdonly' mode")
                    noise = np.random.randn(*pred_resid_m.shape) * residual_std
                    pred_xy_m = np.cumsum(noise, axis=0) + last_obs_xy_m
                    real_xy_m = np.cumsum(real_resid_m, axis=0) + last_obs_xy_m
                else:
                    raise ValueError(f"Unknown residual_decode_mode: {residual_decode_mode}")
    
                # Remove first target if duplicates last input
                if np.allclose(real_resid_m[0], 0, atol=dup_tol):
                    real_xy_m = real_xy_m[1:]
                    pred_xy_m = pred_xy_m[1:]
    
                # align horizon
                min_len = min(len(pred_xy_m), len(real_xy_m))
                if min_len == 0:
                    bad_count += 1
                    continue
    
                pred_xy_m = pred_xy_m[:min_len]
                real_xy_m = real_xy_m[:min_len]
    
                # project to lon/lat
                lon_real, lat_real = proj(real_xy_m[:,0], real_xy_m[:,1], inverse=True)
                lon_pred, lat_pred = proj(pred_xy_m[:,0], pred_xy_m[:,1], inverse=True)
    
                all_real.append((lon_real, lat_real))
                all_pred.append((lon_pred, lat_pred))
    
                # compute per-timestep errors
                errors = haversine(lon_real, lat_real, lon_pred, lat_pred)
                errors_all.append(errors)
    
                # trajectory length diff
                real_len = trajectory_length(lon_real, lat_real)
                pred_len = trajectory_length(lon_pred, lat_pred)
                length_diffs.append(abs(real_len - pred_len))
    
                print(f"Trajectory length (true): {real_len:.2f} m | pred: {pred_len:.2f} m | diff: {abs(real_len - pred_len):.2f} m")
    
                batches += 1
                if num_batches is not None and batches >= num_batches:
                    break
    
        # summary
        if len(errors_all) == 0:
            print("No valid samples evaluated. Bad count:", bad_count)
            return
    
        max_len = max(len(e) for e in errors_all)
        errors_padded = np.full((len(errors_all), max_len), np.nan)
        for i, e in enumerate(errors_all):
            errors_padded[i, :len(e)] = e
    
        mean_per_t = np.nanmean(errors_padded, axis=0)
        print("\n=== Summary (meters) ===")
        for t, v in enumerate(mean_per_t):
            if not np.isnan(v):
                print(f"t={t} mean error: {v:.2f} m")
        print(f"Mean over horizon: {np.nanmean(errors_padded):.2f} m | Median: {np.nanmedian(errors_padded):.2f} m")
        print(f"Mean trajectory length diff: {np.mean(length_diffs):.2f} m | Median: {np.median(length_diffs):.2f} m")
        print("Bad / skipped samples:", bad_count)
    
        # --- plot each trajectory separately ---
        if plot_map and len(all_real) > 0:
            import matplotlib.pyplot as plt
            import cartopy.crs as ccrs
            import cartopy.feature as cfeature
    
            for idx, ((lon_r, lat_r), (lon_p, lat_p)) in enumerate(zip(all_real, all_pred)):
                fig = plt.figure(figsize=(10, 8))
                ax = plt.axes(projection=ccrs.PlateCarree())
                ax.add_feature(cfeature.LAND)
                ax.add_feature(cfeature.COASTLINE)
                ax.add_feature(cfeature.BORDERS, linestyle=':')
                
                ax.plot(lon_r, lat_r, color='green', linewidth=2, label="True")
                ax.plot(lon_p, lat_p, color='red', linestyle='--', linewidth=2, label="Predicted")
                ax.legend()
    
                ax.set_title(f"Trajectory {idx+1}: True vs Predicted")
                plt.show()
    
    Trajectory length (true): 521.39 m | pred: 508.66 m | diff: 12.73 m
    Trajectory length (true): 188.76 m | pred: 206.01 m | diff: 17.25 m