πŸ““
Documentation
  • βš“Introduction
  • Default Start
    • πŸ›°οΈQuick Start
    • πŸ—„οΈSQL Database
    • πŸ“‘AIS Hardware
  • Tutorials
    • πŸ“₯Database Loading
    • πŸ”ŽData Querying
    • 🚿Data Cleaning
    • πŸ—ΊοΈData Visualization
    • πŸ–‡οΈTrack Interpolation
    • 🌎Haversine Distance
    • 🚀Vessel Speed
    • 🏝️Coast, shore, and ports
    • πŸ”Using Your AIS Data
    • ⬇️Vessel Metadata
    • πŸ“’AIS Data to CSV
    • πŸ“Decimation with AISdb
    • 🌊Bathymetric Data
    • 🌦️Weather Data
    • AIS - Automatic Identification System
  • Machine Learning
    • seq2seq in PyTorch
    • AutoEncoders in Keras
    • Using Newtonian PINNs
    • Embedding with traj2vec
    • TGNs with TorchGeometric
    • Clustering with Scikit Learn
    • Kalman Filters with FilterPy
    • Deploying an AISdb ChatBot
  • Keep Exploring
    • ReadTheDocs
    • MARS Group
    • MAPS Lab
    • MERDIAN
Powered by GitBook
On this page
  • Query functions
  • Query database
  • Generate trajectories
  • Query types with practical examples
  • Within time range
  • Within bounding box
  • Combination of multiple conditions
  • Filtering MMSI
Export as PDF
  1. Tutorials

Data Querying

PreviousDatabase LoadingNextData Cleaning

Last updated 3 months ago

Data querying with AISdb involves setting up a connection to the database, defining query parameters, creating and executing the query, and processing the results. Following the previous tutorial, , we set up a database connection and made simple queries and visualizations. This tutorial will dig into data query functions and parameters and show you the queries you can make with AISdb.

Query functions

Data querying with AISdb includes two components: DBQuery and TrackGen. In this section, we will introduce each component with examples. Before starting data querying, please ensure you have connected to the database. If you have not done so, please follow the instructions and examples in or .

Query database

The class is used to create a query object that specifies the parameters for data retrieval, including the time range, spatial domain, and any filtering callbacks. Here is an example to create a DBQuery object and use parameters to specify the time range and geographical locations:

from aisdb.database.dbqry import DBQuery

# Specify database path
dbpath = ...

# Specify constraints (optional)
start_time = ...
end_time = ...
domain = ...

# Create a query object to fetch data within time and geographical range
qry = DBQuery(
    dbconn=dbconn,                  # Database connection object
    start=start_time,               # Start time for the query
    end=end_time,                   # End time for the query
    xmin=domain.boundary['xmin'],   # Minimum longitude of the domain
    xmax=domain.boundary['xmax'],   # Maximum longitude of the domain
    ymin=domain.boundary['ymin'],   # Minimum latitude of the domain
    ymax=domain.boundary['ymax'],   # Maximum latitude of the domain
    callback=aisdb.database.sqlfcn_callbacks.in_time_bbox_validmmsi  # Callback function to filter data
)

Callback functions

Method gen_qry

Two callback functions can be passed to gen_qry. They are:

  • crawl_dynamic : Iterates only over the position reports table. By default this is called.

  • crawl_dynamic_static: Iterates over both position reports and static messages tables.

After creating the DBQuery object, we can generate rows with gen_qry :

# Generate rows from the query
rowgen = qry.gen_qry(fcn=sqlfcn.crawl_dynamic_static) # callback parameter is optional

# Process the generated rows as needed
for row in rowgen:
    print(row)

Each row from gen_qry is a tuple or dictionary representing a record in the database.

Generate trajectories

Following the generated rows above, here is how to use the TrackGen class:

from aisdb.track_gen import TrackGen

# Convert the generated rows into tracks
tracks = TrackGen(rowgen, decimate=False)

The TrackGen class yields "tracks," which is a generator object. While iterating over tracks, each component is a dictionary representing a track for a specific vessel:

for track in tracks:
    mmsi = track['mmsi']
    lons = track['lon']
    lats = track['lat']
    speeds = track['sog']
    
    print(f"Track for vessel MMSI {mmsi}:")
    for lon, lat, speed in zip(lons[:3], lats[:3], speeds[:3]):
        print(f" - Lon: {lon}, Lat: {lat}, Speed: {speed}")
    break  # Exit after the first track

This is the output with our sample data:

Track for vessel MMSI 316004240:
 - Lon: -63.54868698120117, Lat: 44.61691665649414, Speed: 7.199999809265137
 - Lon: -63.54880905151367, Lat: 44.61708450317383, Speed: 7.099999904632568
 - Lon: -63.55659866333008, Lat: 44.626953125, Speed: 1.5

Query types with practical examples

First, we need to import the necessary packages and prepare data:

import os
import aisdb
from datetime import datetime, timedelta
from aisdb import DBConn, DBQuery, DomainFromPoints

dbpath='YOUR_DATABASE.db' # Define the path to your database

Within time range

start_time = datetime.strptime("2018-01-01 00:00:00", '%Y-%m-%d %H:%M:%S')
end_time = datetime.strptime("2018-01-02 00:00:00", '%Y-%m-%d %H:%M:%S')

with aisdb.SQLiteDBConn(dbpath=dbpath) as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn, start=start_time, end=end_time,
        callback=aisdb.database.sqlfcn_callbacks.in_timerange_validmmsi,
    )
    rowgen = qry.gen_qry()
    
    # Convert queried rows to vessel trajectories
    tracks = aisdb.track_gen.TrackGen(rowgen, decimate=False)
    
    # Visualization
    aisdb.web_interface.visualize(
        tracks,
        domain=domain,
        visualearth=True,
        open_browser=True,
    )

This will display the queried vessel tracks (within a time range, has a valid MMSI) on the map:

Within bounding box

In practical scenarios, people may have specific points/areas of interest. DBQuery includes parameters to define a bounding box and has relevant callbacks. Let's look at an example:

domain = DomainFromPoints(points=[(-63.6, 44.6)], radial_distances=[50000]) # a circle with a 100km radius around the location point

with aisdb.SQLiteDBConn(dbpath=dbpath) as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn, start=start_time, end=end_time,
        xmin=domain.boundary['xmin'], xmax=domain.boundary['xmax'],
        ymin=domain.boundary['ymin'], ymax=domain.boundary['ymax'],
        callback=aisdb.database.sqlfcn_callbacks.in_validmmsi_bbox,
    )
    rowgen = qry.gen_qry()
    tracks = aisdb.track_gen.TrackGen(rowgen, decimate=False)
    
    aisdb.web_interface.visualize(
        tracks,
        domain=domain,
        visualearth=True,
        open_browser=True,
    )

This will show all the vessel tracks with valid MMSI in the defined bounding box:

Combination of multiple conditions

callback=aisdb.database.sqlfcn_callbacks.in_time_bbox_validmmsi

The displayed vessel tracks:

Filtering MMSI

In addition to time and location range, you can track single and multiple vessel(s) of interest by specifying their MMSI in the query. Here is an example of tracking several vessels within a time range:

import random

def assign_colors(mmsi_list):
    colors = {}
    for mmsi in mmsi_list:
        colors[mmsi] = "#{:06x}".format(random.randint(0, 0xFFFFFF))  # Random color in hex
    return colors

# Create a function to color tracks
def color_tracks(tracks, colors):
    colored_tracks = []
    for track in tracks:
        mmsi = track['mmsi']
        color = colors.get(mmsi, "#000000")  # Default to black if no color assigned
        track['color'] = color
        colored_tracks.append(track)
    return colored_tracks

# Set the start and end times for the query
start_time = datetime.strptime("2018-01-01 00:00:00", '%Y-%m-%d %H:%M:%S')
end_time = datetime.strptime("2018-12-31 00:00:00", '%Y-%m-%d %H:%M:%S')

# Create a list of vessel MMSIs you want to track 
MMSI = [636017611,636018124,636018253]

# Assign colors to each MMSI
colors = assign_colors(MMSI)

with aisdb.SQLiteDBConn(dbpath=dbpath) as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn, start=start_time, end=end_time, mmsis = MMSI,
        callback=aisdb.database.sqlfcn_callbacks.in_timerange_inmmsi,
    )
    rowgen = qry.gen_qry()
    
    tracks = aisdb.track_gen.TrackGen(rowgen, decimate=False)
    colored_tracks = color_tracks(tracks, colors)

    # Visualizing the tracks
    aisdb.web_interface.visualize(
        colored_tracks,
        visualearth=True,
        open_browser=True,
    )

Callback functions are used in the class to filter data based on specific criteria. Some common callbacks include: in_bbox, in_time_bbox, valid_mmsi, and in_time_bbox_validmmsi. These callbacks ensure that the data retrieved matches the specific criteria defined in the query. Please find examples of using different callbacks with other parameters in .

For more callback functions, refer to the API documentation here:

The function is a method of the class in AISdb. It is responsible for generating rows of data that match the query criteria specified when creating the DBQuery object. This function acts as a generator, yielding one row at a time and efficiently handling large datasets.

The class converts the generated rows from gen_qry into tracks (trajectories). It takes the row generator and, optionally, a decimate parameter to control point reduction. This conversion is essential for analyzing vessel movements, identifying patterns, and visualizing trajectories in later steps.

In this section, we will provide practical examples of the most common querying types you can make using the DBQuery class, including querying within a time range, geographical areas, and tracking vessels by MMSI. Different queries can be achieved by changing the parameters and other parameters defined in the DBQuery class. Then, we will use TrackGen to convert these query results into structured tracks for further analysis and visualization.

Querying data within a specified time range can be done by using the callback in the DBQuery class:

You may find noise in some of the track data. In , we introduced the de-noising methods in AISdb that can effectively remove unreasonable or error data points, ensuring more accurate and reliable vessel trajectories.

In the above examples, we queried data in a time range and a geographical area. If you want to combine multiple query criteria, please check out available in the API Docs. In the last example above, we can simply modify the callback type to obtain vessel tracks within both the time range and geographical area:

πŸ”Ž
API-Doc
gen_qry
DBQuery
TrackGen
callbacks
in_timerange_validmmsi
Data Cleaning
types of callbacks
DBQuery
Query types with practical examples
Database Loading
DBQuery
Queried vessel tracks in specified time range
Queried vessel tracks within a defined bounding box
Queried vessel tracks within a defined bounding box and time range
Queried tracks of vessels of interest within a specified time range
Quick Start
Database Loading