🛰️ Quick Start

A hands-on quick start guide for using AISdb.

Python Environment

To use the AISdb Python package, you must have Python version 3.8 or above. If you want to use SQLite, you don't need to install anything extra because it's already included in Python. However, if you prefer to use the PostgreSQL server, you'll need to install it separately. You can easily install the AISdb Python package using pip. It's highly recommended to create a virtual Python environment and install the package within it.
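
Before creating the environment, it can help to confirm that the interpreter meets the version requirement stated above and that the bundled sqlite3 module is importable. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and not part of AISdb.

```python
import sqlite3
import sys

MIN_PYTHON = (3, 8)  # minimum Python version stated in this guide


def check_environment(min_python=MIN_PYTHON):
    """Return (python_ok, sqlite_version) for the running interpreter."""
    python_ok = sys.version_info[:2] >= min_python
    # sqlite3 ships with CPython; sqlite_version is the version of the linked SQLite library
    return python_ok, sqlite3.sqlite_version


python_ok, sqlite_version = check_environment()
print(f"Python >= 3.8: {python_ok}; SQLite library version: {sqlite_version}")
```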

Linux
python -m venv AISdb
source ./AISdb/bin/activate    #activating python virtual environment
pip install aisdb  # from https://pypi.org/project/aisdb/
Windows
python -m venv AISdb
./AISdb/Scripts/activate  
pip install aisdb

Alternatively, you may also use AISdb on Docker. Regardless of the installation procedure you decide to use, you can test your installation by running the following commands:

$ python
>>> import aisdb
>>> aisdb.__version__  # should return '1.7.0' or newer

If you are using Jupyter, make sure it is installed in the same environment as AISdb:

$ source ./AISdb/bin/activate
$ pip install jupyter
$ jupyter notebook

The Python code in the rest of this document can be run in the Python environment you created.

Database Handling

Connecting to a Postgres database

This option requires the optional dependency psycopg for interfacing with Postgres databases. Note that the connection accepts the standard Postgres connection keyword arguments. Alternatively, a connection string may be used. Information on connection strings and the Postgres URI format can be found in the PostgreSQL documentation.

from aisdb.database.dbconn import PostgresDBConn

# [OPTION 1]
dbconn = PostgresDBConn(
    hostaddr='127.0.0.1',  # Replace this with the Postgres address (supports IPv6)
    port=5432,  # Replace this with the Postgres running port (if not the default)
    user='postgres',  # Replace this with the Postgres username
    password='YOUR-PASSWORD',  # Replace this with your password
    dbname='postgres',  # Replace this with your database name
)

# [OPTION 2]
dbconn = PostgresDBConn('postgresql://USERNAME:PASSWORD@HOST:PORT/DATABASE')
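
Passwords that contain characters such as "@" or "/" need to be percent-encoded before they are placed in a connection URI. The sketch below shows one way to assemble such a string with the standard library; build_postgres_uri is a hypothetical helper, not an AISdb function, and it does not handle IPv6 hosts (which would need square brackets).

```python
from urllib.parse import quote


def build_postgres_uri(user, password, host, port, dbname):
    """Assemble a postgresql:// URI, percent-encoding user, password and database name."""
    # safe='' also encodes '/', '@' and ':' so they cannot be mistaken for URI delimiters
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


uri = build_postgres_uri("postgres", "p@ss/word", "127.0.0.1", 5432, "postgres")
print(uri)
```

The resulting string could then be passed as the single positional argument shown in OPTION 2.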

Attaching a SQLite database

Querying SQLite is as easy as providing the name of a ".db" file with the same entity-relationship model as the databases supported by AISdb, which are detailed in the SQL Database section.

from datetime import datetime

import aisdb

# Define a query against a SQLite database file
qry = aisdb.DBQuery(
    dbconn=aisdb.DBConn('/home/test_database.db'),  # new connection
    start=datetime(2020, 1, 15), end=datetime(2020, 1, 31),  # time range of interest
    xmin=-68, ymin=45, xmax=-56, ymax=51.5,  # Gulf of St. Lawrence
    callback=aisdb.database.sql_query_strings.in_bbox,  # callbacks for the data
)
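
To check that a ".db" file is a readable SQLite database before querying it, the tables it contains can be listed with the standard sqlite3 module. This is a minimal sketch independent of AISdb; list_tables is a hypothetical helper and demo_positions is a placeholder table name, not part of the AISdb schema.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted alphabetically."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Demonstration on a throwaway database file
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.db")
    with closing(sqlite3.connect(demo_path)) as conn:
        conn.execute("CREATE TABLE demo_positions (mmsi INTEGER, time INTEGER)")
        conn.commit()
    tables = list_tables(demo_path)

print(tables)
```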

Querying the Database

Parameters for the database query can be defined using aisdb.database.dbqry.DBQuery. Iterate over rows returned from the database for each vessel with aisdb.database.dbqry.DBQuery.gen_qry(). Convert the results into a generator yielding dictionaries with NumPy arrays describing position vectors, e.g., lon, lat, and time, using aisdb.track_gen.TrackGen().

The following query will return vessel positions from the past 48 hours:

import datetime as dt
import aisdb

with aisdb.DBConn('/home/test_database.db') as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn,
        dbpath='AIS.sqlitedb',
        callback=aisdb.database.sql_query_strings.in_timerange,
        start=dt.datetime.utcnow() - dt.timedelta(hours=48),
        end=dt.datetime.utcnow(),
    )
    
    vessels_generator = aisdb.TrackGen(qry.gen_qry())
    for vessel in vessels_generator:
        print(vessel)
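
To make the shape of the yielded dictionaries more concrete, the sketch below groups flat position rows into one dictionary per vessel. It is an illustration only: the (mmsi, time, lon, lat) row layout is an assumption, plain lists stand in for the NumPy arrays, and rows_to_tracks is a hypothetical name rather than the TrackGen implementation.

```python
from itertools import groupby
from operator import itemgetter


def rows_to_tracks(rows):
    """Group (mmsi, time, lon, lat) rows into one dictionary per vessel."""
    ordered = sorted(rows, key=itemgetter(0, 1))  # sort by vessel, then by time
    for mmsi, group in groupby(ordered, key=itemgetter(0)):
        points = list(group)
        yield {
            "mmsi": mmsi,
            "time": [p[1] for p in points],
            "lon": [p[2] for p in points],
            "lat": [p[3] for p in points],
        }


sample_rows = [
    (316000001, 1600000060, -63.55, 44.62),
    (316000002, 1600000000, -63.10, 44.40),
    (316000001, 1600000000, -63.57, 44.61),
]
tracks = list(rows_to_tracks(sample_rows))
for track in tracks:
    print(track)
```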

A specific region can be queried for AIS data using aisdb.gis.Domain or one of its subclasses to define a collection of shapely polygon features. For this example, the domain contains a single bounding box polygon derived from a longitude/latitude coordinate pair and radial distance specified in meters. If multiple features are included in the domain object, the domain boundaries will encompass the convex hull of all features contained within.

import aisdb
from datetime import datetime, timedelta

with aisdb.DBConn('/home/test_database.db') as dbconn:
    domain = aisdb.DomainFromPoints(points=[(-63.6, 44.6),], radial_distances=[5000,])
    qry = aisdb.DBQuery(
        dbconn=dbconn,
        dbpath='AIS.sqlitedb',
        callback=aisdb.database.sqlfcn_callbacks.in_bbox_time_validmmsi,
        start=datetime.utcnow() - timedelta(hours=48),
        end=datetime.utcnow(),
        xmin=domain.boundary['xmin'],
        xmax=domain.boundary['xmax'],
        ymin=domain.boundary['ymin'],
        ymax=domain.boundary['ymax'],
    )

    # iterate while the database connection is still open
    for vessel in aisdb.TrackGen(qry.gen_qry(), decimate=False):
        print(vessel)
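
For intuition about how a point and a radial distance in meters translate into a bounding box, the following sketch uses a simple spherical-Earth approximation. It is not the DomainFromPoints implementation; bbox_from_point is a hypothetical helper, and the approximation degrades near the poles and the antimeridian.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)


def bbox_from_point(lon, lat, radius_m):
    """Approximate lon/lat bounding box around a point for a radius given in meters."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    # a degree of longitude shrinks with the cosine of the latitude
    dlon = math.degrees(radius_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return {
        "xmin": lon - dlon,
        "xmax": lon + dlon,
        "ymin": lat - dlat,
        "ymax": lat + dlat,
    }


boundary = bbox_from_point(-63.6, 44.6, 5000)
print(boundary)
```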

Additional query callbacks for filtering by region, timeframe, identifier, etc. can be found in aisdb.database.sql_query_strings and aisdb.database.sqlfcn_callbacks.

Processing

Voyage Modelling

The generator described above can be input into a processing function, yielding modified results. For example, to model the activity of vessels on a per-voyage or per-transit basis, each voyage is defined as a continuous vector of vessel positions where the time between observed timestamps never exceeds a 24-hour period.

import aisdb
from datetime import datetime, timedelta

maxdelta = timedelta(hours=24)

with aisdb.DBConn('/home/test_database.db') as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn,
        dbpath='AIS.sqlitedb',
        callback=aisdb.database.sql_query_strings.in_timerange,
        start=datetime.utcnow() - timedelta(hours=48),
        end=datetime.utcnow(),
    )

    tracks = aisdb.TrackGen(qry.gen_qry(), decimate=False)
    track_segments = aisdb.split_timedelta(tracks, maxdelta)

    for segment in track_segments:
        print(segment)
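
The voyage definition above amounts to cutting a track wherever two consecutive timestamps are more than the maximum delta apart. A minimal standalone sketch of that idea follows; it assumes timestamps are epoch seconds stored in plain lists, and split_on_gaps is a hypothetical stand-in, not the aisdb.split_timedelta implementation.

```python
from datetime import timedelta

VECTOR_KEYS = ("time", "lon", "lat")  # per-position vectors; other keys are copied as-is


def slice_track(track, start, stop):
    """Copy a track, keeping only positions in the index range [start, stop)."""
    return {k: (v[start:stop] if k in VECTOR_KEYS else v) for k, v in track.items()}


def split_on_gaps(tracks, maxdelta):
    """Yield track segments, cut wherever the time gap exceeds maxdelta."""
    limit = maxdelta.total_seconds()
    for track in tracks:
        times = track["time"]
        start = 0
        for i in range(1, len(times)):
            if times[i] - times[i - 1] > limit:
                yield slice_track(track, start, i)
                start = i
        if times:
            yield slice_track(track, start, len(times))


track = {
    "mmsi": 316000001,
    "time": [0, 3600, 7200, 97200, 100800],  # 25-hour gap after the third position
    "lon": [-63.6, -63.5, -63.4, -63.0, -62.9],
    "lat": [44.6, 44.6, 44.7, 44.9, 45.0],
}
segments = list(split_on_gaps([track], timedelta(hours=24)))
print(len(segments))
```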

Data cleaning and MMSI deduplication

A common issue with AIS is that the data is noisy, and databases may contain multiple vessels broadcasting with the same identifier at the same time. The aisdb.denoising_encoder.encode_greatcircledistance() function uses an encoder to check the approximate distance between each vessel's consecutive positions, and then segments the resulting vectors wherever a surface vessel could not reasonably have traveled between them along the most direct path, e.g. at speeds above 50 knots. A distance threshold and a speed threshold act as hard limits on the maximum distance or time delta allowed between messages for them to be considered continuous. A score is computed for each position delta, with sequential messages in close proximity at shorter intervals given a higher score, calculated from the haversine distance divided by elapsed time. Any delta with a score below the minimum threshold is treated as the start of a new segment. Finally, the beginning of each new segment is compared to the end of each existing segment with a matching vessel identifier, and if the delta exceeds the minimum score, the segments are concatenated. If multiple existing trajectories meet the minimum score threshold, the new segment is concatenated to the existing segment with the highest score.

Processing functions may be executed in sequence as a processing chain or pipeline, so after segmenting the individual voyages as shown above, results can be input into the encoder to effectively remove noise and correct for vessels with duplicate identifiers.

import aisdb
from datetime import datetime, timedelta

maxdelta = timedelta(hours=24)
distance_threshold = 200000  # meters
speed_threshold = 50  # knots
minscore = 1e-6

with aisdb.DBConn('/home/test_database.db') as dbconn:
    qry = aisdb.DBQuery(
      dbconn=dbconn,
      dbpath='AIS.sqlitedb',
      callback=aisdb.database.sql_query_strings.in_timerange,
      start=datetime.utcnow() - timedelta(hours=48),
      end=datetime.utcnow(),
    )

    tracks = aisdb.TrackGen(qry.gen_qry())
    track_segments = aisdb.split_timedelta(tracks, maxdelta)
    tracks_encoded = aisdb.encode_greatcircledistance(track_segments, distance_threshold=distance_threshold, speed_threshold=speed_threshold, minscore=minscore)
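
To illustrate the distance and speed thresholds, the sketch below computes the haversine distance between two position reports and the average speed it implies. It only demonstrates the hard-limit check, not the scoring or segment concatenation performed by the encoder; the function names and the (time, lon, lat) tuple layout are assumptions, and zero or negative elapsed time is treated as implausible here.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)
MPS_TO_KNOTS = 3600 / 1852  # one knot is 1852 meters per hour


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_plausible(p1, p2, distance_threshold=200_000, speed_threshold=50):
    """Check two (time_seconds, lon, lat) fixes against distance (m) and speed (knots) limits."""
    elapsed = p2[0] - p1[0]
    if elapsed <= 0:
        return False
    dist = haversine_m(p1[1], p1[2], p2[1], p2[2])
    speed_knots = dist / elapsed * MPS_TO_KNOTS
    return dist <= distance_threshold and speed_knots <= speed_threshold


a = (0, -63.6, 44.6)
b = (600, -63.6, 44.65)  # about 5.6 km in 10 minutes, roughly 18 knots
c = (600, -63.6, 45.6)   # about 111 km in 10 minutes, far above 50 knots
print(is_plausible(a, b), is_plausible(a, c))
```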

In this second example, artificial noise is introduced into the tracks as an exaggerated demonstration of the denoising capability. The resulting cleaned tracks are then displayed in the web interface.

import os
from datetime import datetime

import aisdb
from aisdb import DBQuery, DBConn
from aisdb.gis import DomainFromTxts

from dotenv import load_dotenv

load_dotenv()

dbpath = os.environ.get('EXAMPLE_NOISE_DB', 'AIS.sqlitedb')
trafficDBpath = os.environ.get('AISDBMARINETRAFFIC', 'marinetraffic.db')
domain = DomainFromTxts('EastCoast', folder=os.environ.get('AISDBZONES'))

start = datetime(2021, 7, 1)
end = datetime(2021, 7, 2)

default_boundary = {'xmin': -180, 'xmax': 180, 'ymin': -90, 'ymax': 90}


def random_noise(tracks, boundary=default_boundary):
    for track in tracks:
        i = 1
        while i < len(track['time']):
            track['lon'][i] *= track['mmsi']
            track['lon'][i] %= (boundary['xmax'] - boundary['xmin'])
            track['lon'][i] += boundary['xmin']
            track['lat'][i] *= track['mmsi']
            track['lat'][i] %= (boundary['ymax'] - boundary['ymin'])
            track['lat'][i] += boundary['ymin']
            i += 2
        yield track


with DBConn('/home/test_database.db') as dbconn:
    vinfoDB = aisdb.webdata.marinetraffic.VesselInfo(trafficDBpath).trafficDB

    qry = DBQuery(
        dbconn=dbconn,
        dbpath=dbpath,
        start=start,
        end=end,
        callback=aisdb.database.sqlfcn_callbacks.in_bbox_time_validmmsi,
        **domain.boundary,
    )

    rowgen = qry.gen_qry(fcn=aisdb.database.sqlfcn.crawl_dynamic_static)

    tracks = aisdb.track_gen.TrackGen(rowgen, decimate=True)
    tracks = aisdb.webdata.marinetraffic.vessel_info(tracks, vinfoDB)
    tracks = random_noise(tracks, boundary=domain.boundary)
    tracks = aisdb.encode_greatcircledistance(tracks,
                                              distance_threshold=50000,
                                              minscore=1e-5,
                                              speed_threshold=50)

    if __name__ == '__main__':
        aisdb.web_interface.visualize(
            tracks,
            domain=domain,
            visualearth=True,
            open_browser=True,
        )

Interpolating, geofencing and filtering

Building on the above processing pipeline, the resulting cleaned trajectories can then be geofenced and filtered for results contained by at least one domain polygon, and interpolated for uniformity.

domain = aisdb.DomainFromPoints(points=[(-63.6, 44.6),], radial_distances=[5000,])
tracks_filtered = aisdb.track_gen.fence_tracks(tracks_encoded, domain)
tracks_interp = aisdb.interp_time(tracks_filtered, step=timedelta(minutes=15))

for segment in tracks_interp:
    print(segment)
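
Interpolating for uniformity means resampling each track at a fixed time step. The sketch below shows plain linear interpolation of lon and lat on a regular grid of timestamps. It assumes integer epoch seconds in ascending order and plain lists, does not handle the antimeridian, and interp_track is a hypothetical name rather than the aisdb.interp_time implementation.

```python
from bisect import bisect_right


def lerp(xs, ys, x):
    """Linearly interpolate y at x, given ascending xs and matching ys."""
    i = bisect_right(xs, x) - 1
    if i >= len(xs) - 1:
        return ys[-1]
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def interp_track(track, step_seconds):
    """Resample a track's lon/lat at a fixed time step, copying all other keys."""
    times = track["time"]
    new_times = list(range(times[0], times[-1] + 1, step_seconds))
    out = {k: v for k, v in track.items() if k not in ("time", "lon", "lat")}
    out["time"] = new_times
    for key in ("lon", "lat"):
        out[key] = [lerp(times, track[key], t) for t in new_times]
    return out


track = {"mmsi": 316000001, "time": [0, 1800], "lon": [-63.6, -63.4], "lat": [44.6, 44.8]}
resampled = interp_track(track, step_seconds=900)  # 15-minute step
print(resampled["time"])
```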

Additional processing functions can be found in the aisdb.track_gen module.

Exporting as CSV

The resulting processed voyage data can be exported in CSV format instead of being printed:

aisdb.write_csv(tracks_interp, 'ais_24h_processed.csv')
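
For cases where a custom column layout is needed, track dictionaries can also be flattened with the standard csv module. The sketch below is one possible layout, with one row per position; the column names and the tracks_to_csv helper are assumptions and do not reflect the output format of aisdb.write_csv.

```python
import csv
import io


def tracks_to_csv(tracks, fileobj):
    """Write one CSV row per position: mmsi, time, lon, lat."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["mmsi", "time", "lon", "lat"])
    for track in tracks:
        for t, lon, lat in zip(track["time"], track["lon"], track["lat"]):
            writer.writerow([track["mmsi"], t, lon, lat])


sample = [{"mmsi": 316000001, "time": [0, 900], "lon": [-63.6, -63.5], "lat": [44.6, 44.7]}]
buffer = io.StringIO()  # an open file object would work the same way
tracks_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```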

Integration with external metadata

AISdb supports integration with external data sources such as bathymetric charts and other raster grids.

Bathymetric charts

To determine the approximate ocean depth at each vessel position, the aisdb.webdata.bathymetry module can be used.

import aisdb

# set the data storage directory
data_dir = './testdata/'

# download bathymetry grid from the internet
bathy = aisdb.webdata.bathymetry.Gebco(data_dir=data_dir)
bathy.fetch_bathymetry_grid()

Once the data has been downloaded, the Gebco() class may be used to append bathymetric data to tracks in the context of a TrackGen processing pipeline in the same manner as the processing functions described above.

tracks = aisdb.TrackGen(qry.gen_qry())
tracks_bathymetry = bathy.merge_tracks(tracks)

Also see aisdb.webdata.shore_dist.ShoreDist for determining approximate nearest distance to shore from vessel positions.

Rasters

Similarly, arbitrary raster coordinate-gridded data may be appended to vessel tracks:

tracks = aisdb.TrackGen(qry.gen_qry())
raster_path = './GMT_intermediate_coast_distance_01d.tif'
raster = aisdb.webdata.load_raster.RasterFile(raster_path)
tracks = raster.merge_tracks(tracks, new_track_key="coast_distance")
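
Conceptually, appending raster data to a track means looking up the grid cell that contains each position. The sketch below does this for a small in-memory, north-up grid. It is a simplified illustration that assumes square cells and a known top-left corner; sample_grid and merge_grid are hypothetical names, not the RasterFile implementation.

```python
def sample_grid(grid, x_origin, y_origin, cell_size, lon, lat):
    """Return the value of the cell containing (lon, lat), or None if outside the grid.

    The grid is a list of rows, with its top-left corner at (x_origin, y_origin).
    """
    col = int((lon - x_origin) // cell_size)
    row = int((y_origin - lat) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


def merge_grid(track, grid, x_origin, y_origin, cell_size, new_track_key):
    """Return a copy of the track with one sampled grid value per position."""
    out = dict(track)
    out[new_track_key] = [
        sample_grid(grid, x_origin, y_origin, cell_size, lon, lat)
        for lon, lat in zip(track["lon"], track["lat"])
    ]
    return out


grid = [[10, 20], [30, 40]]  # 2x2 cells of one degree each, top-left corner at (-64, 45)
track = {"mmsi": 316000001, "lon": [-63.6, -62.5, -60.0], "lat": [44.6, 43.5, 44.0]}
merged = merge_grid(track, grid, -64, 45, 1, new_track_key="coast_distance")
print(merged["coast_distance"])
```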

Detailed metadata from marinetraffic.com

Visualization

AIS data from the database may be overlaid on a map such as the one shown above by using the aisdb.web_interface.visualize() function. This function accepts a generator of track dictionaries such as those output by aisdb.track_gen.TrackGen(). The color of each vessel track is determined by vessel type metadata.

import os
from datetime import datetime, timedelta

import aisdb
import aisdb.web_interface
from aisdb.tests.create_testing_data import (
    sample_database_file,
    random_polygons_domain,
)

domain = random_polygons_domain()

example_dir = 'testdata'
if not os.path.isdir(example_dir):
    os.mkdir(example_dir)

dbpath = os.path.join(example_dir, 'example_visualize.db')
months = sample_database_file(dbpath)
start = datetime(int(months[0][0:4]), int(months[0][4:6]), 1)
end = datetime(int(months[1][0:4]), int(months[1][4:6]) + 1, 1)


def color_tracks(tracks):
    ''' set the color of each vessel track using a color name or RGB value '''
    for track in tracks:
        track['color'] = 'red'  # or an RGB string such as 'rgb(255,0,0)'
        yield track


with aisdb.SQLiteDBConn('/home/test_database.db') as dbconn:
    qry = aisdb.DBQuery(
        dbconn=dbconn,
        dbpath=dbpath,
        start=start,
        end=end,
        callback=aisdb.sqlfcn_callbacks.valid_mmsi,
    )
    rowgen = qry.gen_qry()
    tracks = aisdb.track_gen.TrackGen(rowgen, decimate=False)
    tracks_segment = aisdb.track_gen.split_timedelta(tracks,
                                                     timedelta(weeks=4))
    tracks_colored = color_tracks(tracks_segment)

    if __name__ == '__main__':
        aisdb.web_interface.visualize(
            tracks_colored,
            domain=domain,
            visualearth=True,
            open_browser=True,
        )
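
A coloring generator like color_tracks above could also choose the color per vessel instead of using a single value. The sketch below maps a vessel type label to a color. The vessel_type key, the type labels, and the color choices are all hypothetical; the actual metadata key available on each track depends on how the tracks were generated.

```python
# Hypothetical mapping from a vessel type label to a color name or RGB string
TYPE_COLORS = {
    "Cargo": "green",
    "Tanker": "rgb(255,0,0)",
    "Fishing": "blue",
}
DEFAULT_COLOR = "grey"  # used when the type is missing or not in the mapping


def color_by_type(tracks, type_key="vessel_type"):
    """Set track['color'] from a vessel type label stored under type_key."""
    for track in tracks:
        track["color"] = TYPE_COLORS.get(track.get(type_key), DEFAULT_COLOR)
        yield track


sample_tracks = [
    {"mmsi": 316000001, "vessel_type": "Cargo"},
    {"mmsi": 316000002, "vessel_type": "Tanker"},
    {"mmsi": 316000003},
]
colored = list(color_by_type(sample_tracks))
print([t["color"] for t in colored])
```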
