Quick Start
A hands-on quick start guide for using AISdb.
Python Environment and Installation
To work with the AISdb Python package, please ensure that you have Python version 3.8 or higher. If you plan to use SQLite, no additional installation is required, as it is included with Python by default. However, for those who prefer using a PostgreSQL server, it will need to be installed separately.
The AISdb Python package can be conveniently installed using pip. It's highly recommended that a virtual Python environment be created and the package installed within it.
Alternatively, you may also use AISdb on Docker. Regardless of the installation procedure you decide to use, you can test your installation by running the following commands:
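One way to run such a check from Python is sketched below. It uses only the standard library; the helper name check_environment is hypothetical and not part of AISdb, and the minimum version simply mirrors the Python 3.8 requirement stated above.

```python
import importlib.util
import sys


def check_environment(package="aisdb", min_python=(3, 8)):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # find_spec returns None when a top-level package cannot be imported
    # from the interpreter that is running this script.
    if importlib.util.find_spec(package) is None:
        problems.append(f"package '{package}' is not installed in this environment")
    return problems


for problem in check_environment():
    print("WARNING:", problem)
```

Running the script inside the virtual environment (or inside a Jupyter kernel) shows whether that particular interpreter can see the package.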
If you are running Jupyter, ensure that it is installed in the same environment as AISdb:
The Python code in the rest of this document can be run in the Python environment you created.
To use nightly builds (optional), you can install AISdb from source:
We may introduce new changes on different branches; however, the master branch contains changes that have passed testing and is generally more stable.
Database Handling
Connecting to a Postgres database
This option requires the optional dependency psycopg for interfacing with Postgres databases. Note that Postgres accepts these keyword arguments. Alternatively, a connection string may be used. Information on connection strings and Postgres URI format can be found here.
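As an illustration of the URI form of a connection string, the sketch below assembles one with the standard library. The helper name postgres_uri and all credential values are placeholders, not part of AISdb or psycopg; how the resulting string is passed to the connection object depends on the AISdb version in use.

```python
from urllib.parse import quote


def postgres_uri(user, password, host="localhost", port=5432, dbname="postgres"):
    """Build a postgresql:// URI, percent-encoding user, password, and dbname."""
    # safe="" makes quote() encode '@', ':' and '/' as well, so these
    # characters inside a password cannot be mistaken for URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


# Placeholder credentials for illustration only.
print(postgres_uri("ais_user", "p@ss/word", dbname="aisdb"))
```

Percent-encoding matters mainly for passwords, which often contain reserved characters.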
Attaching a SQLite database
Querying SQLite only requires providing the name of a ".db" file with the same entity-relationship model as the databases supported by AISdb, which are detailed in the SQL Database section. We prepared an example SQLite database, example_data.db, based on two months of AIS data (01/01/2022 - 03/01/2022) from Marine Cadastre, which is available in this Tutorial GitHub repository.
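Before querying a ".db" file, it can help to confirm which tables it contains. The sketch below does this with Python's built-in sqlite3 module rather than with AISdb itself; the helper name list_tables and the demo table are hypothetical, and the actual table names in an AISdb database are described in the SQL Database section.

```python
import sqlite3
from contextlib import closing


def list_tables(conn):
    """Return the names of all tables in an open SQLite connection, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Demo on a throwaway in-memory database. For a real file, connect with
# sqlite3.connect("example_data.db") instead.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE demo_table (mmsi INTEGER, time INTEGER)")
    print(list_tables(conn))
```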
If you want to create your own database using your data, we have a tutorial with examples that shows you how to create an SQLite database from open-source data.
Querying the Database
Parameters for the database query can be defined using aisdb.database.dbqry.DBQuery. Iterate over rows returned from the database for each vessel with aisdb.database.dbqry.DBQuery.gen_qry(). Convert the results into generator-yielding dictionaries with NumPy arrays describing position vectors, e.g., lon, lat, and time, using aisdb.track_gen.TrackGen().
The following query will return vessel trajectories from a given 1-hour time window:
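A minimal sketch of such a query, assuming a SQLite database file that follows the AISdb schema. The class and function names follow those mentioned in this guide, but the keyword arguments, the in_timerange_validmmsi callback, and the track keys "mmsi" and "time" are assumptions that may differ between AISdb versions. The helper names one_hour_window and print_tracks are hypothetical.

```python
from datetime import datetime, timedelta


def one_hour_window(start):
    """Return (start, end) covering exactly one hour from start."""
    return start, start + timedelta(hours=1)


def print_tracks(dbpath, start):
    """Query one hour of data and print a summary line per vessel track."""
    # Imported inside the function so that one_hour_window can be used
    # even where AISdb is not installed.
    import aisdb

    start_time, end_time = one_hour_window(start)
    with aisdb.SQLiteDBConn(dbpath=dbpath) as dbconn:  # assumed signature
        qry = aisdb.DBQuery(
            dbconn=dbconn,
            start=start_time,
            end=end_time,
            # Assumed callback: keeps rows in the time range with valid MMSIs.
            callback=aisdb.database.sqlfcn_callbacks.in_timerange_validmmsi,
        )
        rowgen = qry.gen_qry()
        for track in aisdb.track_gen.TrackGen(rowgen, decimate=False):
            print(track["mmsi"], len(track["time"]))
```

Under these assumptions, calling print_tracks("example_data.db", datetime(2022, 1, 1)) would print one line per vessel observed during the first hour of the example data.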
A specific region can be queried for AIS data using aisdb.gis.Domain or one of its sub-classes to define a collection of shapely polygon features. For this example, the domain contains a single bounding box polygon derived from a longitude/latitude coordinate pair and radial distance specified in meters. If multiple features are included in the domain object, the domain boundaries will encompass the convex hull of all features.
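To make the idea of a bounding box derived from a coordinate pair and a radial distance concrete, the sketch below computes one with a spherical-Earth approximation. It is an illustration only and not how AISdb constructs its domains; the helper name bounding_box and the example coordinates are hypothetical, and the approximation breaks down near the poles and across the antimeridian.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def bounding_box(lon, lat, radius_m):
    """Approximate (min_lon, min_lat, max_lon, max_lat) around a point."""
    # Angular size of the radius along a meridian, in degrees.
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    # Meridians converge toward the poles, so the same ground distance
    # spans more degrees of longitude at higher latitudes.
    dlon = dlat / math.cos(math.radians(lat))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


# Example: a 50 km box around an arbitrary point.
print(bounding_box(-63.57, 44.65, 50_000))
```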
Additional query callbacks for filtering by region, timeframe, identifier, etc. can be found in aisdb.database.sql_query_strings and aisdb.database.sqlfcn_callbacks.
Processing
Voyage Modelling
The above generator can be input into a processing function, yielding modified results. For example, to model the activity of vessels on a per-voyage or per-transit basis, each voyage is defined as a continuous vector of vessel positions where the time between observed timestamps never exceeds a 24-hour period.
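The voyage definition above can be sketched in plain Python as follows. This is an illustration of the splitting rule on a list of timestamps, not AISdb's own implementation; the helper name split_on_gaps is hypothetical, and timestamps are assumed to be sorted epoch seconds.

```python
from datetime import timedelta


def split_on_gaps(timestamps, max_gap=timedelta(hours=24)):
    """Split sorted epoch-second timestamps into runs with no gap above max_gap."""
    limit = max_gap.total_seconds()
    segments = []
    for t in timestamps:
        # Continue the current segment while the gap stays within the limit;
        # otherwise start a new segment (a new voyage).
        if segments and t - segments[-1][-1] <= limit:
            segments[-1].append(t)
        else:
            segments.append([t])
    return segments


HOUR = 3600
# The 29-hour gap between the second and third message starts a new voyage.
print(split_on_gaps([0, 1 * HOUR, 30 * HOUR, 31 * HOUR]))
```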
Data cleaning and MMSI deduplication
A common problem with AIS data is noise, where multiple vessels might broadcast using the same identifier simultaneously. AISdb integrates data cleaning techniques to denoise the vessel track data; for details:
(1) Denoising with Encoder: The aisdb.denoising_encoder.encode_greatcircledistance() function checks the approximate distance between each of a vessel's consecutive positions. It separates vectors wherever a vessel could not reasonably have traveled between positions by the most direct path, for example when the implied speed exceeds 50 knots.
(2) Distance and Speed Thresholds: A distance and speed threshold limits the maximum distance or time between messages that can be considered continuous.
(3) Scoring and Segment Concatenation: A score is computed for each position delta, with sequential messages nearby at shorter intervals given a higher score. This score is calculated by dividing the Haversine distance by elapsed time. Any deltas with a score not reaching the minimum threshold are considered the start of a new segment. New segments are compared to the end of existing segments with the same vessel identifier; if the score exceeds the minimum, they are concatenated. If multiple segments meet the minimum score, the new segment is concatenated to the existing segment with the highest score.
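The distance and speed checks described in this list can be illustrated with a simplified sketch. It computes the Haversine distance between consecutive positions and starts a new segment whenever the implied speed exceeds a threshold. It does not reproduce AISdb's scoring or segment concatenation; the helper names, the input layout of (time, lon, lat) tuples, and the 50-knot default are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in meters
KNOT_IN_MS = 0.514444        # one knot expressed in meters per second


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def split_on_speed(points, max_knots=50):
    """Split time-sorted (time_s, lon, lat) tuples where implied speed is too high."""
    segments = []
    for point in points:
        if segments:
            t0, lon0, lat0 = segments[-1][-1]
            t1, lon1, lat1 = point
            elapsed = t1 - t0
            distance = haversine_m(lon0, lat0, lon1, lat1)
            # Keep the point in the current segment only if time advanced
            # and the implied speed is within the threshold.
            if elapsed > 0 and distance / elapsed <= max_knots * KNOT_IN_MS:
                segments[-1].append(point)
                continue
        segments.append([point])
    return segments
```

In this sketch a message with zero elapsed time also starts a new segment, which is one simple way to separate two transmitters sharing an identifier.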
Processing functions may be executed in sequence as a processing chain or pipeline, so after segmenting the individual voyages as shown above, results can be input into the encoder to remove noise and correct for vessels with duplicate identifiers effectively.
Interpolating, geofencing, and filtering
Building on the above processing pipeline, the resulting cleaned trajectories can be geofenced and filtered for results contained by at least one domain polygon and interpolated for uniformity.
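The interpolation step can be pictured as resampling each coordinate at a fixed time interval. The sketch below shows plain linear interpolation for one coordinate; it is not AISdb's interpolation routine and does not cover geofencing. The helper name interp_uniform is hypothetical, and the input times are assumed to be non-empty and strictly increasing.

```python
def interp_uniform(times, values, step):
    """Linearly resample values every `step` seconds from times[0] to times[-1]."""
    if len(times) == 1:
        return [times[0]], [values[0]]
    out_t, out_v = [], []
    i = 0
    t = times[0]
    while t <= times[-1]:
        # Advance to the interval [times[i], times[i + 1]] that contains t.
        while times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[i] + frac * (values[i + 1] - values[i]))
        t += step
    return out_t, out_v


# Resample irregular latitude observations onto a 5-second grid.
print(interp_uniform([0, 10, 30], [44.0, 44.1, 44.3], 5))
```

The same function would be applied separately to longitude and latitude so that every track shares the same time spacing.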
Additional processing functions can be found in the aisdb.track_gen module.
Exporting as CSV
The resulting processed voyage data can be exported in CSV format instead of being printed:
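As a rough picture of what such an export involves, the sketch below writes one row per position using Python's csv module. It is not AISdb's export function; the helper name write_tracks_csv, the column layout, and the assumption that each track dictionary has "mmsi", "time", "lon", and "lat" keys are illustrative.

```python
import csv
import io


def write_tracks_csv(tracks, fileobj):
    """Write one CSV row per position from an iterable of track dictionaries."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["mmsi", "time", "lon", "lat"])
    for track in tracks:
        for t, lon, lat in zip(track["time"], track["lon"], track["lat"]):
            writer.writerow([track["mmsi"], t, lon, lat])


# Demo with made-up data, written to an in-memory buffer instead of a file.
demo_tracks = [{"mmsi": 123456789, "time": [0, 60], "lon": [-63.5, -63.6], "lat": [44.6, 44.7]}]
buffer = io.StringIO()
write_tracks_csv(demo_tracks, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open(path, "w", newline="") as the second argument.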
Integration with external metadata
AISdb supports integrating external data sources such as bathymetric charts and other raster grids.
Bathymetric charts
To determine the approximate ocean depth at each vessel position, the aisdb.webdata.bathymetry module can be used.
Once the data has been downloaded, the Gebco() class may be used to append bathymetric data to tracks in the context of a TrackGen() processing pipeline like the processing functions described above.
Also, see aisdb.webdata.shore_dist.ShoreDist for determining the approximate nearest distance to shore from vessel positions.
Rasters
Similarly, arbitrary raster coordinate-gridded data may be appended to vessel tracks.
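Conceptually, appending gridded data means looking up the grid cell that contains each position. The sketch below shows that lookup for a small north-up grid stored as nested lists. It is an illustration rather than AISdb's raster interface; the helper names, the "raster_value" key, and the grid layout (origin at the north-west corner, square cells in degrees) are assumptions.

```python
def raster_index(lon, lat, origin_lon, origin_lat, cell_deg, n_rows, n_cols):
    """Map a coordinate to (row, col) in a north-up grid, or None if outside."""
    col = int((lon - origin_lon) // cell_deg)
    row = int((origin_lat - lat) // cell_deg)  # rows count southward from the top
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None


def append_raster_values(track, grid, origin_lon, origin_lat, cell_deg, key="raster_value"):
    """Return a copy of track with one grid value (or None) per position."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for lon, lat in zip(track["lon"], track["lat"]):
        idx = raster_index(lon, lat, origin_lon, origin_lat, cell_deg, n_rows, n_cols)
        values.append(grid[idx[0]][idx[1]] if idx is not None else None)
    return {**track, key: values}


# Demo: a 2x2 grid of 1-degree cells whose north-west corner is at (0, 2).
grid = [[1, 2], [3, 4]]
track = {"lon": [0.5, 1.5, 5.0], "lat": [1.5, 0.5, 5.0]}
print(append_raster_values(track, grid, 0.0, 2.0, 1.0)["raster_value"])
```

Positions that fall outside the grid receive None instead of raising an error, which keeps a processing pipeline running when a track leaves the raster's coverage.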
Detailed metadata from marinetraffic.com
Visualization
AIS data from the database may be overlaid on a map such as the one shown above using the aisdb.web_interface.visualize() function. This function accepts a generator of track dictionaries such as those output by aisdb.track_gen.TrackGen().