Data querying with AISdb involves setting up a connection to the database, defining query parameters, creating and executing the query, and processing the results. In the previous tutorial, Database Loading, we set up a database connection and made simple queries and visualizations. This tutorial digs deeper into the data query functions and parameters and shows the kinds of queries you can make with AISdb.
Data querying with AISdb includes two components: DBQuery and TrackGen. In this section, we will introduce each component with examples. Before starting data querying, please ensure you have connected to the database. If you have not done so, please follow the instructions and examples in Database Loading or Quick Start.
The DBQuery class is used to create a query object that specifies the parameters for data retrieval, including the time range, spatial domain, and any filtering callbacks. Here is an example that creates a DBQuery object and uses its parameters to specify the time range and geographical location:
Callback functions are used in the DBQuery class to filter data based on specific criteria. Some common callbacks include: in_bbox, in_time_bbox, valid_mmsi, and in_time_bbox_validmmsi. These callbacks ensure that the data retrieved matches the specific criteria defined in the query. Please find examples of using different callbacks with other parameters in Query types with practical examples.
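To make the role of callbacks concrete, the sketch below imitates their filtering logic on a few in-memory records using plain Python. It is not AISdb's implementation: the helper names mirror the callback names only for readability, the records are made up, and the MMSI range used in valid_mmsi is an assumption about what counts as a valid vessel identifier.

```python
from datetime import datetime, timezone

# Made-up records for illustration only: (mmsi, epoch seconds, lon, lat).
RECORDS = [
    (316001234, 1514765000, -63.55, 44.63),  # inside the box and the time window
    (316001234, 1514900000, -63.50, 44.60),  # inside the box, after the window
    (123, 1514766000, -63.57, 44.64),        # malformed MMSI
    (219005678, 1514767000, -52.70, 47.56),  # outside the box
]


def in_timerange(rec, start, end):
    """True when the record timestamp falls within [start, end]."""
    return start <= rec[1] <= end


def in_bbox(rec, xmin, xmax, ymin, ymax):
    """True when the record position lies inside the bounding box."""
    return xmin <= rec[2] <= xmax and ymin <= rec[3] <= ymax


def valid_mmsi(rec):
    """True when the MMSI falls in the assumed valid range (an assumption)."""
    return 201000000 <= rec[0] < 776000000


def in_time_bbox_validmmsi(rec, start, end, xmin, xmax, ymin, ymax):
    """Combine the three checks, as the combined callback name suggests."""
    return (
        in_timerange(rec, start, end)
        and in_bbox(rec, xmin, xmax, ymin, ymax)
        and valid_mmsi(rec)
    )


# Query window and bounding box (hypothetical values).
start = int(datetime(2018, 1, 1, tzinfo=timezone.utc).timestamp())
end = int(datetime(2018, 1, 2, tzinfo=timezone.utc).timestamp())
box = dict(xmin=-64.0, xmax=-63.0, ymin=44.0, ymax=45.0)

selected = [r for r in RECORDS if in_time_bbox_validmmsi(r, start, end, **box)]
print(selected)
```

Only the first record passes all three checks. Swapping in a single predicate such as in_bbox would widen the selection, which is the same effect as choosing a less restrictive callback.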
gen_qry
gen_qry is a method of the DBQuery class in AISdb. It generates rows of data that match the query criteria specified when the DBQuery object was created. The method acts as a generator, yielding one row at a time, so large datasets can be processed without loading every row into memory at once.
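The benefit of a row generator can be illustrated without AISdb. The sketch below uses Python's built-in sqlite3 module with an in-memory table. The table name positions, its columns, and the helper gen_rows are hypothetical; they are unrelated to AISdb's actual schema and to how gen_qry is implemented.

```python
import sqlite3


def gen_rows(conn, start, end):
    """Yield matching rows one at a time instead of building a full list."""
    cur = conn.execute(
        "SELECT mmsi, time, lon, lat FROM positions "
        "WHERE time BETWEEN ? AND ? ORDER BY mmsi, time",
        (start, end),
    )
    for row in cur:  # rows are pulled from the cursor as we iterate
        yield dict(row)


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dictionary
conn.execute(
    "CREATE TABLE positions (mmsi INTEGER, time INTEGER, lon REAL, lat REAL)"
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?, ?)",
    [
        (316001234, 100, -63.55, 44.63),
        (316001234, 160, -63.54, 44.64),
        (219005678, 130, -63.60, 44.60),
        (219005678, 900, -63.61, 44.61),  # outside the queried window
    ],
)

rowgen = gen_rows(conn, start=0, end=500)
first = next(rowgen)      # only one row has been produced so far
remaining = list(rowgen)  # consume the rest of the generator
print(first)
print(len(remaining))
```

Calling next retrieves a single row, and the rest are produced only when the generator is consumed further.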
After creating the DBQuery object, we can generate rows with gen_qry:
Each row from gen_qry is a tuple or dictionary representing a record in the database.
The TrackGen class converts the generated rows from gen_qry into tracks (trajectories). It takes the row generator and, optionally, a decimate parameter to control point reduction. This conversion is essential for analyzing vessel movements, identifying patterns, and visualizing trajectories in later steps.
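The idea of turning rows into tracks can be sketched in plain Python. This is a simplified stand-in, not AISdb's TrackGen: the function rows_to_tracks and the track keys shown here are assumptions chosen for illustration, point reduction is left out, and the rows are expected to arrive sorted by MMSI and then by time.

```python
from itertools import groupby
from operator import itemgetter


def rows_to_tracks(rows):
    """Group consecutive rows sharing an MMSI into one track dictionary each."""
    for mmsi, group in groupby(rows, key=itemgetter("mmsi")):
        points = list(group)
        yield {
            "mmsi": mmsi,
            "time": [p["time"] for p in points],
            "lon": [p["lon"] for p in points],
            "lat": [p["lat"] for p in points],
        }


# Made-up rows, already sorted by MMSI and then by time.
rows = [
    {"mmsi": 219005678, "time": 130, "lon": -63.60, "lat": 44.60},
    {"mmsi": 316001234, "time": 100, "lon": -63.55, "lat": 44.63},
    {"mmsi": 316001234, "time": 160, "lon": -63.54, "lat": 44.64},
]

tracks = list(rows_to_tracks(rows))
for track in tracks:
    print(track["mmsi"], len(track["time"]))
```

Because rows_to_tracks is itself a generator, tracks are built one vessel at a time; the call to list here is only for inspecting the result.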
Continuing from the rows generated above, here is how to use the TrackGen class:
The TrackGen class returns a generator object, referred to here as "tracks". While iterating over tracks, each item is a dictionary representing the track of a specific vessel:
This is the output with our sample data:
In this section, we will provide practical examples of the most common query types you can make using the DBQuery class, including querying within a time range, querying within a geographical area, and tracking vessels by MMSI. Different queries can be achieved by changing the callback and the other parameters defined in the DBQuery class. Then, we will use TrackGen to convert these query results into structured tracks for further analysis and visualization.
First, we need to import the necessary packages and prepare data:
Querying data within a specified time range can be done by using the in_timerange_validmmsi callback in the DBQuery class:
This will display the queried vessel tracks (within the time range and with a valid MMSI) on the map:
You may find noise in some of the track data. In Data Cleaning, we introduced the de-noising methods in AISdb that can effectively remove unreasonable or erroneous data points, ensuring more accurate and reliable vessel trajectories.
In practical scenarios, people may have specific points or areas of interest. DBQuery includes parameters to define a bounding box and has relevant callbacks. Let's look at an example:
This will show all the vessel tracks with valid MMSI in the defined bounding box:
In the above examples, we queried data within a time range and within a geographical area. If you want to combine multiple query criteria, please check the available callback types in the API Docs. In the last example above, we can simply change the callback type to obtain vessel tracks within both the time range and the geographical area:
The displayed vessel tracks:
In addition to time and location ranges, you can track one or more vessels of interest by specifying their MMSIs in the query. Here is an example of tracking several vessels within a time range: