Brief Introduction
To support efficiently ST-KNN join over huge amount of Spatial-temporal data with any geometry type,
we propose a novel distributed solution based on Apache Spark. Specifically, our method adopts a
two-round join framework. Furthermore, we propose an automatic knob tuning framework based on Bayesian optimization, which determines good values of introduced parameters. The experimental results show the efficiency and effectiveness of our proposed method.
Framework of ST-kNN Join Executor
Framework of Automatic Knob Tuning
Key Features
(1) Scalability
Our algorithm supports large number of spatio-temporal data, but requires little for the clusters.
(2) Efficiency
In our experimental settings, our algorithm shows a competitive performance and is
much more scalable than two state-of-the-art distributed spatio-temporal data management
systems (i.e., LocationSpark and Simba).
(3) Various spatial and temporal types supported
Users are able to execute st-kNN join in different spatio-temporal situations.
-
Various geometry types (e.g Point, Linestring, Polygon) are supported.
-
Both time range and time point are supported.
(4) Easy of Use
Our algorithm has been integrated into JUST (JD Urban Spatio-Temporal Data Engine),
which incorporates a complete SQL engine, and presets plenty of out-of-the-box
spatio-temporal analysis functions. ST-KNN join can be used in this system with JustQL.
Download
System Demonstration
Execution Step
Users can input a JustQL in the JustQL Panel
, then click the button 运行
(means Run
). The execution result will display in the Result Panel
.
SQL Syntax for ST-kNN join
We can perform ST-KNN join with a SQL-like statement:
SELECT * FROM R, S WHERE st_knnjoin(R.geom, S.geom, R.t_min, R.t_max, S.t_min, S.t_max, k, delta)
where R and S are the names of two spatio-temporal tables; R.geom and S.geom are the spatial field names of these two tables; and R.t_min, R.t_max, S.t_min, S.t_max are the temporal field names of the two tables.
Notice!!!
This demonstration is based on a dsitributed environment. JUST SQL engine can change to the distributed execution engine intelligently for ST-kNN join. You can also change the execution engine to 分布式
(means distributed
) manually, as shown in the following picture.
Follow Me!
Now, let's experimence ST-kNN join by executing the following statements one by one.
Data Preparing
1. Create tables:
CREATE TABLE point01 (
pt01 point,
pt_st01 timestamp,
pt_et01 timestamp
);
CREATE TABLE point02 (
pt02 point,
pt_st02 timestamp,
pt_et02 timestamp
);
CREATE TABLE linestring01 (
ls01 linestring,
ls_st01 timestamp,
ls_et01 timestamp
);
CREATE TABLE linestring02 (
ls02 linestring,
ls_st02 timestamp,
ls_et02 timestamp
);
CREATE TABLE polygon01 (
pg01 polygon,
pg_st01 timestamp,
pg_et01 timestamp
);
CREATE TABLE polygon02 (
pg02 polygon,
pg_st02 timestamp,
pg_et02 timestamp
);
2. Load Data:
LOAD HDFS:"/sigspatial_data/point01.txt" TO JUST: point01 (
pt01 st_geomfromwkt(0),
pt_st01 to_timestamp(1),
pt_et01 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
LOAD HDFS:"/sigspatial_data/point02.txt" TO JUST: point02 (
pt02 st_geomfromwkt(0),
pt_st02 to_timestamp(1),
pt_et02 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
LOAD HDFS:"/sigspatial_data/linestring01.txt" TO JUST: linestring01 (
ls01 st_geomfromwkt(0),
ls_st01 to_timestamp(1),
ls_et01 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
LOAD HDFS:"/sigspatial_data/linestring02.txt" TO JUST: linestring02 (
ls02 st_geomfromwkt(0),
ls_st02 to_timestamp(1),
ls_et02 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
LOAD HDFS:"/sigspatial_data/polygon01.txt" TO JUST: polygon01 (
pg01 st_geomfromwkt(0),
pg_st01 to_timestamp(1),
pg_et01 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
LOAD HDFS:"/sigspatial_data/polygon02.txt" TO JUST: polygon02 (
pg02 st_geomfromwkt(0),
pg_st02 to_timestamp(1),
pg_et02 to_timestamp(2)
) WITH (
"just.separator" = "|",
"just.header" = "false"
)
Point-Point ST-KNN Join
ST-KNN join has been supported in JUST (JD Urban Spatio-Temporal Data Engine). You can join any two tables created before, including join themselves.
In this section, the algorithm extracts the k Nearest Cars for each pedestrain location.
SELECT
*
FROM
point01 t1,
point02 t2
WHERE
st_knnjoin(
t1.pt01,
t2.pt02,
t1.pt_st01,
t1.pt_et01,
t2.pt_st02,
t2.pt_et02,
2,
100
)
LIMIT
200
Linestring-Linestring ST-KNN Join
In this section, the algorithm extracts the k Nearest Car trajecotries for each pedestrain trajectory.
SELECT
*
FROM
linestring01 t1,
linestring02 t2
WHERE
st_knnjoin(
t1.ls01,
t2.ls02,
t1.ls_st01,
t1.ls_et01,
t2.ls_st02,
t2.ls_et02,
2,
100
)
LIMIT
200
Polygon-Polygon ST-KNN Join
In this section, the algorithm extracts the k Nearest Car stay point area (polygon) for each pedestrain stay point area (polygon).
SELECT
*
FROM
point01 t1,
polygon01 t2
WHERE
st_knnjoin(
t1.pt01,
t2.pg01,
t1.pt_st01,
t1.pt_et01,
t2.pg_st01,
t2.pg_et01,
2,
100
)
LIMIT
200
Polygon-Linestring ST-KNN Join
In this section, the algorithm extracts the k Nearest Car stay point area (polygon) for each pedestrain trajectory.
SELECT
*
FROM
polygon01 t1,
linestring01 t2
WHERE
st_knnjoin(
t1.pg01,
t2.ls01,
t1.pg_st01,
t1.pg_et01,
t2.ls_st01,
t2.ls_et01,
2,
100
)
LIMIT
200
Data Elimination
Drop tables:
DROP table point01;
DROP table point02;
DROP table linestring01;
DROP table linestring02;
DROP table polygon01;
DROP table polygon02;