Question:
I have several tracked specimens that can move between different areas throughout the day (and its periods). In the example below the calculated fieldmov
has to restart counting whenever there is a change of specimen or day. If the period
varies but specimen
, day
and area
still the same then mov
= previous mov
. If both specimen
and day
remain the same and only the area
changes then mov
= previous mov
+ 1.Like this:

My questions are:
- How to copy the previous value of
mov
when period varies butspecimen
,day
andarea
remain the same? - How to increment the value when both
specimen
andday
remain the same and only thearea
changes?
I’m using PostgreSQL 12.11 (Ubuntu 12.11-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu.
Here is what I did so far:
DROP TABLE IF EXISTS mytable;
CREATE TABLE mytable(
specimen INTEGER,
day INTEGER,
period INTEGER,
area INTEGER
);
INSERT INTO mytable (specimen,day,period,area)
VALUES
(1,1,1,1),
(1,1,2,1),
(1,1,3,2),
(1,1,4,2),
(1,1,5,3),
(1,1,6,2),
(1,2,1,2),
(1,2,2,1),
(1,2,3,2),
(1,2,4,3),
(1,2,5,2),
(1,2,6,1),
(2,1,1,3),
(2,1,2,3),
(2,1,3,1);
SELECT
*,
CASE
WHEN previous_specimen Is NULL THEN 1
WHEN specimen != previous_specimen THEN 1
WHEN specimen = previous_specimen AND day != previous_day THEN 1
-- WHEN specimen = previous_specimen AND day = previous_day AND area = previous_area THEN LAG(mov,1) OVER (ORDER BY specimen,day,period,area) -- **repeat previous value**
-- WHEN specimen = previous_specimen AND day = previous_day AND area != previous_area THEN LAG(mov,1) OVER (ORDER BY specimen,day,period,area) + 1 -- **add 1 to previous value**
ELSE NULL
END AS mov
FROM (
SELECT
*,
LAG(specimen,1) OVER (ORDER BY specimen,day,period,area) previous_specimen,
LAG(day,1) OVER (ORDER BY specimen,day,period,area) previous_day,
LAG(area,1) OVER (ORDER BY specimen,day,period,area) previous_area
FROM mytable
) t1;
Answer:
This simpler query should do it:SELECT specimen, day, period, area
, count(*) FILTER (WHERE step)
OVER (PARTITION BY specimen, day ORDER BY period) AS mov
FROM (
SELECT *
, lag(area) OVER (PARTITION BY specimen, day ORDER BY period) <> area AS step
FROM tbl
) sub
ORDER BY specimen, day; -- optional
If period
is an incremented number without gaps for every (specimen, day)
, this is equivalent:SELECT t1.*
, count(*) FILTER (WHERE t1.area <> t2.area)
OVER (PARTITION BY t1.specimen, t1.day ORDER BY t1.period) AS mov
FROM tbl t1
LEFT JOIN tbl t2 ON t2.specimen = t1.specimen
AND t2.day = t1.day
AND t2.period = t1.period - 1
;
db<>fiddle hereNot sure which is faster. An index on
(specimen, day, period)
would help (a lot) either way.See:
About the aggregate
FILTER
clause:Should be fastest:
If you have better answer, please add a comment about this, thank you!