MySQL - Group Data By 5, 10, 15, 20, or 30 Minute Intervals
When aggregating data for high traffic split tests, the business stakeholders often ask data analysts to group and report data by 15 minute intervals. It is not detailed enough to group data hourly as high volume split tests need to be monitored closely for any critical issue that might have significant negative impact on sales or other metrics.
In today's post, we'll walk through a few SQL snippets to round Date/Time records to the 15 minute increment and group them by 15 minute intervals.
If you are using Microsoft SQL Server database engine, you can visit SQL Server - Group Data By 5, 10, 15, 20, or 30 Minute Intervals to check out T-SQL scripts.
Example Data
create table login_logs (
account_id INT,
login_datetime DATETIME
);
insert into login_logs (account_id, login_datetime) values (1, '2022-03-02 02:05:02');
insert into login_logs (account_id, login_datetime) values (2, '2022-03-02 02:09:07');
insert into login_logs (account_id, login_datetime) values (3, '2022-03-02 02:14:07');
insert into login_logs (account_id, login_datetime) values (4, '2022-03-02 02:16:07');
insert into login_logs (account_id, login_datetime) values (5, '2022-03-02 02:27:07');
insert into login_logs (account_id, login_datetime) values (6, '2022-03-02 02:28:07');
insert into login_logs (account_id, login_datetime) values (7, '2022-03-02 02:30:00');
insert into login_logs (account_id, login_datetime) values (8, '2022-03-02 02:36:02');
insert into login_logs (account_id, login_datetime) values (9, '2022-03-02 02:40:07');
insert into login_logs (account_id, login_datetime) values (10, '2022-03-02 02:46:17');
insert into login_logs (account_id, login_datetime) values (11, '2022-03-02 02:54:27');
insert into login_logs (account_id, login_datetime) values (12, '2022-03-02 02:58:10');
Use
DB-Fiddle to execute SQL scripts on sample data.
Group By 15 Minute Interval - Round Up Date/Time
If the business would like to round up each login_datetime value to 15 minute interval, we can apply the following logics.
- Use
TIMESTAMPDIFF
function to calculate the minute difference between the login_datetime and MySQL begin_datetime '1000-01-01 00:00:00' - Divide the calculated minute difference by 15
- Invoke
CEILING
function to return the smallest integer value that is greater than or equal to the division result - Multiply the ceiling result by 15 to get the round up minute difference value
- Finally, add the round up minute difference value to MySQL begin_datetime to get the new round up login_datetime values in 15 minute intervals
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT account_id,
login_datetime,
TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) AS diff,
CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins AS diff_round_up,
DATE_ADD(@begin_datetime, Interval CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute) AS login_datetime_15_min_interval
FROM login_logs
ORDER BY login_datetime;
account_id | login_datetime | diff | diff_round_up | login_datetime_15_min_interval |
---|---|---|---|---|
1 | 2022-03-02 02:05:02 | 537606845 | 537606855 | 2022-03-02 02:15:00 |
2 | 2022-03-02 02:09:07 | 537606849 | 537606855 | 2022-03-02 02:15:00 |
3 | 2022-03-02 02:14:07 | 537606854 | 537606855 | 2022-03-02 02:15:00 |
4 | 2022-03-02 02:16:07 | 537606856 | 537606870 | 2022-03-02 02:30:00 |
5 | 2022-03-02 02:27:07 | 537606867 | 537606870 | 2022-03-02 02:30:00 |
6 | 2022-03-02 02:28:07 | 537606868 | 537606870 | 2022-03-02 02:30:00 |
7 | 2022-03-02 02:30:00 | 537606870 | 537606870 | 2022-03-02 02:30:00 |
8 | 2022-03-02 02:36:02 | 537606876 | 537606885 | 2022-03-02 02:45:00 |
9 | 2022-03-02 02:40:07 | 537606880 | 537606885 | 2022-03-02 02:45:00 |
10 | 2022-03-02 02:46:17 | 537606886 | 537606900 | 2022-03-02 03:00:00 |
11 | 2022-03-02 02:54:27 | 537606894 | 537606900 | 2022-03-02 03:00:00 |
12 | 2022-03-02 02:58:10 | 537606898 | 537606900 | 2022-03-02 03:00:00 |
Now you can group accout login data by round up 15 minute intervals.
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT DATE_ADD(@begin_datetime, Interval CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute) AS login_datetime_15_min_interval,
COUNT(DISTINCT (account_id)) AS distinct_account_logins
FROM login_logs
GROUP BY login_datetime_15_min_interval
ORDER BY login_datetime_15_min_interval;
login_datetime_15_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:15:00 | 3 |
2022-03-02 02:30:00 | 4 |
2022-03-02 02:45:00 | 2 |
2022-03-02 03:00:00 | 3 |
Group By 15 Minute Interval - Round Down Date/Time
If the business would like to round down each login_datetime value to 15 minute interval, we can apply the same logics described above and simply substitute CEILING
with FLOOR
function.
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT account_id,
login_datetime,
TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) AS diff,
FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins AS diff_round_down,
DATE_ADD(@begin_datetime, Interval FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute) AS login_datetime_15_min_interval
FROM login_logs
ORDER BY login_datetime;
account_id | login_datetime | diff | diff_round_down | login_datetime_15_min_interval |
---|---|---|---|---|
1 | 2022-03-02 02:05:02 | 537606845 | 537606840 | 2022-03-02 02:00:00 |
2 | 2022-03-02 02:09:07 | 537606849 | 537606840 | 2022-03-02 02:00:00 |
3 | 2022-03-02 02:14:07 | 537606854 | 537606840 | 2022-03-02 02:00:00 |
4 | 2022-03-02 02:16:07 | 537606856 | 537606855 | 2022-03-02 02:15:00 |
5 | 2022-03-02 02:27:07 | 537606867 | 537606855 | 2022-03-02 02:15:00 |
6 | 2022-03-02 02:28:07 | 537606868 | 537606855 | 2022-03-02 02:15:00 |
7 | 2022-03-02 02:30:00 | 537606870 | 537606870 | 2022-03-02 02:30:00 |
8 | 2022-03-02 02:36:02 | 537606876 | 537606870 | 2022-03-02 02:30:00 |
9 | 2022-03-02 02:40:07 | 537606880 | 537606870 | 2022-03-02 02:30:00 |
10 | 2022-03-02 02:46:17 | 537606886 | 537606885 | 2022-03-02 02:45:00 |
11 | 2022-03-02 02:54:27 | 537606894 | 537606885 | 2022-03-02 02:45:00 |
12 | 2022-03-02 02:58:10 | 537606898 | 537606885 | 2022-03-02 02:45:00 |
Now you can group accout login data by round down 15 minute intervals.
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT DATE_ADD(@begin_datetime, Interval FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute) AS login_datetime_15_min_interval,
COUNT(DISTINCT (account_id)) AS distinct_account_logins
FROM login_logs
GROUP BY login_datetime_15_min_interval
ORDER BY login_datetime_15_min_interval;
login_datetime_15_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:00:00 | 3 |
2022-03-02 02:15:00 | 3 |
2022-03-02 02:30:00 | 3 |
2022-03-02 02:45:00 | 3 |
Group By 15 Minute Interval - Round To Nearest
If the business would like to round each login_date value to its nearest 15 minute interval, we could use the following logics to achieve it.
- Use
TIMESTAMPDIFF
function to calculate the minute difference between the login_datetime and MySQL begin_datetime '1000-01-01 00:00:00' - Divide the calculated minute difference by 15
- Invoke
MOD
function to return returns the remainder of the above step result divided by 1. This would give us a number between 0 and 1 which we can use us a nearest indicator - If the nearest_indicator value is greater or equal to 0.5, we would apply
CEILING' function, otherwise apply
FLOOR' function to get the nearest 15 minute interval
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT account_id,
login_datetime,
MOD(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins, 1) AS nearest_indicator,
CASE
WHEN MOD(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins, 1) >= 0.5
THEN DATE_ADD(@begin_datetime, Interval CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
ELSE DATE_ADD(@begin_datetime, Interval FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
END AS login_datetime_15_min_interval
FROM login_logs
ORDER BY login_datetime;
account_id | login_datetime | nearest_indicator | login_datetime_15_min_interval |
---|---|---|---|
1 | 2022-03-02 02:05:02 | 0.3333 | 2022-03-02 02:00:00 |
2 | 2022-03-02 02:09:07 | 0.6000 | 2022-03-02 02:15:00 |
3 | 2022-03-02 02:14:07 | 0.9333 | 2022-03-02 02:15:00 |
4 | 2022-03-02 02:16:07 | 0.0667 | 2022-03-02 02:15:00 |
5 | 2022-03-02 02:27:07 | 0.8000 | 2022-03-02 02:30:00 |
6 | 2022-03-02 02:28:07 | 0.8667 | 2022-03-02 02:30:00 |
7 | 2022-03-02 02:30:00 | 0.0000 | 2022-03-02 02:30:00 |
8 | 2022-03-02 02:36:02 | 0.4000 | 2022-03-02 02:30:00 |
9 | 2022-03-02 02:40:07 | 0.6667 | 2022-03-02 02:45:00 |
10 | 2022-03-02 02:46:17 | 0.0667 | 2022-03-02 02:45:00 |
11 | 2022-03-02 02:54:27 | 0.6000 | 2022-03-02 03:00:00 |
12 | 2022-03-02 02:58:10 | 0.8667 | 2022-03-02 03:00:00 |
Now you can group accout login data by round to nearest 15 minute intervals.
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 15;
SELECT CASE
WHEN MOD(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins, 1) >= 0.5
THEN DATE_ADD(@begin_datetime, Interval CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
ELSE DATE_ADD(@begin_datetime, Interval FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
END AS login_datetime_15_min_interval,
COUNT(DISTINCT (account_id)) AS distinct_account_logins
FROM login_logs
GROUP BY login_datetime_15_min_interval
ORDER BY login_datetime_15_min_interval;
login_datetime_15_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:00:00 | 1 |
2022-03-02 02:15:00 | 3 |
2022-03-02 02:30:00 | 4 |
2022-03-02 02:45:00 | 2 |
2022-03-02 03:00:00 | 2 |
Group By 5, 10, 20, 30 Minute Intervals
You can use any of the code snippets above and simply set @interval_mins
value to 5, 10, 20, or 30 to group your Data/Time to your desired X minute interval.
For example, setting @interval_mins
variable to 5 using the round to nearest technique, you can retrieve aggregated login results in 5 minute interval.
SET @begin_datetime = '1000-01-01 00:00:00';
SET @interval_mins = 5;
SELECT CASE
WHEN MOD(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins, 1) >= 0.5
THEN DATE_ADD(@begin_datetime, Interval CEILING(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
ELSE DATE_ADD(@begin_datetime, Interval FLOOR(TIMESTAMPDIFF(MINUTE, @begin_datetime, login_datetime) / @interval_mins) * @interval_mins minute)
END AS login_datetime_5_min_interval,
COUNT(DISTINCT (account_id)) AS distinct_account_logins
FROM login_logs
GROUP BY login_datetime_5_min_interval
ORDER BY login_datetime_5_min_interval;
login_datetime_5_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:05:00 | 1 |
2022-03-02 02:10:00 | 1 |
2022-03-02 02:15:00 | 2 |
2022-03-02 02:25:00 | 1 |
2022-03-02 02:30:00 | 2 |
2022-03-02 02:35:00 | 1 |
2022-03-02 02:40:00 | 1 |
2022-03-02 02:45:00 | 1 |
2022-03-02 02:55:00 | 1 |
2022-03-02 03:00:00 | 1 |
Same data grouped by 10 minute interval - @interval_mins = 10
login_datetime_10_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:10:00 | 3 |
2022-03-02 02:20:00 | 1 |
2022-03-02 02:30:00 | 3 |
2022-03-02 02:40:00 | 2 |
2022-03-02 02:50:00 | 2 |
2022-03-02 03:00:00 | 1 |
Same data grouped by 20 minute interval - @interval_mins = 20
login_datetime_20_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:00:00 | 2 |
2022-03-02 02:20:00 | 4 |
2022-03-02 02:40:00 | 4 |
2022-03-02 03:00:00 | 2 |
Same data grouped by 30 minute interval - @interval_mins = 30
login_datetime_30_min_interval | distinct_account_logins |
---|---|
2022-03-02 02:00:00 | 3 |
2022-03-02 02:30:00 | 6 |
2022-03-02 03:00:00 | 3 |