Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 3435
Missing cells (%): 24.7%
Duplicate rows: 2040
Duplicate rows (%): 14.7%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
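
The percentages and the memory figure follow directly from the counts listed here. The following is a minimal sketch of that arithmetic, using only numbers copied from this report; the variable names are illustrative and are not taken from the tool that produced the report.

```python
# Counts copied from the dataset statistics in this report.
n_observations = 13880
missing_cells = 3435
duplicate_rows = 2040
record_size_bytes = 16.0  # "Average record size in memory", as reported

# Shares are expressed relative to the total number of observations.
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# Total size = observations x average record size, converted to KiB.
total_kib = n_observations * record_size_bytes / 1024

print(f"Missing cells (%):  {missing_pct:.2f}")
print(f"Duplicate rows (%): {duplicate_pct:.2f}")
print(f"Total size (KiB):   {total_kib:.1f}")
```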

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
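
As a consistency check, the series length should equal the number of calendar days between the starting and ending points, inclusive, when the period is 1 day. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import date

# Starting and ending points copied from the time series statistics.
start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are part of the series, hence the +1.
expected_length = (end - start).days + 1

print(expected_length)
```

If this value matches the reported time series length, the daily index has no missing or repeated timestamps; missing values then correspond to days that are present in the index but have no Flow reading.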

Alerts

Dataset has 2040 (14.7%) duplicate rows (Duplicates)
Flow has 3435 (24.7%) missing values (Missing)
Flow is non stationary (Non stationary)
Flow is seasonal (Seasonal)

Reproduction

Analysis started: 2024-05-12 18:18:11.985445
Analysis finished: 2024-05-12 18:18:13.552609
Duration: 1.57 seconds
MissingQ_Station_NA_28037090_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 3532
Distinct (%): 33.8%
Missing: 3435
Missing (%): 24.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.669406
Minimum: 0.1
Maximum: 299
Zeros: 0
Zeros (%): 0.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 2.2
Q1: 10.4
Median: 39.578
Q3: 97.3
95-th percentile: 178.4
Maximum: 299
Range: 298.9
Interquartile range (IQR): 86.9
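
The last two entries are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short sketch of that derivation, with values copied from the list above and illustrative variable names:

```python
# Quantiles copied from the quantile statistics in this report.
minimum, q1, q3, maximum = 0.1, 10.4, 97.3, 299.0

value_range = maximum - minimum  # spread of the whole sample
iqr = q3 - q1                    # spread of the middle 50% of the sample

print(f"Range: {value_range:.1f}")
print(f"IQR:   {iqr:.1f}")
```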

Descriptive statistics

Standard deviation: 59.580421
Coefficient of variation (CV): 0.9820505
Kurtosis: 0.29555196
Mean: 60.669406
Median Absolute Deviation (MAD): 33.978
Skewness: 1.0588233
Sum: 633691.94
Variance: 3549.8265
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 1.947472341 × 10^-18
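
Several of these figures are tied together by simple identities: the CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum divided by the number of non-missing observations gives the mean. A minimal sketch of those checks, assuming the non-missing count is the number of observations minus the missing count (both taken from this report):

```python
# Values copied from this report.
mean = 60.669406
std = 59.580421
total = 633691.94
n_valid = 13880 - 3435  # observations minus missing values (assumed basis)

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
mean_from_sum = total / n_valid # mean recovered from the sum

print(f"CV:            {cv:.7f}")
print(f"Variance:      {variance:.4f}")
print(f"Mean from sum: {mean_from_sum:.6f}")
```

Small differences in the last digits are expected, because the report prints rounded values.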
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 31
Min: 3 days
Max: 4 years and 6 days
Mean: 15 weeks, 6 days and 14 hours
Std: 41 weeks, 2 days and 9 hours
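
One plausible way to locate gaps in a daily series is to look for runs of consecutive missing values. The sketch below does that on a small made-up list; the helper names `is_missing` and `missing_runs` are hypothetical, the toy values are not taken from the dataset, and the tool that produced this report may define and measure gaps differently.

```python
import math
from datetime import date, timedelta
from itertools import groupby

def is_missing(v):
    """Treat None and float NaN as missing."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def missing_runs(values):
    """Return (start_offset, length) for each run of consecutive missing values."""
    runs, idx = [], 0
    for missing, group in groupby(values, key=is_missing):
        length = sum(1 for _ in group)
        if missing:
            runs.append((idx, length))
        idx += length
    return runs

# Toy daily series starting on 1983-01-01 (illustrative values only).
start = date(1983, 1, 1)
toy = [3.2, 3.1, float("nan"), None, float("nan"), 2.9, None, 2.7]

runs = missing_runs(toy)
for offset, length in runs:
    print(start + timedelta(days=offset), f"{length} day(s) missing")
```

From the list of run lengths, the count, minimum, maximum, mean and standard deviation can then be computed with the `statistics` module.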
Value                  Count   Frequency (%)
3.2                       36   0.3%
3.7                       35   0.3%
2                         32   0.2%
2.3                       31   0.2%
2.1                       31   0.2%
4.7                       27   0.2%
3.5                       27   0.2%
2.2                       27   0.2%
4.6                       26   0.2%
3.4                       26   0.2%
Other values (3522)    10147   73.1%
(Missing)               3435   24.7%
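
A frequency table of this kind can be sketched with `collections.Counter`. In the table above the percentages appear to be relative to all rows, including missing ones (for example, 10147 of 13880 rows is 73.1%), so the sketch uses the same basis. The helper name `value_counts` and the toy data are illustrative and not taken from the dataset.

```python
from collections import Counter

def value_counts(values, top=3):
    """Return (value, count, percent_of_all_rows) for the most common non-missing values."""
    total = len(values)  # missing rows stay in the denominator
    counts = Counter(v for v in values if v is not None)
    return [(v, c, 100 * c / total) for v, c in counts.most_common(top)]

# Toy data (illustrative only); None marks a missing reading.
toy = [3.2, 3.2, 3.7, None, 2.0, 3.2, 3.7, None]

table = value_counts(toy)
for value, count, pct in table:
    print(f"{value:>5}  {count:>3}  {pct:4.1f}%")
```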
Value    Count   Frequency (%)
0.1          2   < 0.1%
0.14         1   < 0.1%
0.175        2   < 0.1%
0.192        1   < 0.1%
0.21         2   < 0.1%
0.227        1   < 0.1%
0.245        4   < 0.1%
0.262        2   < 0.1%
0.28         7   0.1%
0.298        5   < 0.1%
Value    Count   Frequency (%)
299          1   < 0.1%
288.5        2   < 0.1%
283.2        1   < 0.1%
278.6        1   < 0.1%
278          1   < 0.1%
277.1        1   < 0.1%
269.8        1   < 0.1%
268.6        1   < 0.1%
267.5        2   < 0.1%
266.9        1   < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

Date          Flow
1983-01-01    3.2
1983-01-02    3.1
1983-01-03    3.0
1983-01-04    2.9
1983-01-05    2.9
1983-01-06    2.9
1983-01-07    2.7
1983-01-08    2.7
1983-01-09    2.6
1983-01-10    2.5

Date          Flow
2020-12-22    23.047
2020-12-23    21.002
2020-12-24    19.275
2020-12-25    17.389
2020-12-26    16.135
2020-12-27    15.620
2020-12-28    15.131
2020-12-29    14.940
2020-12-30    NaN
2020-12-31    NaN

Duplicate rows

Most frequently occurring

        Flow   # duplicates
2039    NaN    3435
148     3.2    36
167     3.7    35
66      2.0    32
72      2.1    31
91      2.3    31
82      2.2    27
157     3.5    27
199     4.7    27
154     3.4    26