Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 1538
Missing cells (%): 11.1%
Duplicate rows: 1411
Duplicate rows (%): 10.2%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
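
The percentages and the average record size follow directly from the counts. A minimal Python sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics of this report.
n_observations = 13880
missing_cells = 1538
duplicate_rows = 1411
total_size_kib = 216.9

# Share of missing cells and duplicate rows, as percentages.
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, "
      f"{avg_record_bytes:.1f} B per record")
```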

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
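
The reported series length can be cross-checked against the starting and ending points. A small sketch, assuming one observation per calendar day with both endpoints included:

```python
from datetime import date

start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are part of the series, hence the +1.
expected_length = (end - start).days + 1
print(expected_length)  # 13880
```

The result matches the reported length, which suggests that days without a measurement are kept as rows with a missing value rather than dropped.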

Alerts

Dataset has 1411 (10.2%) duplicate rows [Duplicates]
Flow has 1538 (11.1%) missing values [Missing]
Flow is non stationary [Non stationary]
Flow is seasonal [Seasonal]

Reproduction

Analysis started: 2024-05-12 18:18:04.090302
Analysis finished: 2024-05-12 18:18:05.948218
Duration: 1.86 seconds
MissingQ_Station_NA_28037030_ok_Missing.csv
Download configuration: config.json
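
The duration is the difference between the two analysis timestamps. A short sketch of that calculation, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-12 18:18:04.090302", fmt)
finished = datetime.strptime("2024-05-12 18:18:05.948218", fmt)

# Elapsed wall-clock time of the analysis, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 1.86 seconds
```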

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 3972
Distinct (%): 32.2%
Missing: 1538
Missing (%): 11.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.272723
Minimum: 0.3
Maximum: 435.6
Zeros: 0
Zeros (%): 0.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 0.3
5-th percentile: 2.304
Q1: 5.7
Median: 15.1295
Q3: 37.5
95-th percentile: 127.3985
Maximum: 435.6
Range: 435.3
Interquartile range (IQR): 31.8
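
The report does not state which quantile method it uses. As an illustration only, the sketch below computes quantiles with linear interpolation between the two nearest ranks (the default in NumPy and pandas) on the first ten Flow values listed in the Sample section, and re-derives the reported range and IQR. The function name `quantile` is a hypothetical helper.

```python
import math

def quantile(values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

# First ten Flow values from the Sample section of this report.
sample = [6.6, 6.6, 6.6, 6.3, 6.4, 6.5, 6.5, 6.8, 6.7, 5.0]
q1, median, q3 = (quantile(sample, q) for q in (0.25, 0.5, 0.75))
sample_iqr = q3 - q1

# Range and IQR re-derived from the full-series figures in this report.
reported_range = 435.6 - 0.3  # Maximum - Minimum
reported_iqr = 37.5 - 5.7     # Q3 - Q1
```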

Descriptive statistics

Standard deviation: 46.054609
Coefficient of variation (CV): 1.4270444
Kurtosis: 11.77997
Mean: 32.272723
Median Absolute Deviation (MAD): 11.2075
Skewness: 3.034682
Sum: 398309.95
Variance: 2121.027
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 1.638485994 × 10^-20
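
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of non-missing observations. A minimal consistency check using the reported figures (variable names are illustrative):

```python
# Figures copied from this report.
mean = 32.272723
std = 46.054609
total = 398309.95
n_valid = 13880 - 1538  # observations minus missing values

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
mean_from_sum = total / n_valid # mean recomputed from the sum

print(f"CV={cv:.4f}, variance={variance:.3f}, mean={mean_from_sum:.4f}")
```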
Histogram with fixed size bins (bins=50)

Gap statistics

number of gaps: 27
min: 3 days
max: 2 years and 4 days
mean: 8 weeks, 1 day and 8 hours
std: 20 weeks, 1 day and 22 hours
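
The report does not say how a gap is defined (for example, whether its length counts the missing days or the distance between the surrounding timestamps). One plausible reading, sketched below on a hypothetical toy series, treats any jump between consecutive observed dates that exceeds the period as a gap. `find_gaps` is an illustrative helper, not the profiler's own function.

```python
from datetime import date, timedelta

def find_gaps(observed_dates, period=timedelta(days=1)):
    """Return (previous, current, step) for every jump longer than one period."""
    ordered = sorted(observed_dates)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        step = current - previous
        if step > period:
            gaps.append((previous, current, step))
    return gaps

# Hypothetical toy series: daily dates with 4-6 January missing.
observed = [date(1983, 1, d) for d in (1, 2, 3, 7, 8)]
gaps = find_gaps(observed)
```

Summary figures such as min, max, mean and std would then be computed over the `step` values of the detected gaps.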
Common values

Value                  Count   Frequency (%)
2.444                     86    0.6%
2.867                     76    0.5%
3.148                     64    0.5%
2.726                     61    0.4%
2.585                     52    0.4%
2                         49    0.4%
2.304                     49    0.4%
1.5                       48    0.3%
4                         48    0.3%
4.6                       47    0.3%
Other values (3962)    11762   84.7%
(Missing)               1538   11.1%

Minimum 10 values

Value     Count   Frequency (%)
0.3          10     0.1%
0.4           1   < 0.1%
0.5           2   < 0.1%
0.6           1   < 0.1%
0.7           1   < 0.1%
0.8           1   < 0.1%
1             1   < 0.1%
1.0167        1   < 0.1%
1.0333        2   < 0.1%
1.0556        1   < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
435.6         1   < 0.1%
434.1         1   < 0.1%
423.7         1   < 0.1%
394.7         1   < 0.1%
379.5         1   < 0.1%
373.5         1   < 0.1%
369.9         1   < 0.1%
368.8         1   < 0.1%
365.2         1   < 0.1%
360.8         1   < 0.1%

ACF and PACF
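
The ACF plot shows the correlation of the series with lagged copies of itself. As a rough illustration of the quantity being plotted, the sketch below computes the sample autocorrelation with the common biased estimator on a hypothetical toy series. It ignores missing values and does not cover the PACF; `autocorrelation` is an illustrative helper.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator

# Hypothetical toy series, not taken from the Flow data.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = [autocorrelation(toy, k) for k in range(3)]
```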

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Date         Flow
1983-01-01   6.6
1983-01-02   6.6
1983-01-03   6.6
1983-01-04   6.3
1983-01-05   6.4
1983-01-06   6.5
1983-01-07   6.5
1983-01-08   6.8
1983-01-09   6.7
1983-01-10   5.0

Last rows

Date         Flow
2020-12-22   25.9540
2020-12-23   26.2340
2020-12-24   24.2070
2020-12-25   22.6220
2020-12-26   18.9260
2020-12-27   16.6780
2020-12-28   12.9970
2020-12-29   10.5280
2020-12-30   9.6073
2020-12-31   7.8895

Duplicate rows

Most frequently occurring

Index   Flow    # duplicates
1410    NaN     1538
49      2.444   86
63      2.867   76
77      3.148   64
58      2.726   61
53      2.585   52
33      2.000   49
44      2.304   49
19      1.500   48
111     4.000   48
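
Because the dataset has a single column, every repeated Flow value, including a repeated missing value, counts as a duplicate row, which is why NaN heads the list. A minimal sketch of such a count on a hypothetical toy column follows. `None` stands in for a missing value, since separate `float("nan")` objects do not compare equal and would not be grouped by a plain `Counter`.

```python
from collections import Counter

# Hypothetical toy column; None stands in for a missing value.
flow = [2.444, 2.444, None, 3.148, None, 2.444, 1.5]

counts = Counter(flow)
# Keep only values that occur more than once, most frequent first.
duplicates = [(value, n) for value, n in counts.most_common() if n > 1]
print(duplicates)  # [(2.444, 3), (None, 2)]
```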