Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2368
Missing cells (%): 17.1%
Duplicate rows: 2056
Duplicate rows (%): 14.8%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
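
The percentages and the average record size follow from the raw counts in this table. A minimal check, assuming the shares are taken relative to the number of observations and that 1 KiB is 1024 bytes (neither is stated in the report):

```python
# Counts as reported in the dataset statistics.
n_obs = 13880
missing_cells = 2368
duplicate_rows = 2056
total_kib = 216.9

# Assumption: both shares are relative to the number of observations.
missing_pct = 100 * missing_cells / n_obs
duplicate_pct = 100 * duplicate_rows / n_obs

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, "
      f"{avg_record_bytes:.1f} B per record")
```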

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
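
The series length can be cross-checked against the start date, end date and period. A short sketch using only the values reported here:

```python
from datetime import date, timedelta

start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Both endpoints are part of the series, hence the + 1.
expected_length = (end - start) // period + 1
print(expected_length)  # 13880
```

The result equals the reported length, which suggests that the 2368 missing values sit on dates that are present in the index, not on dates that are absent from it.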

Alerts

Dataset has 2056 (14.8%) duplicate rows [Duplicates]
Flow has 2368 (17.1%) missing values [Missing]
Flow is non stationary [Non stationary]
Flow is seasonal [Seasonal]

Reproduction

Analysis started: 2024-05-12 18:18:19.553262
Analysis finished: 2024-05-12 18:18:21.056556
Duration: 1.5 seconds
MissingQ_Station_NA_28047050_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 4559
Distinct (%): 39.6%
Missing: 2368
Missing (%): 17.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.432012
Minimum: 0
Maximum: 115.9
Zeros: 22
Zeros (%): 0.2%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.417
Q1: 2.24
Median: 7.464
Q3: 18.8325
95-th percentile: 48.0115
Maximum: 115.9
Range: 115.9
Interquartile range (IQR): 16.5925
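
In this table the IQR is Q3 minus Q1 (18.8325 - 2.24 = 16.5925) and the range is the maximum minus the minimum. The sketch below applies the same definitions to the ten Flow values for 1983-01-01 to 1983-01-10 listed in the Sample section. That the report interpolates linearly between order statistics, as `method="inclusive"` does, is an assumption.

```python
from statistics import quantiles

# Flow values for 1983-01-01 to 1983-01-10, copied from the Sample section.
flow = [4.87, 5.00, 5.00, 4.87, 4.58, 4.02, 4.20, 3.86, 3.61, 3.52]

# method="inclusive" interpolates linearly between sorted values;
# whether the report uses the same rule is an assumption.
q1, median, q3 = quantiles(flow, n=4, method="inclusive")

iqr = q3 - q1
value_range = max(flow) - min(flow)
print(q1, median, q3, iqr, value_range)
```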

Descriptive statistics

Standard deviation: 15.484489
Coefficient of variation (CV): 1.1528048
Kurtosis: 3.5431967
Mean: 13.432012
Median Absolute Deviation (MAD): 6.014
Skewness: 1.7911589
Sum: 154629.33
Variance: 239.76938
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 1.46416046 × 10^-15
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 48
Min: 3 days
Max: 2 years and 4 days
Mean: 7 weeks, 1 day and 1 hour
Std: 21 weeks, 6 days and 18 hours
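
The report does not say how a gap is defined. One plausible reading, sketched below, measures the time between consecutive non-missing observations and counts any step longer than the period as a gap. The function name `find_gaps` and the toy series are hypothetical and are not taken from the dataset or from the profiling tool.

```python
from datetime import date, timedelta

def find_gaps(observations, period=timedelta(days=1)):
    """Return (last_valid_date, next_valid_date, duration) for each gap.

    `observations` is a list of (date, value) pairs sorted by date, with
    missing values represented by None.
    """
    valid_dates = [d for d, v in observations if v is not None]
    gaps = []
    for prev, curr in zip(valid_dates, valid_dates[1:]):
        step = curr - prev
        if step > period:  # longer than one period: values are missing
            gaps.append((prev, curr, step))
    return gaps

# Hypothetical toy series: values missing on Jan 3-4 and Jan 7-9.
start = date(1983, 1, 1)
values = [4.87, 5.00, None, None, 4.58, 4.02, None, None, None, 3.52]
observations = [(start + timedelta(days=i), v) for i, v in enumerate(values)]

gaps = find_gaps(observations)
durations = [duration for _, _, duration in gaps]
print(len(gaps), min(durations), max(durations))
```

Under this definition, two consecutive missing days produce a gap of 3 days, which would be consistent with the reported minimum.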

Value                  Count    Frequency (%)
1.67                   43       0.3%
0.3                    34       0.2%
0.83                   32       0.2%
1.27                   30       0.2%
1.7                    29       0.2%
1.95                   29       0.2%
1.32                   28       0.2%
0.38                   27       0.2%
2.12                   27       0.2%
1.78                   25       0.2%
Other values (4549)    11208    80.7%
(Missing)              2368     17.1%

Value    Count    Frequency (%)
0        22       0.2%
0.01     1        < 0.1%
0.019    2        < 0.1%
0.02     2        < 0.1%
0.03     4        < 0.1%
0.038    9        0.1%
0.04     1        < 0.1%
0.05     2        < 0.1%
0.057    1        < 0.1%
0.06     6        < 0.1%

Value    Count    Frequency (%)
115.9    1        < 0.1%
114.7    1        < 0.1%
114.2    1        < 0.1%
110.3    1        < 0.1%
109.8    1        < 0.1%
106.1    1        < 0.1%
104.4    1        < 0.1%
102      1        < 0.1%
101.9    1        < 0.1%
99.4     1        < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Date          Flow
1983-01-01    4.87
1983-01-02    5.00
1983-01-03    5.00
1983-01-04    4.87
1983-01-05    4.58
1983-01-06    4.02
1983-01-07    4.20
1983-01-08    3.86
1983-01-09    3.61
1983-01-10    3.52

Date          Flow
2020-12-22    5.9266
2020-12-23    5.5249
2020-12-24    4.9864
2020-12-25    4.5485
2020-12-26    4.1122
2020-12-27    3.5635
2020-12-28    3.2134
2020-12-29    2.7960
2020-12-30    2.4226
2020-12-31    2.2350

Duplicate rows

Most frequently occurring

Row     Flow    # duplicates
2055    NaN     2368
226     1.67    43
40      0.30    34
133     0.83    32
172     1.27    30
229     1.70    29
257     1.95    29
177     1.32    28
58      0.38    27
279     2.12    27
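
The highest row index in this table (2055) is consistent with the reported 2056 duplicate rows being a count of distinct values that occur more than once, with all missing entries pooled into one group. That reading is an inference from the table, not something the report states. A minimal sketch of such a count, using a hypothetical helper `duplicate_groups` and a toy column that is not taken from the dataset:

```python
import math
from collections import Counter

def duplicate_groups(values):
    """Map each value that occurs more than once to its number of occurrences.

    NaN entries are pooled under a single None key, because float("nan")
    never compares equal to itself and would otherwise never be grouped.
    """
    keys = [None if isinstance(v, float) and math.isnan(v) else v
            for v in values]
    counts = Counter(keys)
    return {key: count for key, count in counts.items() if count > 1}

# Hypothetical toy column, not taken from the dataset.
flow = [1.67, 0.30, 1.67, float("nan"), 2.12, float("nan"), 1.67, 0.30]

groups = duplicate_groups(flow)
# List the groups from most to least frequent, as in the table above.
for value, count in sorted(groups.items(), key=lambda item: -item[1]):
    print(value, count)
print(len(groups))  # number of distinct repeated values
```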