Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2312
Missing cells (%): 16.7%
Duplicate rows: 1096
Duplicate rows (%): 7.9%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
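The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic (the variable names are illustrative and not part of the report; it assumes that, with a single variable, the number of cells equals the number of observations):

```python
# Counts copied from the dataset statistics.
n_observations = 13880
missing_cells = 2312
duplicate_rows = 1096
total_size_kib = 216.9

# One variable, so cells == observations (assumption stated above).
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```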

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
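The series length is consistent with a complete daily index between the starting and ending points, which suggests that the missing values are NaN entries on existing dates rather than absent rows. A quick check using only the standard library (a sketch, not how the report itself derives the length):

```python
from datetime import date

start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are included, hence the +1.
expected_length = (end - start).days + 1

print(expected_length)
```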

Alerts

Dataset has 1096 (7.9%) duplicate rows [Duplicates]
Flow has 2312 (16.7%) missing values [Missing]
Flow has 637 (4.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-12 19:36:45.601679
Analysis finished: 2024-05-12 19:36:47.514333
Duration: 1.91 seconds
MissingQ_Station_NA_25027370_ok_Missing.csv
Download configuration: config.json
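The reported duration is the difference between the two analysis timestamps. A small sketch of that subtraction (the format string is an assumption that matches how the timestamps are printed here):

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-12 19:36:45.601679", fmt)
finished = datetime.strptime("2024-05-12 19:36:47.514333", fmt)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```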

Variables

Flow
Numeric time series

MISSING  ZEROS 

Distinct: 2657
Distinct (%): 23.0%
Missing: 2312
Missing (%): 16.7%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.015109786
Minimum: -243
Maximum: 144
Zeros: 637
Zeros (%): 4.6%
Memory size: 216.9 KiB
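The two percentages appear to use different denominators: the distinct share matches the count of non-missing values, while the zero share matches the count of all observations. This is inferred from the numbers, not stated in the report. A minimal sketch with illustrative variable names:

```python
n_observations = 13880
n_missing = 2312
n_distinct = 2657
n_zeros = 637

# Values that are actually present.
n_valid = n_observations - n_missing

# Distinct (%) reproduces only when divided by the non-missing count.
distinct_pct = 100 * n_distinct / n_valid

# Zeros (%) reproduces when divided by all observations.
zeros_pct = 100 * n_zeros / n_observations

print(n_valid)
print(f"Distinct: {distinct_pct:.1f}%")
print(f"Zeros: {zeros_pct:.1f}%")
```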

Quantile statistics

Minimum: -243
5-th percentile: -12.579
Q1: -3
median: 0
Q3: 3
95-th percentile: 12.8
Maximum: 144
Range: 387
Interquartile range (IQR): 6
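Taking the minimum as -243, Q1 as -3, and Q3 as 3, the range and the interquartile range can be recomputed directly. This is the check that settles where the minus signs belong (a sketch with illustrative names):

```python
minimum, q1, q3, maximum = -243, -3, 3, 144

# Range is the spread between the extremes.
value_range = maximum - minimum

# IQR is the spread of the middle 50% of the data.
iqr = q3 - q1

print(value_range, iqr)
```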

Descriptive statistics

Standard deviation: 9.4414159
Coefficient of variation (CV): -624.85439
Kurtosis: 70.398624
Mean: -0.015109786
Median Absolute Deviation (MAD): 3
Skewness: -1.6902988
Sum: -174.79
Variance: 89.140334
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
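Several of these figures are tied together by simple identities: the mean is the sum divided by the number of non-missing values, the variance is the squared standard deviation, and the CV is the standard deviation divided by the mean. The sketch below recomputes them from the reported values (names are illustrative; the sum is rounded in the report, so agreement is approximate):

```python
n_valid = 13880 - 2312   # non-missing observations
total = -174.79          # reported Sum
std = 9.4414159          # reported standard deviation

mean = total / n_valid
variance = std ** 2
cv = std / mean          # coefficient of variation

print(f"mean     = {mean:.9f}")
print(f"variance = {variance:.6f}")
print(f"CV       = {cv:.3f}")
```

Because the mean is very close to zero, the CV is large in magnitude and negative. It mostly reflects the near-zero denominator, so it says little about relative dispersion here.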
Histogram with fixed size bins (bins=50)

Gap statistics

number of gaps: 36
min: 4 days
max: 2 years and 6 days
mean: 9 weeks, 2 days and 4 hours
std: 17 weeks, 4 days and 3 hours
Value                  Count   Frequency (%)
0                        637   4.6%
-1                       236   1.7%
-2                       235   1.7%
1                        222   1.6%
2                        189   1.4%
-3                       181   1.3%
3                        177   1.3%
4                        138   1.0%
-5                       127   0.9%
-4                       124   0.9%
Other values (2647)     9302   67.0%
(Missing)               2312   16.7%
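The "Other values" row can be recovered from the rest of the table: its count is whatever remains after removing the ten listed values and the missing entries, and its number of distinct values is the distinct total minus ten. A short consistency check (illustrative names; frequencies are taken relative to all 13880 observations, which matches the listed percentages):

```python
top10_counts = [637, 236, 235, 222, 189, 181, 177, 138, 127, 124]
n_observations = 13880
n_missing = 2312
n_distinct = 2657

# Observations not covered by the ten most common values or by NaN.
other_count = n_observations - n_missing - sum(top10_counts)

# Distinct values not shown individually.
other_distinct = n_distinct - len(top10_counts)

other_pct = 100 * other_count / n_observations

print(other_distinct, other_count, f"{other_pct:.1f}%")
```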

Value     Count   Frequency (%)
-243          1   < 0.1%
-184.8        1   < 0.1%
-120          1   < 0.1%
-107.5        1   < 0.1%
-85.7         1   < 0.1%
-73.4         1   < 0.1%
-64.3         1   < 0.1%
-62.6         1   < 0.1%
-62           1   < 0.1%
-58.4         1   < 0.1%

Value     Count   Frequency (%)
144           1   < 0.1%
116.9         1   < 0.1%
112           1   < 0.1%
92.9          1   < 0.1%
83.7          1   < 0.1%
82.4          1   < 0.1%
71.7          1   < 0.1%
69.6          1   < 0.1%
69            1   < 0.1%
65.1          1   < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

Date        Flow
1983-01-01  NaN
1983-01-02  NaN
1983-01-03  2.5
1983-01-04  -1.3
1983-01-05  1.5
1983-01-06  -2.4
1983-01-07  6.0
1983-01-08  7.2
1983-01-09  -13.2
1983-01-10  2.4

Date        Flow
2020-12-22  -4.21
2020-12-23  7.90
2020-12-24  5.46
2020-12-25  1.35
2020-12-26  0.60
2020-12-27  -6.76
2020-12-28  -7.53
2020-12-29  8.89
2020-12-30  1.83
2020-12-31  5.04

Duplicate rows

Most frequently occurring

      Flow  # duplicates
1095   NaN          2312
542    0.0           637
477   -1.0           236
417   -2.0           235
606    1.0           222
666    2.0           189
360   -3.0           181
719    3.0           177
761    4.0           138
275   -5.0           127