Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 1283
Missing cells (%): 9.2%
Duplicate rows: 2473
Duplicate rows (%): 17.8%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
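
The percentages and the memory figure above follow directly from the raw counts. The short check below recomputes them from the numbers reported in this section only; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_observations = 13880
missing_cells = 1283
duplicate_rows = 2473
record_size_bytes = 16.0  # average record size in memory

# Shares are expressed relative to the number of observations.
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# 1 KiB = 1024 bytes.
total_kib = n_observations * record_size_bytes / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```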

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
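
The series length is consistent with a gap-free daily index between the two endpoints. A minimal check, assuming both the starting and the ending day are included in the count:

```python
from datetime import date

start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are part of the series, hence the +1.
expected_length = (end - start).days + 1
print(expected_length)
```

The result matches the reported length of 13880, so missing observations appear as NaN rows rather than as absent dates.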

Alerts

Dataset has 2473 (17.8%) duplicate rows (Duplicates)
Flow has 1283 (9.2%) missing values (Missing)
Flow is non stationary (Non stationary)
Flow is seasonal (Seasonal)

Reproduction

Analysis started: 2024-05-12 18:18:43.779473
Analysis finished: 2024-05-12 18:18:45.726078
Duration: 1.95 seconds
MissingQ_Station_NA_25027680_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 5345
Distinct (%): 42.4%
Missing: 1283
Missing (%): 9.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5373.0043
Minimum: 997.38
Maximum: 11127
Zeros: 0
Zeros (%): 0.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 997.38
5-th percentile: 2334.2
Q1: 4160.8
Median: 5308.3
Q3: 6614
95-th percentile: 8346.8
Maximum: 11127
Range: 10129.62
Interquartile range (IQR): 2453.2

Descriptive statistics

Standard deviation: 1810.3204
Coefficient of variation (CV): 0.3369289
Kurtosis: -0.15153936
Mean: 5373.0043
Median Absolute Deviation (MAD): 1228.3
Skewness: 0.15155502
Sum: 67683735
Variance: 3277260.1
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 8.617728573 × 10^-17
[Figure: Histogram with fixed size bins (bins=50)]

Gap statistics

Number of gaps: 29
Min: 5 days
Max: 32 weeks and 3 days
Mean: 6 weeks, 1 day and 1 hour
Std: 6 weeks, 2 days and 59 minutes
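
One way to obtain gap statistics like these is to scan the daily series for runs of consecutive missing values. The sketch below is only an illustration on made-up data: `find_gaps` is a hypothetical helper, and the tool that produced this report may define a gap differently (for example with a minimum gap length), so it is not expected to reproduce the 29 gaps reported above.

```python
from datetime import date, timedelta


def find_gaps(observations):
    """Return (first_missing_day, n_missing_days) for each run of missing values.

    `observations` is a list of (day, value) pairs at a daily step,
    where a missing value is represented by None.
    """
    gaps = []
    run_start, run_length = None, 0
    for day, value in observations:
        if value is None:
            if run_length == 0:
                run_start = day  # a new run of missing values begins
            run_length += 1
        elif run_length:
            gaps.append((run_start, run_length))  # the run just ended
            run_length = 0
    if run_length:  # the series ends inside a gap
        gaps.append((run_start, run_length))
    return gaps


# Toy daily series (made-up values, not taken from the report).
start = date(2020, 1, 1)
values = [None, None, 5.0, 5.1, None, None, None, 4.9, 4.8, None]
series = [(start + timedelta(days=i), v) for i, v in enumerate(values)]

gaps = find_gaps(series)
lengths = [n for _, n in gaps]
mean_length = sum(lengths) / len(lengths)
print(len(gaps), min(lengths), max(lengths), mean_length)
```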
Value                Count   Frequency (%)
8040                 20      0.1%
5520                 19      0.1%
4662                 16      0.1%
7335                 16      0.1%
7350                 15      0.1%
4603                 15      0.1%
5638                 15      0.1%
6607                 15      0.1%
4310                 15      0.1%
4116                 14      0.1%
Other values (5335)  12437   89.6%
(Missing)            1283    9.2%

Value                Count   Frequency (%)
997.38               1       < 0.1%
1002.9               1       < 0.1%
1009.4               1       < 0.1%
1011.4               2       < 0.1%
1012.4               1       < 0.1%
1020.7               1       < 0.1%
1021.4               1       < 0.1%
1063.3               1       < 0.1%
1071.9               1       < 0.1%
1078.1               1       < 0.1%

Value                Count   Frequency (%)
11127                1       < 0.1%
11108                1       < 0.1%
11099                1       < 0.1%
11090                2       < 0.1%
11081                1       < 0.1%
11072                1       < 0.1%
11053                1       < 0.1%
11044                1       < 0.1%
11026                1       < 0.1%
11017                5       < 0.1%
[Figure: ACF and PACF]

Interactions


Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Date        Flow
1983-01-01  NaN
1983-01-02  NaN
1983-01-03  NaN
1983-01-04  NaN
1983-01-05  NaN
1983-01-06  NaN
1983-01-07  NaN
1983-01-08  NaN
1983-01-09  NaN
1983-01-10  NaN

Date        Flow
2020-12-22  7383.0
2020-12-23  7332.3
2020-12-24  7280.6
2020-12-25  7210.8
2020-12-26  7157.3
2020-12-27  7093.9
2020-12-28  7048.6
2020-12-29  6983.3
2020-12-30  6905.5
2020-12-31  6849.5

Duplicate rows

Most frequently occurring

      Flow    # duplicates
2472  NaN     1283
2275  8040.0  20
1342  5520.0  19
879   4662.0  16
2110  7335.0  16
719   4310.0  15
847   4603.0  15
1396  5638.0  15
1846  6607.0  15
2113  7350.0  15
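
Because the dataset has a single variable, a row counts as a duplicate whenever its Flow value (including a missing one) occurs more than once. A table like the one above could be produced roughly as follows; `most_frequent_duplicates` is a hypothetical helper, the sample values are made up, and the report's own procedure may differ.

```python
from collections import Counter


def most_frequent_duplicates(values, top=10):
    """Return (value, count) pairs for values seen more than once, most frequent first."""
    counts = Counter(values)
    duplicated = [(value, n) for value, n in counts.items() if n > 1]
    # sorted() is stable, so ties keep their order of first appearance.
    return sorted(duplicated, key=lambda item: item[1], reverse=True)[:top]


# Toy sample. None stands in for NaN, because float("nan") never compares
# equal to itself and separate NaN objects would not be grouped together.
flows = [8040.0, None, 5520.0, 8040.0, None, None, 4662.0, 5520.0, 8040.0]

result = most_frequent_duplicates(flows)
for value, n in result:
    print(value, n)
```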