Overview

Dataset statistics

Number of variables: 1
Number of observations: 16237
Missing cells: 5787
Missing cells (%): 35.6%
Duplicate rows: 2088
Duplicate rows (%): 12.9%
Total size in memory: 253.7 KiB
Average record size in memory: 16.0 B
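
The percentages and the average record size above follow directly from the raw counts. The sketch below recomputes them as a consistency check; the variable names and the `percent` helper are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table of this report.
n_variables = 1
n_observations = 16237
missing_cells = 5787
duplicate_rows = 2088
total_size_kib = 253.7


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Missing cells are measured against all cells (rows x variables),
# duplicate rows against the number of rows.
total_cells = n_observations * n_variables
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)

# Average record size in bytes (1 KiB = 1024 B).
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

All three results agree with the reported 35.6%, 12.9%, and 16.0 B.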

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 16237
Starting point: 1977-02-01 00:00:00
Ending point: 2021-07-16 00:00:00
Period: 1 day
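
The reported series length can be checked against the start point, end point, and period. This is a minimal sketch that assumes the series is a regular daily grid with both endpoints included.

```python
from datetime import date, timedelta

# Values copied from the "Timeseries statistics" table.
start = date(1977, 2, 1)
end = date(2021, 7, 16)
period = timedelta(days=1)

# Number of points on a regular grid, counting both endpoints.
expected_length = (end - start) // period + 1

print(expected_length)
```

The result matches the reported length of 16237, so the index has one slot per calendar day and the missing values are empty slots rather than absent rows.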

Alerts

Dataset has 2088 (12.9%) duplicate rows [Duplicates]
Flow has 5787 (35.6%) missing values [Missing]
Flow is non-stationary [Non stationary]
Flow is seasonal [Seasonal]

Reproduction

Analysis started: 2024-05-12 18:16:59.729856
Analysis finished: 2024-05-12 18:17:01.432532
Duration: 1.7 seconds
Missing: Q_Station_NA_23167010_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 4931
Distinct (%): 47.2%
Missing: 5787
Missing (%): 35.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2892.5135
Minimum: 732.6
Maximum: 8562
Zeros: 0
Zeros (%): 0.0%
Memory size: 253.7 KiB

Quantile statistics

Minimum: 732.6
5-th percentile: 1363.36
Q1: 2033
median: 2732
Q3: 3576
95-th percentile: 4995.22
Maximum: 8562
Range: 7829.4
Interquartile range (IQR): 1543
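
Range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them and, as one common use of the IQR that the report itself does not apply, derives Tukey's 1.5 x IQR fences; the variable names are illustrative.

```python
# Quantiles copied from the "Quantile statistics" table.
minimum, q1, q3, maximum = 732.6, 2033.0, 3576.0, 8562.0

value_range = round(maximum - minimum, 1)  # 7829.4
iqr = q3 - q1                              # 1543.0

# Tukey's fences (an assumption added for illustration, not from the report):
# values outside [q1 - 1.5*IQR, q3 + 1.5*IQR] are often treated as outliers.
lower_fence = q1 - 1.5 * iqr               # -281.5
upper_fence = q3 + 1.5 * iqr               # 5890.5

print(value_range, iqr, lower_fence, upper_fence)
print("maximum beyond upper fence:", maximum > upper_fence)
```

Under that rule the reported maximum of 8562 lies above the upper fence, while the minimum of 732.6 lies well inside the lower one.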

Descriptive statistics

Standard deviation: 1108.1123
Coefficient of variation (CV): 0.38309667
Kurtosis: 0.075370999
Mean: 2892.5135
Median Absolute Deviation (MAD): 752
Skewness: 0.66117762
Sum: 30226766
Variance: 1227912.9
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 2.815390935 × 10^-23
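
Several of the descriptive statistics are tied together by simple identities: the mean is the sum divided by the number of non-missing values, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks these relations using the rounded figures printed in this report, so small rounding differences are expected.

```python
# Figures copied from this report (all rounded as displayed).
n_observations = 16237
n_missing = 5787
total = 30226766.0
std = 1108.1123

# Statistics are computed over non-missing values only.
n_valid = n_observations - n_missing   # 10450

mean = total / n_valid                 # close to the reported 2892.5135
cv = std / mean                        # close to the reported 0.38309667
variance = std ** 2                    # close to the reported 1227912.9

print(n_valid, round(mean, 4), round(cv, 6), round(variance, 1))
```

The recomputed values agree with the table to within display rounding, which indicates that the mean, CV, and variance are all taken over the 10450 non-missing observations.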
Figure: Histogram with fixed size bins (bins=50)

Gap statistics

number of gaps: 97
min: 4 days
max: 5 years, 9 weeks and 5 days
mean: 8 weeks, 4 days and 1 hour
std: 30 weeks, 2 days and 4 hours
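
Gap statistics of this kind can be derived from the dates that actually carry a value. The sketch below is one possible approach and not necessarily the profiler's own method: it assumes a gap is any step between consecutive observed dates that exceeds the nominal period, and both the `find_gaps` helper and the toy dates are hypothetical.

```python
from datetime import date, timedelta


def find_gaps(observed_dates, period=timedelta(days=1)):
    """Return (previous, current, step) for every pair of consecutive
    observed dates that are more than one period apart."""
    ordered = sorted(observed_dates)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        step = current - previous
        if step > period:
            gaps.append((previous, current, step))
    return gaps


# Hypothetical toy data: dates with a non-missing Flow value.
observed = [
    date(2000, 1, 1),
    date(2000, 1, 2),
    date(2000, 1, 6),   # 4-day step after Jan 2
    date(2000, 1, 7),
    date(2000, 1, 20),  # 13-day step after Jan 7
]

gaps = find_gaps(observed)
lengths = [step for _, _, step in gaps]

n_gaps = len(gaps)
min_gap = min(lengths)
max_gap = max(lengths)
mean_gap = sum(lengths, timedelta()) / n_gaps

print(n_gaps, min_gap, max_gap, mean_gap)
```

The threshold used by the report is not stated; its minimum gap of 4 days suggests that shorter interruptions are not counted, so a larger `period` tolerance may be needed to reproduce the 97 gaps.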

Common values

Value                  Count   Frequency (%)
1982                      33   0.2%
2445                      23   0.1%
2188                      17   0.1%
3208                      16   0.1%
3010                      14   0.1%
3030                      14   0.1%
3163                      14   0.1%
3265                      14   0.1%
1841                      14   0.1%
1944                      13   0.1%
Other values (4921)    10278   63.3%
(Missing)               5787   35.6%

Minimum 10 values

Value     Count   Frequency (%)
732.6         1   < 0.1%
776.9         2   < 0.1%
793.44        1   < 0.1%
824.59        1   < 0.1%
835.3         1   < 0.1%
846.31        1   < 0.1%
854.3         2   < 0.1%
865.28        1   < 0.1%
873.56        1   < 0.1%
880.2         3   < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
8562          1   < 0.1%
8182          1   < 0.1%
6796          2   < 0.1%
6788          1   < 0.1%
6745          2   < 0.1%
6712          1   < 0.1%
6690          1   < 0.1%
6649          1   < 0.1%
6618          2   < 0.1%
6562          1   < 0.1%

Figure: ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Date          Flow
1977-02-01    1202.0
1977-02-02    1284.0
1977-02-03    1189.0
1977-02-04    1270.0
1977-02-05    1307.0
1977-02-06    1264.0
1977-02-07    1001.0
1977-02-08    1110.0
1977-02-09    1164.0
1977-02-10    1129.0

Last rows

Date          Flow
2021-07-07    2154.6
2021-07-08    1995.3
2021-07-09    1957.8
2021-07-10    1859.7
2021-07-11    2131.8
2021-07-12    2981.7
2021-07-13    2701.1
2021-07-14    2700.9
2021-07-15    2397.8
2021-07-16    2213.0

Duplicate rows

Most frequently occurring

        Flow      # duplicates
2087    NaN       5787
468     1982.0    33
791     2445.0    23
609     2188.0    17
1302    3208.0    16
375     1841.0    14
1184    3010.0    14
1197    3030.0    14
1279    3163.0    14
1342    3265.0    14
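
Because the dataset has a single variable, a row counts as a duplicate whenever its Flow value repeats, which appears to be why this table mirrors the common values and why the missing values (NaN, 5787) top the list. The sketch below shows one way to tally such repeats with the standard library; the `duplicate_counts` helper and the toy values are hypothetical, and grouping all NaNs under one key is an assumption made to match the table.

```python
import math
from collections import Counter


def duplicate_counts(values):
    """Count how often each value occurs and keep those seen more than once.

    NaN never compares equal to itself, so all NaNs are grouped under the
    string key "NaN" to count them together.
    """
    keys = (
        "NaN" if isinstance(v, float) and math.isnan(v) else v
        for v in values
    )
    counts = Counter(keys)
    # most_common() orders the result from most to least frequent.
    return {value: n for value, n in counts.most_common() if n > 1}


# Hypothetical toy column with repeated and missing values.
flows = [1982.0, float("nan"), 2445.0, 1982.0, float("nan"), float("nan"), 1202.0]

dupes = duplicate_counts(flows)
print(dupes)
```

Here the three NaNs and the two occurrences of 1982.0 are reported, while values that occur once are dropped.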