Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2545
Missing cells (%): 18.3%
Duplicate rows: 1527
Duplicate rows (%): 11.0%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
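
The percentages and the average record size follow directly from the raw counts. A minimal arithmetic check, using only the figures reported in this table (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_observations = 13880
n_variables = 1
missing_cells = 2545
duplicate_rows = 1527
total_memory_kib = 216.9

n_cells = n_observations * n_variables

# Shares are expressed relative to all cells / all rows.
missing_pct = 100 * missing_cells / n_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")              # 18.3%
print(f"Duplicate rows: {duplicate_pct:.1f}%")           # 11.0%
print(f"Average record size: {avg_record_bytes:.1f} B")  # 16.0 B
```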

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
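
The reported length matches a complete daily calendar between the two endpoints, which suggests the missing values are empty entries on existing dates rather than absent dates. A short check with the standard library:

```python
from datetime import date

start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are part of the series, hence the +1.
expected_length = (end - start).days + 1

print(expected_length)  # 13880
```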

Alerts

Dataset has 1527 (11.0%) duplicate rows (Duplicates)
Flow has 2545 (18.3%) missing values (Missing)

Reproduction

Analysis started: 2024-05-12 19:35:41.655665
Analysis finished: 2024-05-12 19:35:43.381604
Duration: 1.73 seconds
MissingQ_Station_NA_28047050_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING 

Distinct: 8348
Distinct (%): 73.6%
Missing: 2545
Missing (%): 18.3%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.00054143979
Minimum: -69
Maximum: 63.8
Zeros: 79
Zeros (%): 0.6%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -69
5-th percentile: -7.573
Q1: -0.93
Median: 2.7755576 × 10^-17
Q3: 0.979
95-th percentile: 7.513
Maximum: 63.8
Range: 132.8
Interquartile range (IQR): 1.909

Descriptive statistics

Standard deviation: 5.8203853
Coefficient of variation (CV): -10749.829
Kurtosis: 25.947656
Mean: -0.00054143979
Median Absolute Deviation (MAD): 0.941
Skewness: -0.16190039
Sum: -6.13722
Variance: 33.876885
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
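
Several of these figures are derived from others in the report. The sketch below recomputes them from the reported values only; it also shows why the coefficient of variation is so large in magnitude: it is the standard deviation divided by a mean that is almost zero, so it carries little information for this series. Variable names are illustrative.

```python
# Values copied from the Flow statistics in this report.
mean = -0.00054143979
std = 5.8203853
minimum, maximum = -69.0, 63.8
q1, q3 = -0.93, 0.979
n_valid = 13880 - 2545  # observations minus missing values

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
variance = std ** 2              # Variance
cv = std / mean                  # Coefficient of variation
total = mean * n_valid           # Sum of the non-missing values

print(f"Range: {value_range:.1f}")
print(f"IQR: {iqr:.3f}")
print(f"Variance: {variance:.3f}")
print(f"CV: {cv:.1f}")
print(f"Sum: {total:.3f}")
```

Each result agrees with the reported value up to the rounding of the inputs.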
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 50
Min: 5 days
Max: 2 years and 1 week
Mean: 7 weeks, 2 days and 20 hours
Std: 21 weeks, 4 days and 7 hours
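
The report does not state how a gap is defined. As a rough sketch, assuming a gap is a maximal run of consecutive days with no value, gaps in a daily series could be located as follows. The function name `find_gaps` and the toy data are hypothetical, and the profiling tool may use a different definition or a minimum-length threshold.

```python
from datetime import date, timedelta

def find_gaps(dates, values):
    """Return (start, end, length_in_days) for each maximal run of missing values.

    Assumes `dates` is a gap-free daily calendar and missing values are None.
    """
    gaps = []
    run_start = None
    prev_day = None
    for day, value in zip(dates, values):
        if value is None:
            if run_start is None:
                run_start = day  # a new run of missing days begins
        elif run_start is not None:
            # The run ended on the previous day.
            gaps.append((run_start, prev_day, (prev_day - run_start).days + 1))
            run_start = None
        prev_day = day
    if run_start is not None:  # the series ends inside a gap
        gaps.append((run_start, prev_day, (prev_day - run_start).days + 1))
    return gaps

# Toy data (not the station record): ten days containing two gaps.
start = date(1983, 1, 1)
dates = [start + timedelta(days=i) for i in range(10)]
values = [None, None, None, 0.00, -0.03, None, None, 1.01, -1.26, 0.61]

gaps = find_gaps(dates, values)
for gap_start, gap_end, length in gaps:
    print(gap_start, gap_end, length)
```

Statistics such as the minimum, maximum, and mean gap length can then be taken over the third element of each tuple.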

Common values

Value                    Count   Frequency (%)
0                        79      0.6%
-0.08                    23      0.2%
0.08                     21      0.2%
-0.09                    20      0.1%
-0.02                    16      0.1%
-0.12                    14      0.1%
-0.01                    13      0.1%
-0.04                    13      0.1%
-4.440892099 × 10^-16    13      0.1%
0.02                     12      0.1%
Other values (8338)      11111   80.1%
(Missing)                2545    18.3%

Minimum 10 values

Value     Count   Frequency (%)
-69       1       < 0.1%
-63       1       < 0.1%
-62.86    1       < 0.1%
-58.47    1       < 0.1%
-54.34    1       < 0.1%
-54.2     1       < 0.1%
-53.8     1       < 0.1%
-52.07    1       < 0.1%
-52       1       < 0.1%
-50.38    1       < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
63.8      1       < 0.1%
61.55     1       < 0.1%
59.41     1       < 0.1%
58.02     1       < 0.1%
56.63     1       < 0.1%
52.1      1       < 0.1%
51.97     1       < 0.1%
51.47     1       < 0.1%
50.8      1       < 0.1%
47.67     1       < 0.1%

ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Date          Flow
1983-01-01    NaN
1983-01-02    NaN
1983-01-03    NaN
1983-01-04    0.00
1983-01-05    -0.03
1983-01-06    -0.11
1983-01-07    1.01
1983-01-08    -1.26
1983-01-09    0.61
1983-01-10    0.07

Last rows

Date          Flow
2020-12-22    -0.2046
2020-12-23    0.3213
2020-12-24    -0.3444
2020-12-25    0.2374
2020-12-26    -0.0990
2020-12-27    -0.1140
2020-12-28    0.3110
2020-12-29    -0.2659
2020-12-30    0.1113
2020-12-31    0.1418

Duplicate rows

Most frequently occurring

        Flow             # duplicates
1526    NaN              2545
761     0.000000e+00     79
669     -8.000000e-02    23
854     8.000000e-02     21
663     -9.000000e-02    20
734     -2.000000e-02    16
631     -1.200000e-01    14
708     -4.000000e-02    13
749     -1.000000e-02    13
758     -4.440892e-16    13
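
Because the dataset has a single variable, a duplicate row here is simply a repeated Flow value, which is why NaN heads the table with a count equal to the number of missing values. A minimal sketch of such a count with the standard library follows; the function name `duplicate_counts` and the toy list are hypothetical, and this is not necessarily how the profiling tool computes its table.

```python
import math
from collections import Counter

def duplicate_counts(values):
    """Return (value, count) pairs for values seen more than once, most frequent first."""
    # NaN never compares equal to itself, so map every NaN to None
    # to make missing values fall into one group.
    keys = [None if isinstance(v, float) and math.isnan(v) else v for v in values]
    counts = Counter(keys)
    return [(value, n) for value, n in counts.most_common() if n > 1]

# Toy data (not the station record).
flow = [float("nan"), float("nan"), float("nan"), 0.0, -0.08, 0.0, 0.08, -0.08, 0.0, 1.01]

result = duplicate_counts(flow)
print(result)  # [(None, 3), (0.0, 3), (-0.08, 2)]
```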