Dataset statistics
| Number of variables | 1 |
|---|---|
| Number of observations | 16237 |
| Missing cells | 5787 |
| Missing cells (%) | 35.6% |
| Duplicate rows | 2088 |
| Duplicate rows (%) | 12.9% |
| Total size in memory | 253.7 KiB |
| Average record size in memory | 16.0 B |
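The percentages in this table follow from the counts. A minimal check, assuming both shares are taken relative to the number of observations (with a single variable, rows and cells coincide); the variable names are illustrative only:

```python
# Counts copied from the dataset statistics table.
n_observations = 16237
missing_cells = 5787
duplicate_rows = 2088

# Assumption of this sketch: with one variable, each row holds exactly one
# cell, so both shares use the observation count as the denominator.
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations
non_missing = n_observations - missing_cells

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Non-missing observations: {non_missing}")
```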
Variable types
| TimeSeries | 1 |
|---|---|
Timeseries statistics
| Number of series | 1 |
|---|---|
| Time series length | 16237 |
| Starting point | 1977-02-01 00:00:00 |
| Ending point | 2021-07-16 00:00:00 |
| Period | 1 day |
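The series length is consistent with the start point, end point, and period. A short sketch of that check, assuming both endpoints are included in the count:

```python
from datetime import date, timedelta

# Values copied from the time series statistics table.
start = date(1977, 2, 1)
end = date(2021, 7, 16)
period = timedelta(days=1)

# Both endpoints are counted, hence the +1.
expected_length = (end - start) // period + 1
print(expected_length)  # 16237
```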
Alerts
| Dataset has 2088 (12.9%) duplicate rows | Duplicates |
|---|---|
| Flow has 5787 (35.6%) missing values | Missing |
| Flow is non stationary | Non stationary |
| Flow is seasonal | Seasonal |
Reproduction
| Analysis started | 2024-05-12 18:16:59.729856 |
|---|---|
| Analysis finished | 2024-05-12 18:17:01.432532 |
| Duration | 1.7 seconds |
| Missing | Q_Station_NA_23167010_ok_Missing.csv |
Flow
Numeric time series
Alerts: Missing, Non stationary, Seasonal
| Distinct | 4931 |
|---|---|
| Distinct (%) | 47.2% |
| Missing | 5787 |
| Missing (%) | 35.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2892.5135 |
| Minimum | 732.6 |
| Maximum | 8562 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 253.7 KiB |
Quantile statistics
| Minimum | 732.6 |
|---|---|
| 5-th percentile | 1363.36 |
| Q1 | 2033 |
| median | 2732 |
| Q3 | 3576 |
| 95-th percentile | 4995.22 |
| Maximum | 8562 |
| Range | 7829.4 |
| Interquartile range (IQR) | 1543 |
Descriptive statistics
| Standard deviation | 1108.1123 |
|---|---|
| Coefficient of variation (CV) | 0.38309667 |
| Kurtosis | 0.075370999 |
| Mean | 2892.5135 |
| Median Absolute Deviation (MAD) | 752 |
| Skewness | 0.66117762 |
| Sum | 30226766 |
| Variance | 1227912.9 |
| Monotonicity | Not monotonic |
| Augmented Dickey-Fuller test p-value | 2.815390935 × 10⁻²³ |
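Several of the figures above can be derived from one another, which gives a quick consistency check. A minimal sketch, assuming the mean is taken over the non-missing observations only; the variable names are illustrative:

```python
# Values copied from the tables above.
total = 30226766            # Sum
n_valid = 16237 - 5787      # observations minus missing values
std = 1108.1123             # Standard deviation
q1, q3 = 2033, 3576         # Q1 and Q3
minimum, maximum = 732.6, 8562

mean = total / n_valid          # compare with Mean
cv = std / mean                 # compare with Coefficient of variation
variance = std ** 2             # compare with Variance
iqr = q3 - q1                   # compare with Interquartile range
value_range = maximum - minimum # compare with Range

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.1f}")
print(f"iqr={iqr} range={value_range:.1f}")
```

Small differences in the last digits are expected, since the reported inputs are themselves rounded.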
Histogram with fixed size bins (bins=50)
Gap statistics
| number of gaps | 97 |
|---|---|
| min | 4 days |
| max | 5 years, 9 weeks and 5 days |
| mean | 8 weeks, 4 days and 1 hour |
| std | 30 weeks, 2 days and 4 hours |
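Gap statistics of this kind can be obtained by measuring the distance between consecutive non-missing observations. The report does not state which threshold it uses to call an interval a gap, so the rule below (anything longer than one period) is an assumption, and `find_gaps` is a hypothetical helper run on made-up data:

```python
from datetime import date, timedelta

def find_gaps(observations, period=timedelta(days=1)):
    """Return (last_seen, next_seen, duration) for each gap.

    `observations` is a list of (date, value) pairs sorted by date, where
    value is None when missing. A gap is reported whenever two consecutive
    non-missing observations are more than one period apart.
    """
    valid_dates = [d for d, v in observations if v is not None]
    gaps = []
    for previous, current in zip(valid_dates, valid_dates[1:]):
        delta = current - previous
        if delta > period:
            gaps.append((previous, current, delta))
    return gaps

# Toy series (made-up values, not taken from the report).
start = date(2021, 1, 1)
values = [10.0, 11.0, None, None, None, 12.0, 13.0, None, 14.0]
series = [(start + timedelta(days=i), v) for i, v in enumerate(values)]

gaps = find_gaps(series)
durations = [g[2] for g in gaps]
print(len(gaps), min(durations), max(durations))
```

Raising the threshold passed as `period` would ignore short interruptions and report only longer outages.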
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1982 | 33 | 0.2% |
| 2445 | 23 | 0.1% |
| 2188 | 17 | 0.1% |
| 3208 | 16 | 0.1% |
| 3010 | 14 | 0.1% |
| 3030 | 14 | 0.1% |
| 3163 | 14 | 0.1% |
| 3265 | 14 | 0.1% |
| 1841 | 14 | 0.1% |
| 1944 | 13 | 0.1% |
| Other values (4921) | 10278 | 63.3% |
| (Missing) | 5787 | 35.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 732.6 | 1 | < 0.1% |
| 776.9 | 2 | < 0.1% |
| 793.44 | 1 | < 0.1% |
| 824.59 | 1 | < 0.1% |
| 835.3 | 1 | < 0.1% |
| 846.31 | 1 | < 0.1% |
| 854.3 | 2 | < 0.1% |
| 865.28 | 1 | < 0.1% |
| 873.56 | 1 | < 0.1% |
| 880.2 | 3 | < 0.1% |
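Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8562 | 1 | < 0.1% |
| 8182 | 1 | < 0.1% |
| 6796 | 2 | < 0.1% |
| 6788 | 1 | < 0.1% |
| 6745 | 2 | < 0.1% |
| 6712 | 1 | < 0.1% |
| 6690 | 1 | < 0.1% |
| 6649 | 1 | < 0.1% |
| 6618 | 2 | < 0.1% |
| 6562 | 1 | < 0.1% |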
ACF and PACF
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| Date | Flow |
|---|---|
| 1977-02-01 | 1202.0 |
| 1977-02-02 | 1284.0 |
| 1977-02-03 | 1189.0 |
| 1977-02-04 | 1270.0 |
| 1977-02-05 | 1307.0 |
| 1977-02-06 | 1264.0 |
| 1977-02-07 | 1001.0 |
| 1977-02-08 | 1110.0 |
| 1977-02-09 | 1164.0 |
| 1977-02-10 | 1129.0 |
Last rows
| Date | Flow |
|---|---|
| 2021-07-07 | 2154.6 |
| 2021-07-08 | 1995.3 |
| 2021-07-09 | 1957.8 |
| 2021-07-10 | 1859.7 |
| 2021-07-11 | 2131.8 |
| 2021-07-12 | 2981.7 |
| 2021-07-13 | 2701.1 |
| 2021-07-14 | 2700.9 |
| 2021-07-15 | 2397.8 |
| 2021-07-16 | 2213.0 |
Most frequently occurring
| | Flow | # duplicates |
|---|---|---|
| 2087 | NaN | 5787 |
| 468 | 1982.0 | 33 |
| 791 | 2445.0 | 23 |
| 609 | 2188.0 | 17 |
| 1302 | 3208.0 | 16 |
| 375 | 1841.0 | 14 |
| 1184 | 3010.0 | 14 |
| 1197 | 3030.0 | 14 |
| 1279 | 3163.0 | 14 |
| 1342 | 3265.0 | 14 |
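A table like this one can be produced by counting how often each value occurs and keeping those that appear more than once. The sketch below uses only the standard library on made-up data; `most_frequent_duplicates` is a hypothetical helper and not necessarily how the report computes its figures:

```python
from collections import Counter

def most_frequent_duplicates(values, top=3):
    """Return (value, count) pairs for values that occur more than once."""
    counts = Counter(values)
    duplicated = [(v, c) for v, c in counts.items() if c > 1]
    # Sort by descending count; ties keep first-seen order (sort is stable).
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]

# Toy column with made-up values. None stands in for NaN here, because
# float NaN values do not compare equal and would not be grouped reliably.
flow = [1982.0, None, 2445.0, 1982.0, None, None, 2445.0, 3208.0, 1982.0]
result = most_frequent_duplicates(flow)
print(result)
```

Under this reading, missing values count as one recurring value, which matches the NaN row leading the table above.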