A Medium publication sharing concepts, ideas and codes.



Moving window aggregation strategies with core PySpark and visualizations with Plotly


There are a ton of time series metrics out there, and many of these metrics share the same preprocessing steps and use cases. To limit redundancy, I will focus on three neat metrics with different use cases:

  • Outlier Detection with Rolling Z-Scores
  • Rolling Correlation Matrices
  • Trend Detection with…
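As a rough illustration of the first metric in the list, a rolling z-score compares each point with the mean and standard deviation of a trailing window. The sketch below uses plain Python rather than the PySpark window functions the article refers to. The function name `rolling_z_scores`, the window size, the sample series, and the threshold of 3 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def rolling_z_scores(values, window):
    """Z-score of each point relative to the `window` points that precede it.

    Returns None where the score is undefined: either there is not yet a
    full window of history, or the window has zero spread.
    """
    scores = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            scores.append(None)  # not enough history yet
            continue
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation of the window
        scores.append((x - mu) / sigma if sigma > 0 else None)
    return scores


# Hypothetical series with one obvious spike at index 6.
series = [10, 11, 10, 12, 11, 10, 30, 11, 10, 12]
z = rolling_z_scores(series, window=5)

# Flag points more than 3 standard deviations from the trailing mean.
outliers = [i for i, s in enumerate(z) if s is not None and abs(s) > 3]
print(outliers)
```

Excluding the current point from its own window is a design choice: it keeps a large spike from inflating the standard deviation it is being measured against.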

Why correlation matrix is over-used and why you need a non-linear correlation matrix

Photo by JJ Jordan on Unsplash

Even though there are countless data science techniques and algorithms, you may still feel that something is missing. One thing that is missing, or at least rarely discussed, is a non-linear correlation matrix.

We are used to the famous correlation matrix. However, in this article, you will…

Photo by Sam Moqadam on Unsplash

In my previous two articles, I talked about how to measure correlations between the various columns in your dataset and how to detect multicollinearity between them.

However, these techniques are useful when the variables you are trying to compare are continuous. How do you compare them if your variables…

Understand how to discover multicollinearity in your dataset

Photo by Valentino Funghi on Unsplash

In my previous article, you learned about the relationships between data in your dataset, be it within the same column (variance), or between columns (covariance and correlation).

Two additional terms that you usually encounter when you embark on your machine learning journey are:

  • Collinearity
  • Multicollinearity

In this article, I…

Understand your data and the relationships within it: know the difference between the Pearson Correlation Coefficient and Spearman’s Rank Correlation Coefficient

Photo on Unsplash

One of the topics that a data scientist must understand is the relationships that exist in a dataset. Before you start the machine learning process, it is critical to prepare your data so that only the relevant parts of your dataset are used for training. …
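To make the Pearson versus Spearman distinction concrete, here is a minimal sketch in plain Python. It is not taken from the article: the helper names (`pearson`, `ranks`, `spearman`) and the sample data are assumptions. Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# A monotonic but non-linear relationship (illustrative data).
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(pearson(x, y))   # below 1: the relationship is not a straight line
print(spearman(x, y))  # essentially 1: the ordering is perfectly preserved
```

Because `y` always increases with `x`, the ranks agree exactly, so Spearman reports a perfect monotonic relationship while Pearson is penalized for the curvature.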

Image Processing Essentials

From linear (Correlation and Convolution) and non-linear spatial filtering to special kernels for smoothing, sharpening, noise removal, and edge detection

Part 2.1

Spatial operations are performed directly on the pixels of a given image, and we classify these operations into three categories. “Spatial domain operations” is another term you may come across for this topic; the two terms mean the same thing.

  1. Single-pixel operations
  2. Neighborhood operations
  3. Geometric transformations
Figure 1 “Image by Author”

We already saw what the single-pixel…
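The difference between the two linear filters named in the subtitle, correlation and convolution, can be shown with a small sketch. It is simplified to one dimension and plain Python, whereas image filtering works on 2-D neighborhoods, and the function names `correlate1d` and `convolve1d` are assumptions for illustration. Correlation slides the kernel over the signal as it is; convolution flips the kernel first.

```python
def correlate1d(signal, kernel):
    """Slide the kernel over the signal without flipping it ('valid' positions only)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def convolve1d(signal, kernel):
    """Convolution is correlation with the kernel reversed."""
    return correlate1d(signal, kernel[::-1])


# A unit impulse makes the difference visible.
impulse = [0, 0, 1, 0, 0]
kernel = [1, 2, 3]

print(correlate1d(impulse, kernel))  # [3, 2, 1] (kernel appears reversed)
print(convolve1d(impulse, kernel))   # [1, 2, 3] (kernel is reproduced)
```

For a symmetric kernel, such as a box or Gaussian smoothing kernel, the two operations give the same result, which is why the distinction only matters for asymmetric kernels.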

A geometrical approach to Pearson correlation helps to understand it in depth and to interpret its outcome more accurately.

[Image by Author]

Every now and then, someone comes and says “I’ve finally found a replacement for Pearson correlation”. The truth is that, despite its shortcomings, Pearson correlation (a.k.a. the r coefficient) is surprisingly hard to replace, due to its simplicity, robustness and reliability.

However, the cold hard formula may be a…
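One standard geometric reading of Pearson correlation, which may or may not be the exact one the article develops, is that r equals the cosine of the angle between the two variables once each has been mean-centered. The sketch below checks this in plain Python; the helper names and sample vectors are assumptions for illustration.

```python
from math import sqrt


def center(v):
    """Subtract the mean so the vector describes deviations only."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_as_cosine(x, y):
    """Pearson r computed as the cosine between the centered vectors."""
    return cosine(center(x), center(y))


x = [1, 2, 3, 4]
same_direction = [2, 4, 6, 8]    # centered vectors point the same way
opposite = [4, 3, 2, 1]          # centered vectors point opposite ways
perpendicular = [1, -1, -1, 1]   # centered vectors are at a right angle

print(pearson_as_cosine(x, same_direction))  # close to 1
print(pearson_as_cosine(x, opposite))        # close to -1
print(pearson_as_cosine(x, perpendicular))   # 0.0
```

Read this way, r = 1 means the centered vectors are parallel, r = -1 means they are anti-parallel, and r = 0 means they are perpendicular, which also makes clear why r is unaffected by shifting or positively rescaling either variable.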

