欧宝全站登录
A Medium publication sharing concepts, ideas and codes.

Data Science


Hands-on

Finally, start practicing SQL with your own database

Writing SQL is important. Being able to efficiently query a database is often considered one of the most essential skills to develop as an aspiring data analyst/scientist.

SQL is not only important but also quite commonly used. According to the Stack Overflow Developer Survey 2021, SQL ranks among the top five…

Adapting course activities to prepare for a competitive job market

I teach introductory courses on data analysis and visualization. Some of my students will pursue a career in research, but most of them will be using their data skills in community-based agencies where there is a strong demand for practical data skills. I see more job opportunities than ever before…

All you should know about clustering in 11 minutes

In this article, you will find a complete clustering cheat sheet. In eleven minutes, you will learn what clustering is and refresh your memory of the main algorithms.

Clustering (also called cluster analysis) is the task of grouping similar instances into clusters. More formally, clustering is…

Learn a couple of useful probabilistic tricks to safely navigate the sea of incomplete information

Sometimes we wish we knew something that we don’t. Unfortunately, in many cases, there is no time or even no way to learn what we need. Nevertheless, decisions and assessments need to be made with only the knowledge at our disposal. While to many of us navigating in the mist of incomplete information…

I’ll make you a promise: you can make this dashboard as fast as you can make a standard visualisation of the same calibre, and it will look way better than your Matplotlib or ggplot plot.

Instead of sending a visualization to a colleague, why not send a dashboard? Want to impress your boss? Ask them to open one of these in their browser.

I know what you’re thinking: that sounds like way more effort. It’s really not.

You can even send it…

Machine Learning

A reflection on how the use of the train_test_split() function could lead to wrong results, coupled with a practical demonstration

Surely almost all data scientists have tried to use the train_test_split() function at least once in their life. The train_test_split() function is provided by the scikit-learn Python package. Usually, we do not care much about the effects of using this function, because with a single line of code we obtain…

How we improved the randomized assignment algorithm for online experiments at Wish

Contributors: Qike (Max) Li, Samir Jamkhande

Recently at Wish, we discovered a sample ratio mismatch (SRM) in an A/A test. SRM refers to the mismatch between the sample ratio set by the experimenter and the observed sample ratio. SRM indicates potential data quality problems that change the outcomes of your…
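
To make the idea of a sample ratio mismatch concrete, here is a minimal sketch of one common way to check for it: a chi-square goodness-of-fit test on the bucket counts. This is an illustration only and is not taken from the Wish article; the function name `srm_p_value`, the 50/50 target ratio, and the counts are all hypothetical.

```python
import math


def srm_p_value(observed_a, observed_b, expected_ratio_a=0.5):
    """P-value of a chi-square goodness-of-fit test for a two-bucket split.

    observed_a / observed_b are the user counts seen in each bucket;
    expected_ratio_a is the share the experimenter intended for bucket A.
    """
    total = observed_a + observed_b
    expected_a = total * expected_ratio_a
    expected_b = total - expected_a
    chi2 = ((observed_a - expected_a) ** 2 / expected_a
            + (observed_b - expected_b) ** 2 / expected_b)
    # With two buckets there is 1 degree of freedom, and the chi-square
    # survival function then reduces to erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: a perfectly balanced split and a visibly skewed one.
balanced = srm_p_value(5000, 5000)
skewed = srm_p_value(5000, 5400)
print(f"balanced split p-value: {balanced:.4f}")
print(f"skewed split p-value:   {skewed:.6f}")
```

A very small p-value suggests the observed split is unlikely under the intended ratio, which is a signal to investigate the assignment or logging pipeline before trusting the experiment's results. The threshold at which to raise an alarm is a team-level choice.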
