K-Means Clustering of STI Component Stocks

In this tutorial on machine learning for algorithmic trading, we show how to perform k-means clustering on the component stocks of the Straits Times Index (STI).
First, we import the necessary Python packages, such as pandas, talib, and a Yahoo Finance package.
Next, we use pandas read_html to scrape the list of STI components from the Wikipedia page.
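
As a rough sketch of this scraping step: the URL, the `symbol` keyword, and the helper names `pick_components_table` and `load_sti_components` are assumptions for illustration, since the original code is not reproduced here. The layout of the Wikipedia page can change, so the keyword may need adjusting.

```python
import pandas as pd

# Assumed URL of the Wikipedia page that lists the STI constituents.
STI_WIKI_URL = "https://en.wikipedia.org/wiki/Straits_Times_Index"


def pick_components_table(tables, keyword="symbol"):
    """Return the first table that has a column whose name contains `keyword`."""
    for table in tables:
        if any(keyword in str(col).lower() for col in table.columns):
            return table.reset_index(drop=True)
    raise ValueError(f"no table with a column containing {keyword!r}")


def load_sti_components(url=STI_WIKI_URL):
    """Download every HTML table on the page and keep the constituents table."""
    # read_html returns a list of DataFrames, one per <table> on the page.
    # It needs an HTML parser such as lxml to be installed.
    return pick_components_table(pd.read_html(url))
```

Calling `load_sti_components()` needs network access. The table picker is kept separate so it can be tried on any list of DataFrames.
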
Next, we use the Yahoo Finance package to extract three fundamental ratios as features: beta, EPS (earnings per share), and the P/E ratio. You can add more features if you need a finer clustering. In general, the more features you have, the more clusters you tend to need.
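
A minimal sketch of the feature extraction, assuming the yfinance package and its `Ticker.info` dictionary. The key names `beta`, `trailingEps` and `trailingPE`, and the helper names `extract_features` and `fetch_features`, are assumptions; the tutorial's own code may use a different package or different keys.

```python
import pandas as pd

# Assumed mapping from the tutorial's feature names to keys in yfinance's
# Ticker.info dictionary; the available keys can vary by ticker.
FEATURE_KEYS = {"beta": "beta", "eps": "trailingEps", "pe": "trailingPE"}


def extract_features(info):
    """Pick the three ratios out of one info dictionary (missing -> None)."""
    return {name: info.get(key) for name, key in FEATURE_KEYS.items()}


def fetch_features(symbols):
    """Build a DataFrame of beta, eps and pe, one row per ticker symbol."""
    import yfinance as yf  # imported here so extract_features works without it

    rows = {symbol: extract_features(yf.Ticker(symbol).info) for symbol in symbols}
    frame = pd.DataFrame.from_dict(rows, orient="index")
    # Drop stocks with any missing ratio, since k-means cannot handle NaN.
    return frame.dropna()
```

For example, `fetch_features(["D05.SI"])` would request the ratios for DBS, assuming Yahoo's `.SI` suffix for SGX-listed tickers. It needs network access.
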
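Next, we scale the data for K-Means clustering using scikit-learn's StandardScaler. K-means relies on Euclidean distance, so without scaling the feature with the largest numeric range would dominate the clustering.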
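
The scaling step might look like the sketch below. The tickers and numbers are placeholders made up for illustration, not real market data.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Placeholder values for illustration only, not real market data.
features = pd.DataFrame(
    {
        "beta": [0.6, 0.9, 1.2, 1.5],
        "eps": [0.1, 0.4, 2.0, 5.5],
        "pe": [8.0, 12.0, 15.0, 30.0],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)

# fit_transform subtracts each column's mean and divides by its standard
# deviation, so every feature ends up with mean 0 and unit variance.
scaled = StandardScaler().fit_transform(features)
scaled_df = pd.DataFrame(scaled, index=features.index, columns=features.columns)
print(scaled_df.round(2))
```
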
Next, we determine the minimum number of clusters, k, needed for the data. The minimum appears to be 4; we will use k = 5 for this tutorial.
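
The text does not say how k was chosen. One common approach, assumed here, is the elbow method: fit K-Means for a range of k and look for the point where the inertia stops dropping sharply. The sketch below uses synthetic data from `make_blobs` as a stand-in for the scaled features.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled feature matrix (30 stocks, 3 features).
X, _ = make_blobs(n_samples=30, n_features=3, centers=4, random_state=0)

# Inertia is the within-cluster sum of squared distances to the centroids;
# it tends to shrink as k grows, and the "elbow" is where the drop levels off.
inertias = []
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

for k, inertia in enumerate(inertias, start=1):
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against k makes the elbow easier to spot than reading the printed values.
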
Next, we perform the K-Means clustering on the scaled features using the scikit-learn package.
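
A minimal sketch of the clustering step with k = 5, again on synthetic data standing in for the scaled features. The `n_init` and `random_state` values are illustrative choices, not taken from the tutorial.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled feature matrix (30 stocks, 3 features).
X, _ = make_blobs(n_samples=30, n_features=3, centers=5, random_state=0)

# fit_predict learns the 5 centroids and returns one cluster label per row.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)
print(kmeans.cluster_centers_.shape)  # one centroid per cluster, one column per feature
```
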
We can then check the cluster assignments. Note that labels 3 and 4 each contain only one stock: label 4 contains only Jardine Matheson Holdings, and label 3 contains only SATS.
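
One way to inspect the assignments is to count and list the stocks per label with pandas. The tickers and labels below are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical tickers and labels, for illustration only.
assignments = pd.DataFrame(
    {
        "stock": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
        "label": [0, 0, 1, 2, 3, 4],
    }
)

# Number of stocks per cluster; a count of 1 flags a single-stock cluster.
counts = assignments["label"].value_counts().sort_index()
print(counts)

# Members of each cluster, listed by label.
members = assignments.groupby("label")["stock"].apply(list)
print(members)

# Labels whose cluster holds exactly one stock.
singletons = counts[counts == 1].index.tolist()
print(singletons)
```
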
Next, we visualize the clusters on a 2D plot of EPS vs beta. Broadly, the stocks fall into four quadrants: high beta-low EPS, low beta-low EPS, high beta-high EPS, and low beta-high EPS. Interestingly, OCBC is not in the same cluster as DBS and UOB.
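
A plot like this can be sketched with matplotlib as follows. The data values, tickers and labels are hypothetical placeholders, and the styling choices (colormap, annotation offsets) are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values, for illustration only.
data = pd.DataFrame(
    {
        "stock": ["AAA", "BBB", "CCC", "DDD"],
        "beta": [0.6, 1.4, 0.7, 1.5],
        "eps": [0.2, 0.3, 2.5, 2.8],
        "label": [0, 1, 2, 3],
    }
)

fig, ax = plt.subplots(figsize=(6, 4))
# beta on the x-axis, eps on the y-axis, one color per cluster label.
points = ax.scatter(data["beta"], data["eps"], c=data["label"], cmap="tab10")
for row in data.itertuples():
    # Write each ticker next to its point.
    ax.annotate(row.stock, (row.beta, row.eps), textcoords="offset points", xytext=(4, 4))
ax.set_xlabel("beta")
ax.set_ylabel("eps")
ax.set_title("K-means clusters: eps vs beta")
ax.legend(*points.legend_elements(), title="label")
```

Call `fig.savefig(...)` or, in an interactive session, `plt.show()` to view the figure.
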

June 12, 2022