Friday, August 19, 2016


Download Free Nairobi Securities Exchange (NSE) Historical Data

Accessing historical data for companies listed on the Nairobi Securities Exchange is an expensive affair. Other than buying the data here, I have not come across any other way to access it without paying. It is for this reason that I devised a way to parse the past data provided on the Financial Times webpage.
The idea is to open the webpage of, let's say, Safaricom Ltd, download the page and extract the data contained in it using the Python library Beautiful Soup, running from Matlab. The cleaned data is presented as a table variable or written to a format that Excel supports, such as CSV.

Assumptions  

You have installed Matlab, a Python interpreter that Matlab can use, and the Python library Beautiful Soup. Read on how to install Beautiful Soup. In this method, Beautiful Soup is executed from the Matlab interface and not in the Python console. Matlab R2014b and Beautiful Soup 4.4.5 are used in this demo. You also need working knowledge of Matlab, HTML and Python.
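If you want to confirm that Matlab can see your Python installation and the Beautiful Soup package before running anything, a quick check along these lines should do it (a minimal sketch; the interpreter Matlab picks up will differ from machine to machine):

% Show which Python interpreter Matlab is configured to use
pyversion

% This errors out if Beautiful Soup (bs4) is not installed for that interpreter
py.importlib.import_module('bs4');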

URL Format


First, specify the URL of the webpage holding the specific company's data. The URL is made up of two parts. The first part is the path to the general historical data pages. The second part identifies the specific security symbol. Let's say we want to download Safaricom data: the first part is 'http://markets.ft.com/data/equities/tearsheet/historical?s=' and the second part is the symbol 'SCOM:NAI'. The full URL is http://markets.ft.com/data/equities/tearsheet/historical?s=SCOM:NAI. Using the input interface, you can specify the symbol you wish to download directly from the Matlab interface.
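As a rough sketch of this step (the variable names here are my own, not necessarily the ones used in the original script), the URL can be assembled and the page downloaded like this:

% First part: path to the Financial Times historical data pages
base_url = 'http://markets.ft.com/data/equities/tearsheet/historical?s=';

% Second part: ask the user for the security symbol, e.g. SCOM:NAI for Safaricom
symbol = input('Enter the security symbol (e.g. SCOM:NAI): ', 's');

% Full URL and page download; urlread returns the raw HTML as a string
url  = [base_url symbol];
html = urlread(url);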

Search For the Table 

After downloading the webpage, we convert the page to a Beautiful Soup object. Within that object we find where the data table sits among the HTML tags. The webpage has only one table. The table consists of table rows (the <tr> tags) and table data cells (the <td> tags) that lay the data out in a clear way. Each row in the table is extracted, and the resulting ResultSet is converted to a Matlab cell array. Before extracting the data, prepare the containers that will hold the parsed values: each column (date, open, high, low, close and volume) is stored in its own container.

Get Data From Each Row

Having extracted the rows and prepared the data containers, the next step is to pull out the data stored in the tags within each row. A for-loop that skips the first row of the table (the first row holds the column headers) iterates over the remaining rows. Using Beautiful Soup's get_text(), the data is parsed from each tag and stored in the respective container. At this stage the parsed data sits in the Matlab workspace, ready for cleaning. Since the data is taken from the tags as strings, the OHLC columns are converted to numeric vectors.

Data Cleaning

The date and volume data arrive in a messy form. Let's start with volume: each figure merges the full whole number with a two-decimal-place figure ending in 'm' or 'k', denoting millions or thousands respectively. For instance, if 19,517,400 shares were traded today, the figure obtained looks like '19,517,40019.52m'. The cleaning code removes the '19.52m' part and keeps the 19,517,400 part. The same applies to the dates. The raw format extracted looks like 'Wednesday, July 20, 2016Wed, Jul 20, 2016'; the code reduces this to 20-Jul-2016 as a datetime value, which is convenient for any later analysis.

The last stage is to combine the data vectors into a table. Alternatively, the data can be written to a CSV file: first convert the date column into serial dates that Excel understands, then create a matrix of all the data columns, and finally write it out with the csvwrite() function. The CSV is saved in the current directory. A consolidated sketch of these steps, from the soup object down to the CSV file, follows at the end of this section.

This script can only download the data that is available on the webpage, which is usually the last 30 days of data for each symbol. If you want all the available historical data, the process is as follows:

1. Open the Financial Times webpage manually in your browser.
2. Search for the company's historical data in the market data tab.
3. On the historical data page, scroll down the data table until you reach a "show more" button.
4. Keep clicking "show more" until all the data you want is loaded on the page.
5. Right-click the webpage and save it to your device.
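Here is the consolidated sketch mentioned above, covering the table search, row extraction, cleaning and CSV writing. It assumes the html variable from the earlier snippet and the column order described in this post (date, open, high, low, close, volume); the variable names and the output file name are mine, not necessarily those of the original script.

% --- Convert the downloaded page to a Beautiful Soup object ----------------
soup = py.bs4.BeautifulSoup(html, 'html.parser');

% The page holds a single table; rows are <tr> tags and cells are <td> tags.
% find_all returns a Python ResultSet, converted here to a Matlab cell array.
rows  = cell(py.list(soup.find_all('tr')));
nRows = numel(rows) - 1;            % the first row only holds the headers

% --- Prepare the data containers --------------------------------------------
dateRaw = cell(nRows, 1);
openP   = zeros(nRows, 1);
highP   = zeros(nRows, 1);
lowP    = zeros(nRows, 1);
closeP  = zeros(nRows, 1);
volRaw  = cell(nRows, 1);

% --- Get data from each row (skipping the header row) -----------------------
for i = 1:nRows
    tds = cell(py.list(rows{i + 1}.find_all('td')));
    dateRaw{i} = char(tds{1}.get_text());
    openP(i)   = str2double(char(tds{2}.get_text()));
    highP(i)   = str2double(char(tds{3}.get_text()));
    lowP(i)    = str2double(char(tds{4}.get_text()));
    closeP(i)  = str2double(char(tds{5}.get_text()));
    volRaw{i}  = char(tds{6}.get_text());
end

% --- Data cleaning -----------------------------------------------------------
vol        = zeros(nRows, 1);
shortDates = cell(nRows, 1);
for i = 1:nRows
    % Volume arrives as e.g. '19,517,40019.52m'; keep only the leading
    % comma-separated whole number and drop the abbreviated '19.52m' part.
    wholeNum = regexp(volRaw{i}, '^\d{1,3}(,\d{3})*', 'match', 'once');
    vol(i)   = str2double(strrep(wholeNum, ',', ''));

    % Dates arrive as e.g. 'Wednesday, July 20, 2016Wed, Jul 20, 2016';
    % keep the trailing 'Jul 20, 2016' part for conversion to datetime.
    shortDates{i} = regexp(dateRaw{i}, '[A-Z][a-z]{2} \d{1,2}, \d{4}$', 'match', 'once');
end
dates = datetime(shortDates, 'InputFormat', 'MMM d, yyyy');   % e.g. 20-Jul-2016

% --- Combine into a table variable -------------------------------------------
T = table(dates, openP, highP, lowP, closeP, vol, ...
    'VariableNames', {'Date', 'Open', 'High', 'Low', 'Close', 'Volume'});

% --- Or write a CSV file in the current directory ----------------------------
% csvwrite only handles numbers, so the dates become Excel serial date
% numbers (days since 30-Dec-1899) before writing.
excelDates = datenum(dates) - datenum('30-Dec-1899');
csvwrite('nse_historical_data.csv', [excelDates openP highP lowP closeP vol]);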

Parsing a Locally-Saved HTML File

The script can parse data from a locally saved file in a similar way. The only change is specifying the file path and replacing the URL of the online page with that path. For instance: html_file = urlread('file:///Users/wachiranguni/Desktop/python/Safaricom Ltd, SCOM_ NAI historical prices - FT.com.csv'); Later the script will be packaged into an Excel add-in to make it useful to a wider audience. Note that this script can be used to download data for all the NSE-listed companies, as well as any other historical price data set on the Financial Times website.
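As a small aside that is not in the original post, Matlab's fileread can also read the saved page directly, without going through a file:// URL (the file name below is only a placeholder for wherever you saved the page):

html = fileread('SCOM_NAI_historical_prices.html');   % placeholder path to the saved page
soup = py.bs4.BeautifulSoup(html, 'html.parser');     % the rest of the script is unchanged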

Update: 2/07/2018

This method is no longer functional because ft.com no longer displays historical data on its web pages without a subscription. However, investing.com is an alternative site from which you can download free NSE data.
