Use Excel to Scrape Data (NO CODE REQUIRED)

 

Photo by cyda

Motivation

Although Python and R are growing extremely fast, no one can deny that Excel is still one of the major tools analysts use for tabular exploration, and I have found that a large proportion of my audience is interested in leveraging Excel to study data further.

One of the major tasks is to regularly update an Excel spreadsheet with the latest data. In my previous post, I illustrated how to do so with Python; if you have not seen it yet, you may check the link below.

However, I have received feedback from readers who have no idea how to write Python code. The only tool they or their companies use is Excel, and that is why I wrote this article.


Goal

A few years ago, when I was new to Python and R, I had no idea how to collect data from the Internet, let alone what data scraping was. The only way I got the information was to copy and paste it directly. Over time, I found it tedious and pointless to do this day after day, so I looked for other solutions. In this article, I will show in detail how to use Excel, a tool we all use, to scrape data without writing a line of code.


Data to Scrape

For demonstration purposes, I have chosen stock prices as the example. You may check the link below to see what the page looks like.

https://finance.yahoo.com/most-active/?offset=0&count=250
Photo by cyda

Simply put, we would like to capture the table of the most active stocks and their corresponding statistics.


Step 1 — Import data from web page

Photo by cyda

Step 2 — Input the URL

Photo by cyda
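The URL entered in this step ends with two query parameters, offset and count. Excel needs nothing more than the URL itself, but for readers curious about what those parameters are, here is a small optional Python sketch that reads and changes them. The helper names query_params and with_params are made up for illustration, and the idea that count controls how many rows the page returns is an assumption based on the parameter name, not something the page documents.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

URL = "https://finance.yahoo.com/most-active/?offset=0&count=250"

def query_params(url):
    """Return the query parameters of a URL as a plain dict of strings."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

def with_params(url, **changes):
    """Return a copy of the URL with some query parameters replaced (hypothetical helper)."""
    parts = urlsplit(url)
    params = query_params(url)
    params.update({key: str(value) for key, value in changes.items()})
    return urlunsplit(parts._replace(query=urlencode(params)))

print(query_params(URL))            # {'offset': '0', 'count': '250'}
print(with_params(URL, count=100))  # https://finance.yahoo.com/most-active/?offset=0&count=100
```

If the assumption about count holds, pasting the modified URL into the same Excel dialog would load a shorter table.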

Step 3 — Select and load the table

Photo by cyda
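Conceptually, selecting and loading a table means finding a table element in the page's HTML and reading it row by row. You do not need any of this for the Excel route; the sketch below is only an illustration of the idea in Python, using the standard library and a tiny made-up HTML snippet (the symbols and prices are placeholders, not real data). It is not how Excel is implemented.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while we are inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Made-up sample standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>AAA</td><td>10.5</td></tr>
  <tr><td>BBB</td><td>20.0</td></tr>
</table>
"""

def parse_table(html):
    """Return the table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    return parser.rows

print(parse_table(SAMPLE_HTML))
# [['Symbol', 'Price'], ['AAA', '10.5'], ['BBB', '20.0']]
```

The first row holds the column headers and the rest hold the data, which is the same shape Excel puts into the worksheet.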

Result

Photo by cyda

Congratulations! You have just learned how to use Excel to get data from the Internet. I hope you did not run into any difficulties, but before you leave, here is one crucial tip.


Tips — Regular Updates

In fact, the data you just scraped is a one-off snapshot. In other words, the table will not update when the figures change. To set up a timed refresh, follow the procedure below:

Photo by cyda
Photo by cyda

People usually use one of two refresh patterns. The first is to refresh every k minutes, which is useful when you need to keep tracking the movement or trend of the figures. The second is to refresh the data when the file is opened, which ensures the data is up to date whenever you use it.
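To make the difference between the two patterns concrete, here is a minimal Python sketch of the decision each one makes. Excel handles this for you through its settings; the function name needs_refresh and its parameters are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta

def needs_refresh(last_refresh, now, every_minutes=None, just_opened=False):
    """Decide whether the data should be reloaded.

    every_minutes: the k of "refresh every k minutes", or None to disable it.
    just_opened:   True when the file has just been opened ("refresh on open").
    """
    if just_opened:
        return True
    if every_minutes is not None:
        return now - last_refresh >= timedelta(minutes=every_minutes)
    return False

last = datetime(2024, 1, 1, 9, 0)

# Pattern 1: refresh every 5 minutes.
print(needs_refresh(last, datetime(2024, 1, 1, 9, 4), every_minutes=5))  # False
print(needs_refresh(last, datetime(2024, 1, 1, 9, 5), every_minutes=5))  # True

# Pattern 2: refresh when the file is opened, however recent the last refresh was.
print(needs_refresh(last, datetime(2024, 1, 1, 9, 1), just_opened=True))  # True
```

The two patterns are not mutually exclusive, so the sketch lets both be switched on at once.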


Conclusion

This is a relatively short article, and I hope it helps those who have no experience with Python but need to update an Excel spreadsheet regularly. If you would like to learn more office tricks with programming, you can subscribe to my Medium, and I will share more skills with you all. =)
