The webpage is nicely set out and contains a table of hockey player statistics. The thing is, the embedded table actually has 17 pages of data, and let’s say we wish to extract all of this data for analysis elsewhere. When you click on the second or a subsequent page of data, the URL for the website does not change. This seemingly defeats Power Query (or Power BI), as a URL for each page of table data is required. So how may we extract all of the data? To answer this, let’s get there in five steps.
Before we move on to our proposed solution, we should first cover how to manually import data from the hockey statistics site. To do so using Power Query, first open Excel, navigate to the ‘Data’ tab and click on the ‘New Query’ option, then select ‘Other Sources’ followed by ‘Web’. You should note that this method does not yield the complete list; this will be detailed later on.
This is because Power Query retrieves data based on the URL, and in this case our Power Query friendly hockey statistics website displays data using JavaScript. Essentially, the website uses JavaScript code to dynamically refresh the list of players on one page, without changing the webpage’s URL.
Let’s address the first issue, then, i.e. the inability to manually retrieve all of the data just by importing it using Power Query. One solution, proposed by MVP Reza Rad, is to utilise Custom Functions in Power Query. A custom function is a query that is run by other queries and, for those of you who know Java from coffee beans, is similar to what is known as an Object Method. The benefit of having a custom function is that we can repeat the same steps on a refreshed dataset if need be.
Let’s work through a simple example to illustrate the utility of Custom Functions. Suppose we wish to retrieve the gross earnings of all of the movies released in a particular year, along with their current rank and their studio. It does not matter which year we begin with; for this example, we shall begin with 2017 (it’s a little too early in the year for 2018!). To launch Power Query, we’ll use Excel 2016 and select ‘New Query’ from the ‘Data’ tab. The data still needs some cleaning up, however; you can learn how to do that by keeping up with our Power Query Pointers series!
The above deals with manual importation of the data, but what about the page number issue?
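
To make the Custom Function idea from the movie example a little more concrete, here is a minimal sketch in M (Power Query’s formula language) of a query that takes a year and returns that year’s table. It is only an illustration under stated assumptions: the URL is a placeholder rather than the site used in this article, and we assume the first table Power Query detects on the page is the one we want.

```powerquery
// Hypothetical custom function: takes a year and returns that year's table.
// The URL is a placeholder, not the actual site used in this article.
(Year as number) as table =>
let
    // Build the address for the requested year
    Url = "https://example.com/movies/yearly?yr=" & Number.ToText(Year),
    // Download the page and let Power Query detect the HTML tables on it
    Source = Web.Page(Web.Contents(Url)),
    // Assumption: the first detected table holds rank, title, studio and gross
    Data = Source{0}[Data]
in
    Data
```

If this query were saved under a name such as GetMovies (our own label), another query could run it once per year and stack the results, for instance with `Table.Combine(List.Transform({2015..2017}, GetMovies))`. That is the sense in which a custom function is “a query that is run by other queries”.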