Posted on December 23, 2014 · Posted in Powershell

Parsing HTML Webpages with Powershell

PowerShell 3.0 made it possible to access and parse HTML web pages directly, thanks to the new Invoke-WebRequest cmdlet. The cmdlet covers a number of scenarios: uploading and downloading files over HTTP(S) or FTP, parsing HTML pages, monitoring web services, and filling in and submitting web forms. It also gives you what you need to navigate the DOM tree of an HTML document. In this article, we will cover basic examples of using the Invoke-WebRequest cmdlet in PowerShell.

Tip. Invoke-WebRequest requires Windows PowerShell 3.0 or newer, so check your version before you start.
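A quick way to check is to read the built-in $PSVersionTable variable. This is a minimal sketch: the 3.0 threshold comes from the tip above, and the message text is only illustrative.

```powershell
# $PSVersionTable is an automatic variable; its PSVersion property
# exposes the Major and Minor version numbers of the running engine.
$Version = $PSVersionTable.PSVersion

if ($Version.Major -ge 3) {
    "PowerShell $Version detected, Invoke-WebRequest should be available."
} else {
    "PowerShell $Version detected, upgrade to 3.0 or newer to use Invoke-WebRequest."
}
```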

How to Get a List of All Links on a Page

Invoke-WebRequest retrieves the content of a web page and returns collections of its forms, links, images and other important elements of the HTML document.

Let’s open the main page of our site and get the list of the links on it:

$SiteAddress = ""

$HttpContent = Invoke-WebRequest -URI $SiteAddress

$HttpContent.Links | ForEach-Object { $_.href }


To display the text of each link (stored in its innerText property) along with its address, use this command:

$HttpContent.Links | fl innerText, href


How to Download a File Via HTTP Using Powershell

Invoke-WebRequest can work like Wget or cURL for Windows, allowing you to download files from a web page or an FTP site. Suppose we need to download a file via HTTP using PowerShell (in this case, the Mozilla Firefox installer). Execute this command:

Invoke-WebRequest "" -OutFile "c:\tools\firefox setup 34.0.5.exe"


This cmdlet downloads a file from the specified URL and saves it to c:\tools\ under the name “firefox setup 34.0.5.exe”. To download a file from an FTP server, use an ftp:// address instead of http://.
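Note that -OutFile expects the target folder to exist already. The sketch below shows one way to guard against that. The $Url and $Destination values are hypothetical placeholders, not the addresses used in this article, so the download itself will only succeed once you substitute a real file URL.

```powershell
# Hypothetical values for illustration; replace both with your own.
$Url         = "https://example.com/files/setup.exe"
$Destination = "c:\tools\setup.exe"

# Invoke-WebRequest does not create missing folders, so make sure the target exists.
$Folder = Split-Path -Path $Destination -Parent
if (-not (Test-Path -Path $Folder)) {
    New-Item -ItemType Directory -Path $Folder | Out-Null
}

# -OutFile writes the response body to the given file.
Invoke-WebRequest -Uri $Url -OutFile $Destination
```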

Checking the Web Server Response Status and HTTP Headers

Using Invoke-WebRequest, you can get the web server’s response code and status description for a page:

$HttpContent = Invoke-WebRequest ""


In our case, the server returned the response code 200, i.e. the request was successful and the web server is available and working correctly.
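The code and its text description are exposed as properties of the response object. This is a minimal sketch: the $Url value is a placeholder rather than the address used in this article, and -UseBasicParsing is added on the assumption that DOM parsing is not needed for a status check.

```powershell
# Placeholder address (an assumption for illustration); substitute your own site.
$Url = "https://example.com/"

# -UseBasicParsing skips the Internet Explorer-based DOM parsing,
# which is not needed just to read the status.
$HttpContent = Invoke-WebRequest -Uri $Url -UseBasicParsing

# Numeric HTTP status code, for example 200 for a successful request.
$HttpContent.StatusCode

# Short text description that accompanies the status code.
$HttpContent.StatusDescription
```

Keep in mind that for 4xx and 5xx responses the cmdlet raises an error instead of returning a response object, so wrap the call in try/catch if you use it for monitoring.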

The other HTTP headers of a web page can be obtained as follows:
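One way to do this is through the Headers property of the response object. This is a sketch under the same assumptions as before: the $Url value is a placeholder, and Content-Type is just one example of a header that most servers send.

```powershell
# Placeholder address (an assumption for illustration); substitute your own site.
$Url = "https://example.com/"

$HttpContent = Invoke-WebRequest -Uri $Url -UseBasicParsing

# Headers is a dictionary of header names and their values.
$HttpContent.Headers

# A single header can be read by name.
$HttpContent.Headers["Content-Type"]
```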


HTML Parsing Using Powershell

The Invoke-WebRequest cmdlet lets you parse the content of any web page quickly and conveniently. When an HTML page is processed, collections of its links, forms, images, scripts, etc., are created.

Let’s get the content of the home page of our website using Powershell:

$Img = Invoke-WebRequest ""

Then display a list of all images on this page:
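A minimal sketch of this step, using the Images collection of the response object. The $Url value is a placeholder, and showing the src and alt attributes is only one possible choice of columns.

```powershell
# Placeholder address (an assumption for illustration); substitute your own site.
$Url = "https://example.com/"

$Img = Invoke-WebRequest -Uri $Url

# Images holds one object per <img> tag found on the page;
# pick the attributes you want to see.
$Img.Images | Select-Object src, alt
```

If the page contains no img tags, the command simply returns nothing.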


Create a collection of full URL paths to these images:

$images = $Img.Images | select src

Initialize a new instance of the WebClient class:


$wc = New-Object System.Net.WebClient

And download all images from the page (with their original names) to c:\tools\:

$images | foreach { $wc.DownloadFile( $_.src, ("c:\tools\"+[io.path]::GetFileName($_.src) ) ) }
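The src attribute often holds a relative path, which WebClient.DownloadFile cannot fetch on its own. One way to handle that is to resolve each src against the page address with the System.Uri class first. In this sketch the function name Resolve-ImageUrl and the example.com addresses are hypothetical.

```powershell
function Resolve-ImageUrl {
    param(
        [string]$PageUrl,   # address the page was loaded from
        [string]$Src        # value of the image's src attribute
    )
    # The two-argument Uri constructor combines a base URI with a relative one;
    # if $Src is already absolute, it is returned as-is.
    $Base = New-Object System.Uri -ArgumentList $PageUrl
    (New-Object System.Uri -ArgumentList $Base, $Src).AbsoluteUri
}

# Root-relative path, page-relative path and an already absolute address.
Resolve-ImageUrl "https://example.com/blog/post.html" "/img/logo.png"
Resolve-ImageUrl "https://example.com/blog/post.html" "pics/shot.png"
Resolve-ImageUrl "https://example.com/blog/post.html" "https://cdn.example.com/a.jpg"
```

In the download loop, the result of Resolve-ImageUrl would then be passed to DownloadFile in place of the raw $_.src value.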


Filling and Sending Forms in Powershell

Many web services require various data to be entered into HTML forms. With Invoke-WebRequest, you can access any HTML form, fill in the necessary fields and send the completed form back to the server. In this example, we’ll show how to log in to Facebook via its standard web form using PowerShell.


With the following command, save the information about connection cookies in a separate session variable.

$fbauth = Invoke-WebRequest -SessionVariable session

Using the next command, display the list of the fields to be filled in the login HTML form (login_form).


Assign the necessary values to all fields:

$fbauth.Forms["login_form"].Fields["email"] = ""

$fbauth.Forms["login_form"].Fields["pass"] = 'Coo1P$wd'


To send the filled-in form to the server, post its fields to the address given in the form’s action attribute:

$Log = Invoke-WebRequest -method POST -URI ("" + $fbauth.Forms["login_form"].Action) -Body $fbauth.Forms["login_form"].Fields -WebSession $session
