Filtering an HTML Table with .NET Core


I was looking for a quick and dirty way to filter a table that I got as the result of a search on a webpage. For a one-time effort, I could print it to a PDF. However, I wanted to keep only certain rows in that table. I could load it into Excel and remove the rows I didn't want, but in this case I wanted an automated procedure. Creating a small .NET project turned out to be the quickest way for me to accomplish this. People more familiar with Perl or other languages may find those easier, but if you know .NET, this type of project is very easy.

The first consideration is that the file you are working with is HTML and the data is marked up with the HTML table tag. So the task comes down to filtering the rows of an HTML table.

Here is a sample HTML table. I want to keep just the rows for dogs and cats.


I used a Razor Page to upload the file containing the table and then to display the results. Here is the Razor view.



For the backend, use the IFormFile interface to receive the uploaded file. Then open it as a Stream and read the contents with a StreamReader.
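The upload step might look roughly like the sketch below. The class name FilterTableModel and the property names Upload and Html are made up for illustration, and the form in the Razor view is assumed to post with enctype="multipart/form-data" and a file input named Upload.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.RazorPages;

// Hypothetical page model; the class and property names are illustrative only.
public class FilterTableModel : PageModel
{
    // Assumed to be bound to <input type="file" name="Upload"> in the Razor view.
    [BindProperty]
    public IFormFile? Upload { get; set; }

    // Raw HTML text read from the uploaded file.
    public string Html { get; private set; } = string.Empty;

    public async Task<IActionResult> OnPostAsync()
    {
        if (Upload is null || Upload.Length == 0)
        {
            ModelState.AddModelError(nameof(Upload), "Please choose an HTML file.");
            return Page();
        }

        // OpenReadStream exposes the upload as a Stream;
        // StreamReader decodes that stream into text.
        using Stream stream = Upload.OpenReadStream();
        using var reader = new StreamReader(stream);
        Html = await reader.ReadToEndAsync();

        return Page();
    }
}
```

Reading the whole file into a string is fine for a small search-result page; a very large upload would call for a different approach.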



I created a list of the animals I wanted to keep so that I could compare it against the animals in the table.
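A keep-list along these lines would do. The variable name and the exact spellings are assumptions, since they have to match whatever text appears in the table cells.

```csharp
using System.Collections.Generic;

// Illustrative values: the goal stated earlier is to keep the dogs and cats.
// The spelling must match the cell text in the table being filtered.
var animalsToKeep = new List<string> { "Dog", "Cat" };
```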


Next, parse the table in the HTML so that you can process the rows one at a time. HtmlAgilityPack provides an easy way to run through each tr.
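As a minimal sketch of that step, the snippet below loads some made-up sample markup (standing in for the uploaded file's contents) and selects every tr with an XPath query. It assumes the HtmlAgilityPack NuGet package is installed.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Made-up sample markup standing in for the uploaded file's contents.
string html = @"<table>
  <tr><th>Animal</th><th>Name</th></tr>
  <tr><td>Dog</td><td>Rex</td></tr>
  <tr><td>Bird</td><td>Tweety</td></tr>
</table>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// SelectNodes takes an XPath expression and returns null when nothing matches,
// so guard against null before looping.
var rows = doc.DocumentNode.SelectNodes("//tr");
int rowCount = rows?.Count ?? 0;

if (rows is not null)
{
    foreach (HtmlNode row in rows)
    {
        Console.WriteLine(row.OuterHtml);
    }
}
```

Note that "//tr" matches rows in every table on the page. If the page holds more than one table, a narrower XPath would be needed.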



Once you have a tr, loop through the td cells it contains.
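One way to walk the cells of a single row is sketched below. The one-row markup is invented for the example, and trimming and entity-decoding the cell text is my own addition so that values like " Dog " compare cleanly later.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Invented one-row table, just to have a tr to work with.
var doc = new HtmlDocument();
doc.LoadHtml("<table><tr><td> Dog </td><td>Rex &amp; Co</td></tr></table>");
HtmlNode row = doc.DocumentNode.SelectSingleNode("//tr");

var cellTexts = new List<string>();

// Elements("td") yields only the td elements that are direct children of this row.
foreach (HtmlNode cell in row.Elements("td"))
{
    // InnerText keeps entities such as &amp; encoded, so decode and trim it.
    cellTexts.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
}
```

A header row built from th cells has no td children, so this loop simply collects nothing for it.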


As you go through the loop, compare each animal you find with the animals in your selected list using the Contains method.
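The comparison could be sketched as follows. ShouldKeep is a hypothetical helper name. It uses the LINQ Contains overload that accepts a comparer so that "dog" and "Dog" both match; the plain List Contains method is case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var animalsToKeep = new List<string> { "Dog", "Cat" };

// Hypothetical helper: true when any cell text appears in the keep-list.
static bool ShouldKeep(IEnumerable<string> cellTexts, List<string> keep) =>
    cellTexts.Any(text => keep.Contains(text.Trim(), StringComparer.OrdinalIgnoreCase));

Console.WriteLine(ShouldKeep(new[] { "dog", "Rex" }, animalsToKeep));     // True
Console.WriteLine(ShouldKeep(new[] { "Bird", "Tweety" }, animalsToKeep)); // False
```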



Save only the rows that contain one of the selected animals. Here is the result.
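Putting the steps together, a filter might look like the sketch below. FilterTable is a hypothetical name, and rebuilding a fresh table from the kept rows (plus any header row) is one possible approach rather than necessarily the one used for the result shown here.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper: returns a new table holding only the header row(s)
// and the rows where some cell matches an entry in the keep collection.
static string FilterTable(string html, ICollection<string> keep)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    var output = new StringBuilder("<table>");
    var rows = doc.DocumentNode.SelectNodes("//tr"); // null when there are no rows

    if (rows is not null)
    {
        foreach (HtmlNode row in rows)
        {
            bool isHeader = row.Elements("th").Any();
            bool matches = row.Elements("td")
                .Select(cell => HtmlEntity.DeEntitize(cell.InnerText).Trim())
                .Any(text => keep.Contains(text));

            if (isHeader || matches)
            {
                output.Append(row.OuterHtml);
            }
        }
    }

    return output.Append("</table>").ToString();
}

// Made-up sample data; the case-insensitive set lets "dog" match "Dog".
var animalsToKeep = new HashSet<string>(System.StringComparer.OrdinalIgnoreCase) { "Dog", "Cat" };
string sample = "<table><tr><th>Animal</th></tr><tr><td>Dog</td></tr><tr><td>Bird</td></tr></table>";
string filtered = FilterTable(sample, animalsToKeep);
```

To show the filtered markup on a Razor page it has to be emitted unencoded, for example with Html.Raw, which is only reasonable when the uploaded file comes from a source you trust.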

