Data analysis/web scraping

I’m working on a project to extract data from a website that has a column of data which updates continuously. Does anyone have an idea how I can go through the entire column and find the value in a particular cell of that column based on a certain condition?

Hi @IkolR,
I would suggest using BeautifulSoup to scrape that particular webpage, and then using Pandas to read that HTML as a dataframe.
Then you’ll be able to look up a value in any column in one go using Pandas.
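
Here is a minimal sketch of that idea, assuming the data sits in an ordinary HTML table. The sample HTML, the table id "prices", the column names, and the helper table_to_dataframe are all made up for illustration; on a real page you would pass in the fetched HTML instead.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical page content; in practice this would be the fetched HTML.
SAMPLE_HTML = """
<table id="prices">
  <tr><th>symbol</th><th>price</th></tr>
  <tr><td>AAA</td><td>101.5</td></tr>
  <tr><td>BBB</td><td>98.2</td></tr>
  <tr><td>CCC</td><td>120.0</td></tr>
</table>
"""


def table_to_dataframe(html, table_id):
    """Parse one HTML table (found by its id) into a DataFrame of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"no table with id {table_id!r}")
    rows = table.find_all("tr")
    # The first row is assumed to hold the column headers.
    header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    body = [
        [td.get_text(strip=True) for td in row.find_all("td")]
        for row in rows[1:]
    ]
    return pd.DataFrame(body, columns=header)


df = table_to_dataframe(SAMPLE_HTML, "prices")
# Scraped cells are text, so convert the column before comparing numbers.
df["price"] = pd.to_numeric(df["price"])

# Check the whole column against a condition in one vectorized step.
matches = df[df["price"] > 100]
symbols = matches["symbol"].tolist()
print(symbols)
```

The boolean mask df["price"] > 100 is what lets you check every cell of the column at once instead of looping over rows. Pandas also has read_html, which can parse tables directly, but it needs an extra parser library installed.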

Yes, understood, but what about massive data? I think Pandas will slow down or stop working if the data is massive. I’m looking for solutions that will lower the latency. I’m thinking Julia might help, since it can call Python libraries, or maybe Dask, which works like Pandas. But I’m looking for suggestions.

Yes, you can look for efficient alternatives after finalizing an approach for the desired task.
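
One option that stays within Pandas (this is not Dask or Julia, just plain chunked reading) is to process the data in pieces so that only one chunk is in memory at a time. The sketch below assumes the scraped rows have already been saved as CSV; the generated data, the column names, and the helper filter_in_chunks are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical data standing in for a large scraped file on disk.
CSV_TEXT = "symbol,price\n" + "\n".join(f"S{i},{i}" for i in range(1000))


def filter_in_chunks(source, threshold, chunksize=100):
    """Keep rows with price above threshold, reading chunksize rows at a time."""
    matches = []
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # Each chunk is a small DataFrame, so the filter stays cheap.
        matches.append(chunk[chunk["price"] > threshold])
    return pd.concat(matches, ignore_index=True)


result = filter_in_chunks(io.StringIO(CSV_TEXT), threshold=995)
print(result)
```

With a real file you would pass its path instead of the StringIO object. Whether this is fast enough depends on the data size, so it is worth measuring before moving to another tool.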

Hi @Ishvinder
How do we go about it if the web page is built with Node.js?

It works the same way: use an HTTP request library, bs4, or Selenium.
You can refer to this link for reference.
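
As a rough sketch of the HTTP request route: the scraper only sees the HTML that comes back, not the technology the server runs on. To keep the example self-contained, it starts a tiny local server as a stand-in for the real site; the page content, the "price" class, and the names DemoHandler and fetch_html are assumptions for illustration only.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

from bs4 import BeautifulSoup

# Hypothetical page; a real site would return its own HTML here.
PAGE = (
    b"<html><body><table>"
    b"<tr><td class='price'>42.0</td></tr>"
    b"</table></body></html>"
)


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in for the remote site: serves the same page for every GET."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_html(url, timeout=10):
    """Download a page and return its body as text."""
    # An empty ProxyHandler makes the demo ignore any system proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    url = f"http://127.0.0.1:{server.server_port}/"
    soup = BeautifulSoup(fetch_html(url), "html.parser")
    prices = [
        float(td.get_text(strip=True))
        for td in soup.find_all("td", class_="price")
    ]
finally:
    server.shutdown()
    server.server_close()

print(prices)
```

If the page fills in its data with JavaScript after loading, a plain request will not contain those values, and that is the case where Selenium is useful, since it drives a real browser.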

Thanks for the reference. I will check and see if it works for me.