I’m trying to understand IoT data consisting of millions of rows, each with about 100 features covering performance, power, and other parameters reported by the devices, delivered as a CSV file.
The objective is to find out which rows contain anomalies. Given so many features and their variations, I am trying to see whether there are any statistical methods I could employ to identify abnormalities.
One way is to parse each cell, check whether it falls within an expected range, and then apply some normalization/standardization.
I came across the Z-test, but I'm not sure if that is the right approach here.
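To make the idea concrete, here is a minimal sketch of the per-feature standardization check I have in mind: compute a z-score for every cell relative to its own column and flag any row where at least one feature is far from that column's mean. The function name `flag_anomalous_rows`, the threshold of 3 standard deviations, and the synthetic two-feature data are my own assumptions for illustration; none of this comes from the real dataset.

```python
import statistics


def flag_anomalous_rows(rows, threshold=3.0):
    """Return indices of rows where any feature has |z-score| > threshold.

    rows: list of equal-length lists of floats (one inner list per device report).
    threshold: assumed cutoff in standard deviations (3.0 is a common rule of thumb).
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]  # population std dev

    flagged = []
    for i, row in enumerate(rows):
        for value, mean, sd in zip(row, means, stdevs):
            # Skip constant columns (sd == 0) to avoid dividing by zero.
            if sd > 0 and abs(value - mean) / sd > threshold:
                flagged.append(i)
                break  # one extreme feature is enough to flag the row
    return flagged


# Synthetic stand-in for the real CSV: 20 rows, 2 hypothetical features
# (power, temperature), with one injected power spike in row 7.
rows = [[10.0 + (i % 5) * 0.1, 40.0 + (i % 3)] for i in range(20)]
rows[7][0] = 50.0

print(flag_anomalous_rows(rows))
```

With millions of rows I would presumably do the same thing in a vectorized way with a dataframe library rather than plain loops. I'm also aware this treats each feature independently and assumes values are roughly bell-shaped, so it would miss rows that are only unusual in the combination of several features.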
Any thoughts on how to approach this problem? Any pointers would be greatly appreciated.