Blaaaaaaaaargh! That’s probably the response of many a data analyst or scientist when faced with the task of cleaning data. Many think it’s a boring, mindless task. While I admit that some of it is grunt work, it’s not a waste of time, and data scientists are not wasting their abilities on it.
As you know, whenever you embark on a data project, the first step is to gain an understanding of the topic at hand. And the best way to do that is to get a bird’s-eye view of the data!
Of course, at this early stage, relationships within the data are not well understood and can only be guessed at. This is where the data scientist’s expertise comes in. Their ability to choose a fruitful direction to work in is almost uncanny.
I must admit, however, that this stage involves repetitive tasks. These could be delegated to non-experts, but doing so seems a waste of time, as data scientists can do this work far more efficiently than others. So don’t try to take data munging away from data scientists!
David Johnston makes the case for this far better than I can.