Datasets

Def: A dataset is a collection of data; it consists of all of the information that needs analysis. Generally the data is gathered during a survey.
 
You have a great idea. You want to conduct some experiments. You collect your own data, test your idea and report results. Most often the data collected locally is small (to suit your experimentation).

You submit your results and then there is a comment from a reviewer saying that (a) your experiments have not been tested on a large public dataset and (b) you need to compare your results with existing literature.

You can address (b) by implementing the existing algorithms and then running them on your small dataset. That is obviously more work (you need to implement an already known algorithm!) without much return, and it still leaves (a) unanswered.

Use of publicly available datasets is a good idea. It addresses both (a) and (b)!
  1. You get to test your idea and experiment on a large public dataset.
  2. You can compare your results with existing literature (assuming that they have also reported results on the same dataset).
Here is a list of datasets that can be of use.
All said and done, as a user of these datasets one should also strive to create datasets and put them up for public use!

Comments

Apurv said…
Nice information Sir. Thanks
oops. saw it only today. Thanks.