Datasets
Definition: A dataset is a collection of data; it consists of all the information that needs to be analyzed. The data is generally gathered during a survey.
You have a great idea and want to run some experiments. You collect your own data, test your idea, and report the results. Most often, data collected locally is small (just enough to suit your experiments).
You submit your results, and a reviewer comments that (a) your experiments have not been run on a large public dataset and (b) you need to compare your results with the existing literature.
You can address (b) by implementing the existing algorithms and running them on your small dataset. That is obviously more work (you have to implement an already known algorithm!) without much return.
Using publicly available datasets is a good idea. It addresses both (a) and (b)!
- You get to test your idea and experiment on a large public dataset
- You can compare your results with existing literature (assuming that they have also reported results on the same dataset)
Some publicly available datasets:
- Amazon Public Data Sets
- UCI Machine Learning Repository: Data Sets
- Small TSP Dataset
- Climate and other datasets
- List of datasets for Data Mining
- Statistics Dataset (includes breakfast cereal dataset!)
- Stanford Large Network dataset
- Large list of datasets
- ChestXray dataset from NIHCC