
DataCite Blog - DataCite

Connecting Research, Advancing Knowledge

In many ways, CSV is for data what Markdown is for text documents: a very simple format that is both human- and machine-readable, and that – despite a number of shortcomings – is widely used. Given the popularity of Markdown for writing blog posts, using CSV to publish blog posts with tabular data should be an obvious thing to do, and we have just published our first blog post using CSV data.


One of my personal highlights of last week’s Research Data Alliance (RDA) 6th Plenary Meeting in Paris was the Data Packages Birds of a Feather (BoF) session, organized by Rufus Pollock from the Open Knowledge Foundation (OKFN). He highlighted the urgent need for packaging data in a standard format to facilitate reuse, and described the extensive work the OKFN has done on data packages.


CSV (comma-separated values) is a popular file format for data. It is popular because it is very simple: CSV is text-based, and any application that can open text files can read or write CSV. This makes it a good fit for digital preservation. We don’t know how many of the datasets in DataCite use CSV because the format metadata attribute is rarely filled in (this query gives you some examples), but we know that the number is large.
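
To illustrate how little tooling CSV needs, here is a minimal sketch that writes and reads a small table using only Python's standard library. The sample rows, the column names and the helper functions `write_csv` and `read_csv` are hypothetical and chosen for illustration; they are not taken from any DataCite dataset or API.

```python
import csv
import io

# Hypothetical sample data, invented for this example.
FIELDS = ["year", "format", "count"]
rows = [
    {"year": "2014", "format": "text/csv", "count": "12"},
    {"year": "2015", "format": "text/csv", "count": "30"},
]


def write_csv(records):
    """Serialize a list of dicts to CSV text, header row first."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def read_csv(text):
    """Parse CSV text back into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


csv_text = write_csv(rows)
print(csv_text)  # plain text that any editor can open

round_tripped = read_csv(csv_text)
```

Note that every value comes back as a string: CSV itself carries no type information, which is one of the shortcomings the format is known for.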