CERN’s Large Hadron Collider Publicly Releases Huge 300 TB Experiment Data Bank

Good luck downloading it.
This article is over 8 years old and may contain outdated information

CERN’s Large Hadron Collider, the biggest, most powerful particle accelerator in the world, made headlines when it facilitated the discovery of the Higgs boson particle, but it’s produced so much more data than just that one attention-grabber. In the interest of preserving data from the countless particle collisions the LHC has performed, the CMS Collaboration at CERN has now released over 300 TB of experiment data to the public—provided that public knows what to do with it.

For most of us, the data is essentially useless (unless you happen to be unusually particle physics-inclined), and it’s doubtful that many people have the hard drive space to store it all. Still, CERN believes the transparency is important not only for keeping its data alive, but also for helping and inspiring a new generation of physicists. The massive data dump includes “primary datasets” in the same format the Compact Muon Solenoid (CMS) experiment team analyzes at CERN, which take significant computer processing power to work with. There are also “derived datasets” that are simpler and ready for analysis by high school or university students, according to the press release.

This open-door data policy has also led to research partnerships with scientists outside of CERN. The CMS Collaboration researchers can only do so much work with their time, and the experiment data contains plenty of information that falls outside their own interests, which leaves room for other scientists to make use of it. The collaboration has already assisted some MIT researchers in using the data to study the substructure of jets (“showers of hadron clusters recorded in the CMS detector”). Its members see the full data release, and the alternative types of research it allows, as a benefit to the scientific community overall, with the CMS Collaboration’s Salvatore Rappoccio saying, “As scientists, we should take the release of data from publicly funded research very seriously. In addition to showing good stewardship of the funding we have received, it also provides a scientific benefit to our field as a whole.”

(via The Verge, image via Steffen Georg Weber/Anton Andronic/CERN)

Author
Dan Van Winkle
Dan Van Winkle (he) is an editor and manager who has been working in digital media since 2013, first at now-defunct Geekosystem (RIP), and then at The Mary Sue starting in 2014, specializing in gaming, science, and technology. Outside of his professional experience, he has been active in video game modding and development as a hobby for many years. He lives in North Carolina with Lisa Brown (his wife) and Liz Lemon (their dog), both of whom are the best, and you will regret challenging him at Smash Bros.