NYU ACCIDENTALLY EXPOSED MILITARY CODE-BREAKING COMPUTER PROJECT TO ENTIRE INTERNET
by Sam Biddle
IN EARLY DECEMBER 2016, Adam was doing what he’s always doing, somewhere between hobby and profession: looking for things that are on the internet that shouldn’t be. That week, he came across a server inside New York University’s famed Institute for Mathematics and Advanced Supercomputing, headed by the brilliant Chudnovsky brothers, David and Gregory. The server appeared to be an internet-connected backup drive. But instead of being filled with family photos and spreadsheets, this drive held confidential information on an advanced code-breaking machine that had never before been described in public. Dozens of documents spanning hundreds of pages detailed the project, a joint supercomputing initiative administered by NYU, the Department of Defense, and IBM. And they were available for the entire world to download.
The supercomputer described in the trove, “WindsorGreen,” was a system designed to excel at the sort of complex mathematics that underlies encryption, the technology that keeps data private, and almost certainly intended for use by the Defense Department’s signals intelligence wing, the National Security Agency. WindsorGreen was the successor to another password-cracking machine used by the NSA, “WindsorBlue,” which was also documented in the material leaked from NYU and which had been previously described in the Norwegian press thanks to a document provided by National Security Agency whistleblower Edward Snowden. Both systems were intended for use by the Pentagon and a select few other Western governments, including Canada and Norway.
Adam, an American digital security researcher, requested that his real name not be published out of fear of losing his day job. Although he deals constantly with digital carelessness, Adam was nonetheless stunned by what NYU had made available to the world. “The fact that this software, these spec sheets, and all the manuals to go with it were sitting out in the open for anyone to copy is just simply mind blowing,” he said.
He described to The Intercept how easy it would have been for someone to obtain the material, which was marked with warnings like “DISTRIBUTION LIMITED TO U.S. GOVERNMENT AGENCIES ONLY,” “REQUESTS FOR THIS DOCUMENT MUST BE REFERRED TO AND APPROVED BY THE DOD,” and “IBM Confidential.” At the time of his discovery, Adam wrote to me in an email:
All of this leaky data is courtesy of what I can only assume are misconfigurations in the IMAS (Institute for Mathematics and Advanced Supercomputing) department at NYU. Not even a single username or password separates these files from the public internet right now. It’s absolute insanity.
The files were taken down after Adam notified NYU.
Intelligence agencies like the NSA hide code-breaking advances like WindsorGreen because their disclosure might accelerate what has become a cryptographic arms race. Encrypting information on a computer used to be a dark art shared between militaries and mathematicians. But advances in cryptography, and rapidly swelling interest in privacy in the wake of Snowden, have helped make encryption tech an effortless, everyday commodity for consumers. Web connections are increasingly shielded using the HTTPS protocol, end-to-end encryption has come to popular chat platforms like WhatsApp, and secure phone calls can now be enabled simply by downloading some software to your device. The average person viewing their checking account online or chatting on iMessage might not realize the mathematical complexity that’s gone into making eavesdropping impractical.
The spread of encryption is a good thing, unless you’re the one trying to eavesdrop. Spy shops like the NSA can sometimes thwart encryption by going around it, finding flaws in the way programmers build their apps or taking advantage of improperly configured devices. When that fails, they may try to deduce encryption keys through extraordinarily complex math or repeated guessing. This is where specialized systems like WindsorGreen can give the NSA an edge, particularly when the agency’s targets aren’t aware of just how much code-breaking computing power they’re up against.
Adam declined to comment on the specifics of any conversations he might have had with the Department of Defense or IBM. He added that NYU, at the very least, expressed its gratitude to him for notifying it of the leak by mailing him a poster.
While he was trying to figure out who exactly the Windsor files belonged to and just how they’d wound up on a completely naked folder on the internet, Adam called David Chudnovsky, the world-renowned mathematician and IMAS co-director at NYU. Reaching Chudnovsky was a cinch, because his entire email outbox, including correspondence with active members of the U.S. military, was for some reason stored on the NYU drive and made publicly available alongside the Windsor documents. According to Adam, Chudnovsky confirmed his knowledge of and the university’s involvement in the supercomputing project; The Intercept was unable to reach Chudnovsky directly to confirm this. The school’s association is also strongly indicated by the fact that David’s brother Gregory, himself an eminent mathematician and professor at NYU, is listed as an author of a 164-page document from the cache describing the capabilities of WindsorGreen in great detail. Although the brothers clearly have ties to WindsorGreen, there is no indication they were responsible for the leak. Indeed, the identity of the person or persons responsible for putting a box filled with military secrets on the public internet remains utterly unclear.
An NYU spokesperson would not comment on the university’s relationship with the Department of Defense, IBM, or the Windsor programs in general. When The Intercept initially asked about WindsorGreen the spokesperson seemed unfamiliar with the project, saying they were “unable to find anything that meets your description.” This same spokesperson later added that “no NYU or NYU Tandon system was breached,” referring to the Tandon School of Engineering, which houses the IMAS. This statement is something of a non sequitur, since, according to Adam, the files leaked simply by being exposed to the open internet — none of the material was protected by a username, password, or firewall of any kind, so no “breach” would have been necessary. You can’t kick down a wide open door.
The documents, replete with intricate processor diagrams, lengthy mathematical proofs, and other exhaustive technical schematics, are dated from 2005 to 2012, when WindsorGreen appears to have been in development. Some documents are clearly marked as drafts, with notes that they were to be reviewed again in 2013. Project progress estimates suggest the computer wouldn’t have been ready for use until 2014 at the earliest. All of the documents appear to be proprietary to IBM and not classified by any government agency, although some are stamped with the aforementioned warnings restricting distribution to within the U.S. government. According to one WindsorGreen document, work on the project was restricted to American citizens, with some positions requiring a top-secret security clearance, which, as Adam explains, makes the NYU hard drive an even greater blunder:
Let’s, just for hypotheticals, say that China found the same exposed NYU lab server that I did and downloaded all the stuff I downloaded. That simple act alone, to a large degree, negates a humongous competitive advantage we thought the U.S. had over other countries when it comes to supercomputing.
The only tool Adam used to find the NYU trove was Shodan.io, a website that’s roughly equivalent to Google for internet-connected, and typically unsecured, computers and appliances around the world, famous for turning up everything from baby monitors to farming equipment. Shodan has plenty of constructive technical uses but also serves as a constant reminder that we really ought to stop plugging things into the internet that have no business being there.
The WindsorGreen documents are mostly inscrutable to anyone without a Ph.D. in a related field, but they make clear that the computer is the successor to WindsorBlue, a next generation of specialized IBM hardware that would excel at cracking encryption, whose known customers are the U.S. government and its partners.
Experts who reviewed the IBM documents said WindsorGreen possesses substantially greater computing power than WindsorBlue, making it particularly adept at compromising encryption and passwords. In an overview of WindsorGreen, the computer is described as a “redesign” centered around an improved version of its processor, known as an “application specific integrated circuit,” or ASIC, a type of chip built to do one task, like mining bitcoin, extremely well, as opposed to being relatively good at accomplishing the wide range of tasks that, say, a typical MacBook would handle. One of the upgrades was to switch the processor to smaller transistors, allowing more circuitry to be crammed into the same area, a change quantified by measuring the reduction in nanometers (nm) between certain chip features. The overview states:
The WindsorGreen ASIC is a second-generation redesign of the WindsorBlue ASIC that moves from 90 nm to 32 nm ASIC technology and incorporates performance enhancements based on our experience with WindsorBlue. We expect to achieve at least twice the performance of the WindsorBlue ASIC with half the area, reduced cost, and an objective of half the power. We also expect our system development cost to be only a small fraction of the WindsorBlue development cost because we carry forward intact much of the WindsorBlue infrastructure.
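For a rough sense of what the move from 90 nm to 32 nm could mean, the sketch below assumes idealized scaling, in which the area of a circuit shrinks with the square of the feature size. That assumption is ours, not IBM's, and real manufacturing processes rarely track it so neatly; the function name is hypothetical.

```python
def ideal_area_ratio(old_nm: float, new_nm: float) -> float:
    """Area of the same circuit after a shrink, relative to before.

    Assumes idealized scaling: every dimension shrinks by new_nm / old_nm,
    so area shrinks by the square of that factor. This is an illustrative
    assumption, not a figure taken from the WindsorGreen documents.
    """
    return (new_nm / old_nm) ** 2


ratio = ideal_area_ratio(90, 32)   # fraction of the original area
density_gain = 1 / ratio           # how many times more circuitry fits in the same area

print(f"same circuit in about {ratio:.1%} of the area")
print(f"about {density_gain:.1f}x the circuitry in the same area")
```

Under that idealized assumption, the same circuit would occupy a small fraction of its former footprint, which gives some intuition for how a redesign could target both a smaller chip and more performance at once.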
Çetin Kaya Koç is the director of the Koç Lab at the University of California, Santa Barbara, which conducts cryptographic research. Koç reviewed the Windsor documents and told The Intercept that he has “not seen anything like [WindsorGreen],” and that “it is beyond what is commercially or academically available.” He added that outside of computational biology applications like complex gene sequencing (which it’s probably safe to say the NSA is not involved in), the only other purpose for such a machine would be code-breaking: “Probably no other problem deserves this much attention to design an expensive computer like this.”
Andrew “Bunnie” Huang, a hacker and computer hardware researcher who reviewed the documents at The Intercept’s request, said that WindsorGreen would surpass many of the most powerful code-breaking systems in the world: “My guess is this thing, compared to the TOP500 supercomputers at the time (and probably even today) pretty much wipes the floor with them for anything crypto-related.” Conducting a “cursory inspection of power and performance metrics,” according to Huang, puts WindsorGreen “heads and shoulders above any publicly disclosed capability” on the TOP500, a global ranking of supercomputers. Like all computers that use specialized processors, or ASICs, WindsorGreen appears to be a niche computer that excels at one kind of task but performs miserably at anything else. Still, when it comes to crypto-breaking, Huang believes WindsorGreen would be “many orders of magnitude … ahead of the fastest machines I previously knew of.”
But even with expert analysis, no one beyond those who built the thing can be entirely certain of how exactly an agency like the NSA might use WindsorGreen. To get a better sense of why a spy agency would do business with IBM, and how WindsorGreen might evolve into WindsorOrange (or whatever the next generation may be called), it helps to look at documents provided by Snowden that show how WindsorBlue was viewed in the intelligence community. Internal memos from Government Communications Headquarters, the NSA’s British counterpart, show that the agency was interested in purchasing WindsorBlue as part of its High Performance Computing initiative, which sought to help with a major problem: People around the world were getting too good at keeping unwanted eyes out of their data.
Under the header “what is it, and why,” one 2012 HPC document explains, “Over the past 18 months, the Password Recovery Service has seen rapidly increasing volumes of encrypted traffic … the use of much greater range of encryption techniques by our targets, and improved sophistication of both the techniques themselves and the passwords targets are using (due to improved OPSec awareness).” Accordingly, GCHQ had begun to “investigate the acquisition of WINDSORBLUE … and, subject to project board approval, the procurement of the infrastructure required to host the a [sic] WINDSORBLUE system at Benhall,” where the organization is headquartered.
Among the Windsor documents on the NYU hard drive was an illustration of an IBM computer codenamed “Cyclops” (above), which appears to be a WindsorBlue/WindsorGreen predecessor. A GCHQ document provided by Snowden (below) describes Cyclops as an “NSA/IBM joint development.”
In April 2014, Norway’s Dagbladet newspaper reported that the Norwegian Intelligence Service had purchased a cryptographic computer system code-named STEELWINTER, based on WindsorBlue, as part of a $100 million overhaul of the agency’s intelligence-processing capabilities. The report was based on a document provided by Snowden:
The document does not say when the computer will be delivered, but in addition to the actual purchase, NIS has entered into a partnership with NSA to develop software for decryption. Some of the most interesting data NIS collects are encrypted, and the extensive processes for decryption require huge amounts of computing power.
Widespread modern encryption methods like RSA, named for the initials of the cryptographers who developed it, rely on enormously large numbers derived from prime numbers. Speaking very roughly, so long as those original prime numbers remain secret, the encoded data will remain safe. But were someone able to factor the enormous number (a process identical to the sort of math exercise children are taught to do on a chalkboard, but on a massive scale), they would be able to decode the data on their own. Luckily for those using encryption, the numbers in question are so long that they can only be factored down to their prime numbers with an extremely large amount of computing power. Unluckily for those using encryption, government agencies in the U.S., Norway, and around the globe are keenly interested in computers designed to excel at exactly this purpose.
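To make the idea concrete, here is a toy sketch of RSA with absurdly small primes, followed by the chalkboard-style factoring that undoes it. The primes, the message, and the function names are all illustrative assumptions; real keys use primes hundreds of digits long, and nothing here reflects how WindsorGreen actually works.

```python
from math import gcd


def toy_rsa_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two secret primes (illustrative only)."""
    n = p * q                    # public modulus: the number derived from the primes
    phi = (p - 1) * (q - 1)      # easy to compute only if p and q are known
    assert gcd(e, phi) == 1, "public exponent must be coprime to phi"
    d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)
    return (n, e), d


def factor_by_trial_division(n: int):
    """Recover the two prime factors of n by trying divisors one at a time."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("n has no nontrivial factors")


# The key holder picks two secret primes and publishes only (n, e).
(n, e), d = toy_rsa_keypair(1009, 2003)
message = 424242
ciphertext = pow(message, e, n)

# An eavesdropper sees n, e, and the ciphertext, then factors n to rebuild the key.
p, q = factor_by_trial_division(n)
recovered_d = pow(e, -1, (p - 1) * (q - 1))
recovered = pow(ciphertext, recovered_d, n)

print(recovered == message)
```

With a seven-digit modulus the factoring loop finishes instantly. The same loop is hopeless against real key sizes, which is why serious factoring relies on far more sophisticated algorithms and vast computing power.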
Given the billions of signals intelligence records collected by Western intelligence agencies every day, enormous computing power is required to sift through this data and crack what can be broken so that it can be further analyzed, whether through the factoring method mentioned above or via what’s known as a “brute force” attack, wherein a computer essentially guesses possible keys at a tremendous rate until one works. The NIS commented only to Dagbladet that the agency “handles large amounts of data and needs a relatively high computing power.” Details about how exactly such “high computing power” is achieved are typically held very close — finding hundreds of pages of documentation on a U.S. military code-breaking box, completely unguarded, is virtually unheard of.
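A brute-force attack is simple enough to sketch in a few lines. The example below assumes a hypothetical target: a short, lowercase password whose SHA-256 hash the attacker has obtained. The password, the four-character limit, and the function name are stand-ins chosen for illustration, not details from the Windsor documents.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def brute_force_password(target_hash: str, max_length: int = 4):
    """Try every lowercase password up to max_length characters.

    Returns (password, guesses_made), or (None, guesses_made) if nothing matched.
    """
    guesses = 0
    for length in range(1, max_length + 1):
        for candidate in product(ascii_lowercase, repeat=length):
            guesses += 1
            word = "".join(candidate)
            if hashlib.sha256(word.encode()).hexdigest() == target_hash:
                return word, guesses
    return None, guesses


# Hypothetical target: the attacker knows only the hash, not the password.
target = hashlib.sha256(b"spy").hexdigest()
found, attempts = brute_force_password(target)

print(found, attempts)
```

Every extra character multiplies the number of candidates by 26 in this sketch, and by far more once digits, capitals, and symbols are allowed. Purpose-built hardware raises the rate of guessing, but it cannot change that multiplication.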
A very important question remains: What exactly could WindsorBlue, and then WindsorGreen, crack? Are modern privacy mainstays like PGP, used to encrypt email, or the ciphers behind encrypted chat apps like Signal under threat? The experts who spoke to The Intercept don’t think there’s any reason to assume the worst.
“As long as you use long keys and recent-generation hashes, you should be OK,” said Huang. “Even if [WindsorGreen] gave a 100x advantage in cracking strength, it’s a pittance compared to the additional strength conferred by going from say, 1024-bit RSA to 4096-bit RSA or going from SHA-1 to SHA-256.”
Translation: Older encryption methods based on shorter strings of numbers, which are easier to factor, would be more vulnerable, but anyone using the strongest contemporary encryption software (which uses much longer numbers) should still be safe and confident in their privacy.
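Huang’s comparison can be checked with back-of-the-envelope arithmetic. The sketch below uses the standard heuristic cost formula for the general number field sieve, the best publicly known method for factoring large numbers, and ignores its lower-order terms. That makes the results rough estimates only, and the function name is our own.

```python
from math import log, log2


def gnfs_work_bits(modulus_bits: int) -> float:
    """Rough factoring cost for an RSA modulus, expressed as log2(operations).

    Uses the heuristic general number field sieve estimate
        exp((64/9)^(1/3) * (ln n)^(1/3) * (ln ln n)^(2/3))
    with the o(1) term dropped, so treat the output as an order-of-magnitude guide.
    """
    ln_n = modulus_bits * log(2)
    c = (64 / 9) ** (1 / 3)
    ln_cost = c * ln_n ** (1 / 3) * log(ln_n) ** (2 / 3)
    return ln_cost / log(2)


speedup_bits = log2(100)            # a 100x faster machine, measured in bits of work
bits_1024 = gnfs_work_bits(1024)
bits_4096 = gnfs_work_bits(4096)
gap_bits = bits_4096 - bits_1024    # extra work bought by the longer key

print(f"100x speedup is worth about {speedup_bits:.1f} bits")
print(f"1024-bit RSA: about {bits_1024:.0f} bits of work")
print(f"4096-bit RSA: about {bits_4096:.0f} bits of work")
print(f"gap between them: about {gap_bits:.0f} bits")
```

Each additional bit doubles the work. A 100-fold hardware advantage is worth fewer than seven bits, while the estimated gap between the two key sizes runs to dozens.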
Still, “there are certainly classes of algorithms that got, wildly guessing, about 100x weaker from a brute force standpoint,” according to Huang, so “this computer’s greatest operational benefit would have come from a combination of algorithmic weakness and brute force. For example, SHA-1, which today is well-known to be too weak, but around the time of 2013 when this computer might have come online, it would have been pretty valuable to be able to ‘routinely’ collide SHA-1 as SHA-1 was still very popular and widely used.”
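To “collide” a hash function means finding two different inputs that produce the same output. The sketch below shows a generic brute-force collision search against a deliberately weakened target: SHA-256 cut down to its first three bytes. Both the truncation and the function name are assumptions made so the example runs in a moment. It is not the specialized attack on SHA-1, which pairs mathematical shortcuts with enormous computing power.

```python
import hashlib


def find_truncated_collision(prefix_bytes: int = 3):
    """Find two different inputs whose SHA-256 digests share their first bytes.

    Remembers every truncated digest seen so far and stops at the first repeat.
    Returns (first_input, second_input, hashes_computed).
    """
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        tag = hashlib.sha256(message).digest()[:prefix_bytes]
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


first, second, hashes_computed = find_truncated_collision()

print(first, second, hashes_computed)
```

Because any two inputs may match, a collision typically turns up after roughly the square root of the number of possible outputs. That is why the length of a hash, along with any weakness in its design, matters so much more than raw hardware speed.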
A third expert in computer architecture and security, who requested anonymity due to the sensitivity of the documents and a concern for their future livelihood, told The Intercept that “most likely, the system is intended for brute-forcing password-protected data,” and that it “might also have applications for things like … breaking older/weaker (1024 bit) RSA keys.” Although there’s no explicit reference to a particular agency in the documents, this expert added, “I’m assuming NSA judging by the obvious use of the system.”
Huang and Koç both speculated that aside from breaking encryption, WindsorGreen could be used to fake the cryptographic signature used to mark software updates as authentic, so that a targeted computer could be tricked into believing a malicious software update was the real thing. For the NSA, getting a target to install software they shouldn’t be installing is about as great as intelligence-gathering gifts come.
The true silver bullet against encryption, a technology that doesn’t just threaten weaker forms of data protection but all available forms, will not be a computer like WindsorGreen, but something that doesn’t exist yet: a quantum computer. In 2014, the Washington Post reported on a Snowden document that revealed the NSA’s ongoing efforts to build a “quantum” computer processor that’s not confined to just ones and zeroes but can exist in multiple states at once, allowing for computing power incomparable to anything that exists today. Luckily for the privacy concerned, the world is still far from seeing a functional quantum computer. Luckily for the NSA and its partners, IBM is working hard on one right now.
Repeated requests for comment sent to over a dozen members of the IBM media relations team were not returned, nor was a request for comment sent to a Department of Defense spokesperson. The NSA declined to comment. GCHQ declined to comment beyond its standard response that all its work “is carried out in accordance with a strict legal and policy framework, which ensures that our activities are authorised, necessary and proportionate, and that there is rigorous oversight.”
Documents published with this story:
- IBM: Pages From WindsorGreen ASIC Status Report 12 07 2012 (from NYU files)
- GCHQ: Excerpt from Hpc Overview 1011 (from Snowden files)