What happened to Healthcare.gov?

Also, 3 tips to make sure it doesn't happen to your site

By Mike Miliard , Executive Editor | October 11, 2013 | 12:02 PM

It's been almost two weeks since Obamacare's federal insurance exchange website went live, was inundated with traffic, went weird, was taken down for maintenance, then came back online still filled with glitches. Why did such a crucial site fail at such a critical moment? And what are the lessons that can be learned?

When President Barack Obama rode to reelection in 2012, his campaign was widely praised for the preternatural talents of its IT team, for the innovative data platform (codename: Narwhal) that helped pinpoint and make the most of every last potential vote.

So why, after making it the centerpiece of his presidency and spending untold time and effort defending it from judicial and legislative challenge, has the administration seen the main website for Obama's signature achievement fail so badly at its moment of truth?

When even the usually-anodyne USA Today is calling the site an "inexcusable mess," there are big problems.

In a laudatory profile in The Atlantic last year, the Obama tech team was described as an "elite" squad of "senior" talent (which for techies meant that "most of them were in their 30s") who came from Silicon Valley blue chips such as Google, Twitter and Facebook. They were called a "dream team of engineers" who "built the software that drove Barack Obama's reelection."

Of course, it was a different bunch of folks who set up the Obamacare website. The chief contract went to CGI Group, a Canadian firm that help set up that country's single-payer sites. Columbia, Md.-based Quality Software Services was also enlisted by the Centers for Medicare & Medicaid Services to build data infrastructure, as were several other contractors.

Recognizing that the federal bureaucratic procurement process operates differently than a lean and creative campaign technology team, it's still worth asking how an ostensibly forward-thinking and IT-savvy administration could get the technology so right in November 2012 and so catastrophically wrong in October 2013. After all, couldn't they see what a big deal this would be? Why weren't they ready?

Well, technology can be complicated. And unpredictable, even with the best planning. As that Atlantic profile put it, even as the 2012 campaign was in its home some stretch, the much-vaunted Obama techies realized their best-laid plans could go astray.

"I know we had the best technology team I've ever worked with, but we didn't know if it would work," the campaign's chief technology officer Harper Reed told the magazine. "I was incredibly confident it would work. … We had time. We had resources. We had done what we thought would work, and it still could have broken. Something could have happened."

The Healthcare.gov debacle has shown that even with time and resources – the Supreme Court upheld Obamacare back in June of 2012; all told, CMS is on the hook for nearly $400 million for the website so far – things do indeed sometimes happen. On Oct. 1 something sure did. What was supposed to be the dawning of a new era turned into a nightmare for the Obama administration as the site wilted under the pressure.

And so it was that Jon Stewart threw down the gauntlet to HHS Secretary Kathleen Sebelius on The Daily Show this week: "We're going to do a challenge: I'm going to try and download every movie ever made, and you are going to try to sign up for Obamacare – and we'll see which happens first."

The website did not recognize some users; for others, it prevented them from setting up accounts; some were asked to confirm email addresses, but couldn't because of faulty links. Within a couple days, the entire site was taken down for maintenance. Even now that it's back online, it's having problems.

Indeed, the issues have been irksome – and potentially damaging to the larger goal of Obamacare. How many of those reported 9 million visitors from 36 states simply decided not to get coverage – decided that it wasn't worth the headaches – never to return?

"The biggest thing we've learned is that, in retail – which, granted, isn't the same case as here – but in retail, if you don't get the user experience right the first time, consumers will go somewhere else," Tom Lounibos, chief executive officer of Mountain View, Calif.-based SOASTA, which specializes in testing high-traffic websites before go-live, tells Healthcare IT News. "Those brands lose revenue if it's not a quality user experience."

So what happened? It wasn't the underreported fact that the website was initially built in a garage (presumably not from duct tape, plywood and fishing line). No, the failure appears to have come from any number of technical causes, coupled with an unfortunate lack of foresight.

As Government Health IT's Anthony Brino showed earlier this week, a crowdsourced group of computer geeks on Reddit have been doing some amateur sleuthing, unearthing bad bits of front-end coding, snafus with Javascript and CSS files and HTML tags, etc.

Experts interviewed by the Wall Street Journal found a "glut of stray software code that served no purpose they could identify."

But as was argued in Slate this earlier this week, the site's "biggest problems are most likely not in the front-end code of the site's Web pages, but in the back-end, server-side code that handles – or doesn't handle – the registration process, which no one can see."

As Reuters put it: "One possible cause of the problems is that hitting 'apply' on HealthCare.gov causes 92 separate files, plug-ins and other mammoth swarms of data to stream between the user's computer and the servers powering the government website."

Lounibos inclined to agree. "In this case, I believe there's a back office where they have to go and verify people's names as individuals," he says. "Sometimes testing is done, but they don't test across all environments, and all components that are servicing a particular website. Components might have been tested, but they might not have brought it all together into a single environment: the back-end and the front-end. It seems like, from what I'm reading, that a lot of the problems have been in the identity checking area."

SOASTA specializes in scale and performance testing to nip problems with websites in the bud before launch. Able to simulate the simultaneous clicks of many millions of visitors, it's been enlisted by the websites of the 2012 London and upcoming Sochi Olympics, NASA's Mars Curiosity rover program and "pretty much all the major e-commerce brands these around the world, companies that do billions of dollars every year through their websites," says Lounibos.

He too is baffled by the Healthcare.gov site's many failings so far.

"Intuitively, for all of us consumers, it does seem like a logic test, and everybody's flunking," he says.

'We can do better'
Even the unceasingly energetic and optimistic Todd Park – known to most healthcare IT observers as the cofounder of athenahealth and the former chief technology officer of the Department of Health and Human Services – sounded frustrated when he spoke to the New York Times earlier this week.

These days, Park is CTO of the United States. The site, he said, was a victim of its own success. "At lower volumes, it would work fine," Park told the Times. "At higher volumes, it has problems."

"Right now," he said, "we've got what we think we need. The contractors have sent reinforcements. They are working 24-7. We just wish there was more time in a day."

The newspaper also spoke to Park's predecessor as CTO, Aneesh Chopra. He was the first man to hold the position Obama established in 2009.

"This is par for the course for large-scale IT projects," Chopra told the Times. "We wish we could launch bug-free, but in reality that's not that easy to do. The reality is that if you have a product that people want, people will tolerate glitches because they expect them."

Still, it seems to defy belief that such a hugely important and anticipated site could be rolled out so sloppily. Its failure has chagrinned Health and Human Services officials, who have rushed to make big improvements to its site design and infrastructure.

"We can do better and we are working around the clock to do so," HHS press secretary Joanne Peters told the Wall Street Journal.

This all could have been avoided, Lounibos tells Healthcare IT News.

"There's really no excuse," he says. "They've had enough time, from what I can tell, to fully test. Our retail and e-commerce customers continuously test – not waiting until the end to test but literally testing every day, all the time, new code that's being generated. It doesn't appear that was that was done here.

"It's amazing how folks miss the best practices," he says. "And a lot of it – easy even for those of us who are not engineers to understand – is the issue of speed versus quality. Developers that are building websites, even if they have a long run time to get to the launch date, are rapidly developing and adding features and functions. Every day they're trying to reach a milestone – in this case a roll-out to the live audience – and they're up against a deadline and they have a lot of things to do and usually the last thing that's done is testing. So quality goes out the door and speed takes the advantage."

And sure, a lot of people came to the site at once. But shouldn't that have been expected? At any rate, that's something that could and should have been tested for, too.

"No one really can predict how many users will come into a site, what we call 'load,' but the tools are available to literally recreate millions of concurrent users hitting a site. So there's really no excuse for that," says Lounibos.

"There's clearly a break-down here. It might be at the design level, it could be at the simple testing element, where they tested it one day, did some more code build and then didn't test those changes. It could be that they built the application and then never tested it on the infrastructure, the servers that they were going to use, and it might have been a configuration setting that might have broken."

Certainly, the average website for a hospital or a medical practice will never experience as much traffic as Healthcare.gov. But there are some lessons they can take away from this, best practices to keep in mind when launching new sites, says Lounibos.

"There's three of them," he says. "First, do continuous testing. Test through the development cycle. Test the first day, and test everyday.

"Second, test across environments. Make sure you're testing the back end and the front end together, and not separately. A lot of the problems occur at the connection points – the cartilage to the bone, so to speak."

And finally, "Make sure you do load testing all the time," he says. "It doesn't have to be millions of users. A lot of problems will occur with 100 users on a site. You just have to do some sort of load testing and you have to do it consistently."

In general, "just recognize that you've always got to take a quality first initiative, as opposed to speed," says Lounibos. "And ensure that the user experience – especially when you're trying to enroll people, as you are here – is easy and simple and fast. The user experience is the whole gig. You want to make that experience as painless as possible."

Topic:

Network Infrastructure,

Privacy & Security,

Government & Policy

What happened to Healthcare.gov?

More Regional News

Digital Maturity Forum

WHX Tech