Cloud or on-premises? For Dropbox, the answer is yes
A little over a year ago, Dropbox completed one of the largest reverse cloud migrations ever undertaken, moving some 600 petabytes of data from Amazon’s cloud to its own data centers. It was a moonshot-caliber undertaking that’s paying dividends for Dropbox in faster performance and lower costs. But it’s not an either-or proposition: The company is going back to Amazon to provide major new cloud services in Europe. When it comes to hybrid cloud strategy, Dropbox has learned that striking the right balance is key.
Why leave the cloud for on-prem?
Dropbox, a popular provider of cloud-based document storage and collaboration services, developed a fast-growing business by relying on Amazon S3 (Simple Storage Service) to house data, while keeping metadata on premises. That hybrid architecture worked well for a time. “It seemed we had a competitive advantage running the metadata ourselves, but not the storage, since there was a huge volume of data,” says James Cowling, principal engineer for Dropbox. “We could do the things we did best and outsource the rest.”
But over time, that assessment changed. Despite smooth sailing, company leaders saw ominous signs on the horizon. Over-dependence on Amazon risked leaving Dropbox with an unfeasibly high cost structure. Meanwhile, Amazon was poised to offer services similar to those of Dropbox in Amazon WorkDocs.
Taking over storage could potentially lower costs. More important, making the move would give Dropbox greater control over the data and the ability to offer improved storage services to its customers—a potentially critical competitive edge. One of these services turned out to be Project Infinite, which relies on the Dropbox infrastructure to serve up an unlimited amount of data to desktop users.
Dropbox executives weighed the options and decided that moving data storage in-house was the best choice, despite daunting challenges that included improving on the S3 storage architecture, moving vast quantities of data in a short amount of time, and running large numbers of data storage devices economically and reliably.
“It’s not easy to do something better than Amazon. They do an incredible job. But as we got bigger, storage seemed to be an area where we might have some advantage,” Cowling explains.
Although most companies will find that public cloud services suffice for most of their needs, Dropbox may not be alone in seeking an in-house infrastructure edge, according to Robert Mahowald, IDC group vice president for applications and cloud: “We are beginning to see a time when for many businesses, infrastructure all of a sudden has tremendous value, if you need very high performance, high I/O, seek/access speed for disk, or optimized hardware.”
Although it built its own data storage in the U.S., Dropbox found that in Europe, the Amazon Web Services (AWS) cloud is still the best fit. In a service rolled out in September 2016, Dropbox is partnering once again with Amazon to store data in Germany for European business customers that request it.
In a 2016 interview, Dropbox CEO Drew Houston said, “We have always had a hybrid infrastructure and we have flexibility to dial it up or down in one direction or the other.”
For most companies, the cloud vs. on-premises decision is one of ongoing adjustment and balance. In the November 2016 report “Justify Your Hybrid Cloud Future With a Solid Business Case,” Forrester Research wrote, “Moving to the cloud isn’t an all-or-nothing proposition. That is, the public cloud isn’t always the ultimate destination. Even when it is, customers may feel the need for intermediary steps such as virtual private clouds or other models until their comfort and confidence grow.”
New architecture: Hardware and software
To take back storage, Dropbox needed a new approach to hardware and software, as well as sophisticated project management processes to guide the initiative from start to completion without interrupting Dropbox operations.
“It was like changing the engines on a jet while the plane is in flight, without the passengers noticing,” said Houston in the 2016 interview. (As part of the agreement, Hewlett Packard Enterprise signed up as a Dropbox Enterprise customer and as a reseller partner.) Houston and other Dropbox officials say the move has been worthwhile, but as a privately held company, Dropbox declines to quantify the efficiency gains and cost savings.
The software, known as Magic Pocket, controls the disks, handling scheduling and buffering. It was written in the Rust language from Mozilla with the goal of taking up as little memory space as possible. “It achieves a step function increase in storage efficiency and cost performance,” says Cowling.
An important lesson to learn from Magic Pocket, according to IDC’s Mahowald, is that it is safe to rely on open source software for highly strategic initiatives. “I think vendors can learn to trust open source projects like Mozilla Rust,” he says.
On the hardware side, Dropbox collaborated with HPE on Project Diskotech to custom-configure servers with more than 100 disks in a single chassis, holding more than one petabyte of data.
On the personnel side, Dropbox created dedicated teams for software, networking, and hardware, adding dozens of new hires. It was a big change from the days when only a handful of staffers were responsible for storage. Talented engineers were brought on board and then given responsibility for technical decisions. “That’s how you get the best people,” says Cowling. “There’s a gulf between setting up a RAID array and setting up a geographically distributed storage system. It’s a non-trivial undertaking.”
Lesson: Hybrid design is challenging
Although Dropbox’s business model is unique, any company contemplating a hybrid cloud architecture might face some of the same issues Dropbox did, such as moving data between on-premises data centers and third-party cloud services. Cowling counsels consistency. “If you want to deploy software, it should be the same on your own hardware as on Amazon, Google, or Microsoft’s cloud services. That will give you the flexibility to move information in and out as needed, without being locked into a particular cloud provider.”
At the same time, it’s important not to become a prisoner of your own data center. “You don’t want to be locked into your own infrastructure,” Cowling adds. “You want to take advantage of a new service someone has built; it does not make sense to build it yourself.”
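One way to read Cowling's advice is as a thin storage interface that application code targets, with interchangeable backends behind it. The sketch below is illustrative only and is not how Dropbox built its systems: `BlobStore`, `InMemoryStore`, `DirectoryStore`, and `migrate` are hypothetical names invented for this example. The in-memory class stands in for a third-party cloud object store and the directory-backed class stands in for on-premises storage.

```python
from pathlib import Path
from typing import Iterable, Protocol


class BlobStore(Protocol):
    """Hypothetical minimal interface that every backend implements."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryStore:
    """Stand-in for a third-party cloud object store (assumption)."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> Iterable[str]:
        return list(self._blobs)


class DirectoryStore:
    """Stand-in for on-premises storage: one file per key.

    Keys must be flat file names (no path separators) in this sketch.
    """

    def __init__(self, root: Path) -> None:
        self._root = root
        root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()

    def keys(self) -> Iterable[str]:
        return sorted(p.name for p in self._root.iterdir())


def migrate(source: BlobStore, target: BlobStore) -> int:
    """Copy every blob from source to target, verifying each one.

    Returns the number of blobs copied. Works in either direction
    because both sides expose the same interface.
    """
    copied = 0
    for key in source.keys():
        data = source.get(key)
        target.put(key, data)
        if target.get(key) != data:
            raise IOError(f"verification failed for {key!r}")
        copied += 1
    return copied
```

Because `migrate` depends only on the shared interface, the same call moves data out of the cloud stand-in or back into it, which is the flexibility the quote describes.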
Dropbox also learned that it would have to develop new competencies in order for the project to succeed. Site reliability engineering was a case in point. Using very large numbers of disk drives meant dealing with far more disk failures than Dropbox engineers were accustomed to handling. Manual processes were holding them back.
“We handled things manually for some time,” says Cowling. “But at some point, you don’t have enough people to handle all the failures.” In response, Dropbox hired experts to build an automated maintenance system. “Now when machines are taken out of service, it’s all automated,” Cowling says. “Things got smoother once we built mature systems. I wish we had done it earlier.”
Lesson: Expect the unexpected
Any project of such magnitude is likely to uncover some hidden “gotchas.” Faced with the need to bring up as many as 40 racks of hardware per day, Dropbox took delivery of a large number of servers. In the process, the Dropbox team learned just how many servers will fit on a loading dock. Hint: It’s not enough. “It got to the point where we could not get the hardware off the loading dock and into the data centers in time,” Cowling recalls. To make matters worse, two delivery trucks crashed in one week, taking a number of servers down with them.
Moving the data from S3 to the Dropbox data centers required such speed that it was called Project BASE Jump. “We had to move half a terabyte of data per second at peak, and multiple per day,” Cowling explains. “There was very little room for error.” At one point, a cluster router in the data center choked on the traffic. “We basically gave ourselves a DoS [denial of service] attack,” he adds.
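To put the quoted peak rate in perspective, here is a back-of-the-envelope calculation using only two figures from this article: roughly 600 petabytes in total and half a terabyte per second at peak. It assumes decimal units (1 PB = 1,000 TB) and a peak rate sustained around the clock, so the result is a lower bound on transfer time, not an estimate of how long the migration actually took.

```python
# Back-of-the-envelope transfer arithmetic.
# Assumptions: decimal units (1 PB = 1,000 TB) and a constant rate.

TOTAL_PB = 600            # "some 600 petabytes", per the article
PEAK_TB_PER_S = 0.5       # "half a terabyte of data per second at peak"
SECONDS_PER_DAY = 24 * 60 * 60
TB_PER_PB = 1000


def pb_per_day(tb_per_second: float) -> float:
    """Petabytes moved in one day at a constant rate."""
    return tb_per_second * SECONDS_PER_DAY / TB_PER_PB


def days_to_move(total_pb: float, tb_per_second: float) -> float:
    """Days needed to move total_pb at a constant rate."""
    return total_pb / pb_per_day(tb_per_second)


peak_daily_pb = pb_per_day(PEAK_TB_PER_S)             # about 43.2 PB per day
floor_days = days_to_move(TOTAL_PB, PEAK_TB_PER_S)    # about 13.9 days

print(f"At peak: {peak_daily_pb:.1f} PB/day")
print(f"Lower bound for {TOTAL_PB} PB: {floor_days:.1f} days")
```

Even at the peak rate held continuously, the full data set would need about two weeks on the wire, which helps explain why a single choked router left so little room for error.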
What’s next: Storage enhancements
The moonshot having been completed, Dropbox is not standing still. The company has partnered with HPE to implement several new storage technologies:
- Shingled Magnetic Recording (SMR). This technology stores data with greater density on hard disk drives by writing new tracks that overlap part of the previously written track—a bit like shingles on a roof.
- Cold storage. Data that’s infrequently accessed, such as for regulatory compliance or backup, is stored on slower, less expensive media.
- Flash storage. Although the bulk of data is stored on spinning media, which is still much cheaper than solid-state storage, solid-state drives are used for cache storage. Likewise, metadata is kept on flash storage to improve performance.
- Data resiliency. Each user’s data is stored in at least two distinct geographic locations. In addition, erasure coding algorithms, which minimize network overhead compared with RAID, are used to enable the reconstruction of missing data.
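The reconstruction idea behind erasure coding can be shown with the simplest possible code: split the data into k fragments and add one XOR parity fragment, so that any single lost fragment can be rebuilt from the survivors. This is a minimal sketch for illustration only. The article does not say which codes Dropbox uses, and production systems typically rely on stronger codes that tolerate several simultaneous losses. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    if k < 1:
        raise ValueError("k must be at least 1")
    frag_len = max(1, -(-len(data) // k))          # ceiling division
    padded = data.ljust(frag_len * k, b"\0")       # zero-pad the tail
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def reconstruct(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        # XOR of all k+1 fragments is zero, so the XOR of the
        # survivors equals the missing fragment.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments: List[Optional[bytes]], original_length: int) -> bytes:
    """Recover the original bytes, repairing one lost fragment if needed."""
    repaired = reconstruct(fragments)
    return b"".join(repaired[:-1])[:original_length]
```

With k = 4, this scheme stores 5 fragments for every 4 fragments of data, an overhead of 1.25x, compared with 2x for keeping a second full copy. The trade-off is that repairing a lost fragment requires reading all of the surviving ones.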
What’s next: Cloud and on-prem
While cloud-based services are hard to beat for most mainstream purposes, there will always be companies like Dropbox that must go it alone to gain competitive advantage. “It has been well documented that cloud-scale economics benefit almost all companies,” says IDC’s Mahowald. “Almost no companies can do this as well as dedicated mega-scale providers. This is not drinking the Kool-Aid; this is reality. But this is Dropbox’s province because delivering their services is their core intellectual property. Doing it less than optimally means they will fail.”
Although Dropbox’s new European storage services rely on AWS, the company is keeping its options open for the future. “Currently we do not have any of our own data centers in Europe, but we haven’t ruled anything out,” said the company in a statement. “We’re continuing to invest in our own infrastructure as well as partner with Amazon where it makes sense for our users, particularly globally.”
The lesson Dropbox has learned, and one that applies to many IT leaders as they weigh on-premises versus cloud deployments, is that either might be right, depending on the data being stored and the needs of the users who access it.
In the September 2016 report “Q&A: Understanding Private Clouds That Failed,” Forrester wrote, “It’s hard to get private cloud right, but those that have done it find that it’s powered a customized internal innovation center. Private cloud has transformed the way they interact with their businesses and has moved their organizations into the age of the customer.”
Although that statement applies to Dropbox, Cowling says he remains mindful that when it comes to hybrid cloud, balance is key. “It was hard to build Magic Pocket, but it worked out in our favor. It’s important to recognize your strengths and use the cloud where it makes sense.”
On-prem vs. cloud storage: Lessons for leaders
- If you are considering building a private cloud, make sure it’s strategically critical to your corporate mission.
- To move data between an on-premises data center and a public cloud service, use the same software in both places.
- Even after building your own infrastructure, keep in mind that a public cloud service might offer advantages down the road. Embrace standards that will enable you to move applications and data to the cloud should the need arise.
- If your private cloud project is a big one, and it probably will be, strange things can happen. Be prepared to deal with the unexpected.
This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.