
Sep 25, 2014

5 ways to prepare for skyrocketing data center storage needs

Ansley Kilgore

Data center storage requirements are changing quickly as a result of the increasing volumes of big data that must be stored, analyzed and transmitted. The digital universe is doubling in size every two years and will grow by a factor of 10 between 2013 and 2020, according to the recent EMC Digital Universe study. Clearly, storage needs are skyrocketing.

Fortunately (for all of us buyers out there), the cost per gigabyte of storage is falling rapidly, primarily because disk and solid-state drives continue to evolve to support higher areal densities. Alas, the volume of data being stored seems to be outpacing our ability to cram more magnetic bits (or flash cells) into each unit of surface area.

Storage costs are therefore likely to become a larger component of overall IT budgets in the coming years. Here are five things to consider when planning for your future storage needs.

1. High power density data centers
With increasing storage needs and greater sophistication of the storage devices in use, power requirements per square foot of data center space are rising rapidly. As a result, high power density design is a critical component of any modern data center. For example, if a standard rack holds around 42 1U servers and each of those servers draws 300W, the entire rack will require 12-13kW in a footprint as small as 25 square feet. Some data center cabinets can be packed even more densely; some blade server systems now support more than 10x the number of servers found in an average rack. This growing demand for higher power density is directly tied to the need for higher storage densities in data centers.
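
To put those numbers in context, here is a minimal back-of-the-envelope sketch, assuming the 42-server, 300W-per-server, 25-square-foot figures above:

```python
# Back-of-the-envelope rack power density, using the figures cited above.
servers_per_rack = 42        # 1U servers in a standard 42U rack
watts_per_server = 300       # average draw per server
rack_area_sqft = 25          # rack footprint including clearance

rack_kw = servers_per_rack * watts_per_server / 1000
watts_per_sqft = servers_per_rack * watts_per_server / rack_area_sqft

print(f"{rack_kw:.1f} kW per rack")          # 12.6 kW per rack
print(f"{watts_per_sqft:.0f} W per sq ft")   # 504 W per square foot
```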

2. Cost-efficient data center storage
Choosing an energy-efficient data center from the start can help control costs in the long run. Facilities designed for high density power can accommodate rising storage needs within a smaller space, so you can grow in place without having to invest in a larger footprint.

Allocating your storage budget across different tiers is another way to control costs. Audit your data to determine how it is used and how often particular files are accessed during a given period, then categorize the data into tiers so that each type of data is matched with the appropriate storage type. The most frequently accessed data warrants a more expensive storage option, while older, rarely accessed data can be housed in less expensive storage. Examples of storage types, from most to least expensive, include RAM, solid-state drives, spinning hard disk drives (SATA or SAS) and tape backup.
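
A minimal sketch of such a tiering policy might look like the following; the thresholds and tier names are illustrative assumptions, not a recommendation:

```python
from datetime import datetime, timedelta

# Illustrative tiering policy: map each file's last-access age to a storage tier.
# Thresholds and tier names are placeholders; tune them to your own data audit.
TIERS = [
    (timedelta(days=7), "ssd"),        # hot: touched within the last week
    (timedelta(days=90), "sata_hdd"),  # warm: touched within the last quarter
]
COLD_TIER = "tape"                     # everything older

def assign_tier(last_accessed, now=None):
    age = (now or datetime.utcnow()) - last_accessed
    for threshold, tier in TIERS:
        if age <= threshold:
            return tier
    return COLD_TIER

print(assign_tier(datetime.utcnow() - timedelta(days=2)))    # ssd
print(assign_tier(datetime.utcnow() - timedelta(days=400)))  # tape
```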

3. Scalability
Infrastructure should be designed with scalability in mind; otherwise, costs can become unmanageable and performance can suffer, possibly to the point of outages. Scalability allows you to grow your infrastructure at a pace that matches the growth in data, and also gives you the ability to scale back if needed. Distributed or “scale-out” architectures can provide an ideal foundation for multi-petabyte storage workloads because of their ability to quickly expand and contract according to compute, storage or networking needs. A hybrid infrastructure that connects different types of environments also lets customers migrate data between cloud and colocation; if an unexpected need for storage arises, they can shift spending between opex and capex as needed.

4. Security
Strict security or compliance requirements, particularly for companies in the healthcare or payment processing industries, can increase the complexity of data management and storage processes. For example, some data must be held in dedicated, third-party-audited environments and/or fully encrypted at rest and in transit.
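
As one illustration of encryption at rest, here is a minimal sketch using the Python cryptography library's Fernet primitive (symmetric, authenticated encryption); a real deployment would layer key management, rotation and audited access on top of this:

```python
from cryptography.fernet import Fernet

# Minimal encryption-at-rest sketch. In practice the key would live in a
# key management service, never alongside the data it protects.
key = Fernet.generate_key()
cipher = Fernet(key)

record = b"patient_id=38211;diagnosis=E11.9"   # illustrative sensitive payload
ciphertext = cipher.encrypt(record)            # what actually lands on disk

# Reading the record back requires the key; the ciphertext alone is useless.
assert cipher.decrypt(ciphertext) == record
```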

5. Backup and replication
When planning your infrastructure, make sure it supports backup and replication in addition to your application requirements. Online backup protects against unpredictable failures like natural disasters, while replication covers predictable hardware failures, such as those that occur during planned maintenance. Establishing adequate replication and backup requirements can more than double the storage needs for your application.
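
A rough sketch of that multiplier effect; the replica count and backup retention here are illustrative assumptions:

```python
def provisioned_storage_tb(primary_tb, replicas=2, backup_generations=1, backup_ratio=1.0):
    """Rough storage footprint: primary data, synchronous replicas, and backups.

    backup_ratio approximates how large each retained backup generation is
    relative to the primary data set (compression and incrementals shrink it).
    """
    live_copies = primary_tb * (1 + replicas)
    backups = primary_tb * backup_generations * backup_ratio
    return live_copies + backups

# 100 TB of application data with two replicas and one full backup
# already requires 400 TB of provisioned storage.
print(provisioned_storage_tb(100))
```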

Your data center storage needs will continue to increase over time as the digital universe keeps expanding at an exponential pace. Careful planning is required to create a cost-efficient, secure, reliable infrastructure that can keep up with the pace of data growth. Service providers can draw on their experience to help you find the right storage options for different storage needs.

Sep 17, 2014

Internap wins Stevie Award for big data solution at 2014 American Business Awards

Ansley Kilgore

Internap was honored to receive a bronze Stevie® Award in the New Product or Service of the Year – Software – Big Data Solution category at the 12th Annual American Business Awards on September 12, 2014. The award honors Internap and Aerospike for creating the industry’s first “fast big data” platform, which runs Aerospike’s hybrid NoSQL databases on Internap’s bare-metal servers.

The combined solution from Internap and Aerospike enables developers to quickly deploy applications that demand predictable, high performance in a cost-effective hosted environment. Big data workloads and other computationally intensive applications require higher levels of performance and throughput than traditional virtualized cloud infrastructure can provide.

Benchmark tests comparing similar virtual and bare-metal cloud configurations show Internap’s bare-metal cloud yields superior CPU, RAM, storage and internal network performance. In many cases, organizations require 8x fewer bare-metal servers than virtualized servers, resulting in decreased IT equipment cost, less power usage and a smaller data center footprint.

The eXelate use case
eXelate is the smart data company that powers smarter digital marketing decisions worldwide for marketers, agencies, platforms, publishers and data providers. eXelate’s platform provides accurate, actionable, and agile data and analytics on online household demographics, purchase intent and behavioral propensities.

Aerospike’s hybrid NoSQL database allows eXelate to use at least 12x fewer servers than in-memory database solutions while providing far more storage per node (740GB, as opposed to the 64GB available with in-memory solutions). This enables massive-volume, real-time data storage and continual reads and writes against the Aerospike cluster.

By running the Aerospike NoSQL databases on Internap bare-metal servers in four data centers around the world, eXelate is able to process 2 TB of data per day and ingest over 60 billion transactions per month for more than 200 publishers and marketers across multiple geographic regions.
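
For a sense of what this looks like at the application level, here is a minimal sketch of a profile write and read using the Aerospike Python client; the host, namespace, set and bin names are placeholder assumptions, not eXelate's actual schema:

```python
import aerospike

# Placeholder connection settings and record layout, purely for illustration.
config = {"hosts": [("127.0.0.1", 3000)]}
client = aerospike.client(config).connect()

key = ("audience", "profiles", "cookie-8c31f2")   # (namespace, set, user key)
client.put(key, {"segments": ["auto_intender", "sports"], "last_seen": 1410480000})

_, _, bins = client.get(key)                      # low-latency read back
print(bins["segments"])

client.close()
```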

Details about The American Business Awards and the lists of Stevie Award winners who were announced on September 12 are available at www.StevieAwards.com/ABA.

Learn more about the integrated fast big data solution from Internap and Aerospike.

Mar 18, 2014

Customer spotlight: eXelate powers smarter digital marketing decisions for advertisers

Ansley Kilgore

Watch Mark Zagorski, CEO, and Brent Keator, Director of Infrastructure for eXelate, discuss how Internap helps accelerate global data delivery and real-time application deployment of the eXelate platform.

eXelate is a data technology company that strives to make digital media more relevant for consumers and more effective for advertisers. The company requires a strong infrastructure backbone to ingest and manage trillions of data points on consumer behavior each month.

Internap’s top-notch technology, including bare-metal servers, provides a high-performance cloud solution that can grow and scale to accommodate eXelate’s global expansion. As a result, the eXelate platform enables marketers and data providers to make smarter digital marketing decisions.

Download the complete eXelate case study here.

Oct 22, 2013

Customer spotlight: Treato uses big data to reveal health insights

Ansley Kilgore

In today’s Internet-driven society, many of us choose to “ask Google” for information on countless topics – including questions about our health and prescription drugs. A staggering amount of data exists in online health communities where patients compare notes about their conditions, medications and treatments. While the Internet is no substitute for consulting with a medical professional, a startup company called Treato is using big data analytics to bridge the gap between patients, pharmaceutical companies and healthcare providers.

Treato, a big data startup based in Israel, aims to provide meaningful insights from the plethora of information in online health forums. By extracting, aggregating and analyzing data from blogs and other qualified health websites, Treato “creates the big picture of what people say about their medications and conditions.” The resulting analytics are available for free to consumers and as a brand intelligence service for pharmaceutical marketers.

Treato is experiencing rapid growth, and traffic on their website has already surpassed 100,000 visits per day. In addition to acquiring new site users, the amount of data from source content has expanded. To handle increased site traffic and content, Treato recently added a new Internap data center near Dallas, Texas.

Data centers are a critical aspect of Treato’s strategy to expand its world-class SaaS infrastructure. In addition to supporting Treato.com and the Treato Pharma applications, data centers process and store the content collected from online health blogs and forum posts. Advanced Natural Language Processing (NLP) analysis is used to extract relevant information from this data, ultimately providing insights that can influence future patient experiences.
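
As a toy illustration of that kind of extraction step (this is not Treato's actual pipeline, which relies on trained NLP models rather than a hand-built keyword list), a first pass at surfacing medication mentions from forum posts might look like:

```python
import re
from collections import Counter

# Hand-built keyword list, purely for illustration; a production pipeline
# would use trained NLP models and a full drug vocabulary.
MEDICATIONS = {"metformin", "lipitor", "ibuprofen", "prozac"}

def medication_mentions(posts):
    counts = Counter()
    for post in posts:
        for token in re.findall(r"[a-z]+", post.lower()):
            if token in MEDICATIONS:
                counts[token] += 1
    return counts

posts = ["Started Metformin last month; fewer side effects than I expected."]
print(medication_mentions(posts))   # Counter({'metformin': 1})
```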

The expanded footprint benefits Treato in three key ways:

  • High availability – Expanding its data center footprint allows Treato to reduce the risk of downtime for the site and backend processing capabilities. With a growing number of users relying on its analytics, resilient infrastructure helps ensure a positive online experience.

  • Increased capacity – Thanks to the expanded data center space, Treato has increased its capacity by 150%. With its new 200-terabyte Hadoop cluster, Treato can process 150% more patient conversations per day.

  • Scalability – As a result of this increased capacity, Treato can expand its sources of content even further, and expects to invest significantly in this area moving forward. More than a million patient conversations are added each day, perpetually expanding the knowledge base available to Treato visitors.

Using big data analytics to glean consumer insights is a rapidly growing business strategy that is still evolving today. By successfully applying this concept to real-time healthcare data, Treato is opening new doors for the advancement of healthcare. The use of data centers for expanding storage and processing capacity will be an essential factor in achieving analytics goals. While the Internet still can’t diagnose your ailment, it can work together with big data to create better, healthier lives.

Sep 10, 2013

How to make IaaS work for your big data needs

INAP

In our previous blog, we discussed two main classes of big data that we have observed in our customer base: “needle in the haystack” style data mining and mass-scale NoSQL-style “big” database applications. In this blog, I want to talk about the importance of choosing the right infrastructure services for your needle-in-the-haystack big data workloads.

The needle-in-the-haystack approach to big data involves searching for relationships and patterns within a static or steadily growing mountain of information, hoping to find insights that will help you make better business decisions. These workloads can be highly variable, with constant changes in scope and size, especially when you’re just starting out, and they normally require large backend processing power to analyze the high volume of data. To effectively crunch this type of data and find meaningful needles in your haystack, you need an infrastructure that can accommodate:

Dynamically changing, periodic usage – Most big data jobs are processed in batches, and require flexible infrastructure that can handle unpredictable, variable workloads.
Large computational needs – “Big” data requires serious processing power to get through your jobs in a reasonable amount of time and provide effective analysis.

So what kind of infrastructure options can support these requirements? While multi-tenant virtual cloud platforms offer a great economic model and can handle variable workloads, performance becomes extremely difficult to manage as your use cases evolve and grow. Big data mining technologies such as Hadoop may work at acceptable levels in virtual environments when you’re just starting out, but they tend to struggle at scale due to high storage I/O, network and computational demands. The virtual, shared and oversubscribed nature of multi-tenant clouds can lead to noisy-neighbor problems. Big data jobs are some of the noisiest, and ultimately everyone in the same shared virtual environment will suffer, including your big data jobs. An alternative is to build out dedicated infrastructure to alleviate these problems.

This leaves you with two bad options: either deal with subpar performance of virtual pay-as-you-go cloud platforms, or start building your own “expensive” infrastructure. How do you get both the flexibility you need and the high level of performance required to efficiently process big data jobs?

Bare-metal cloud can provide the dedicated storage and compute that you need, along with flexibility for unpredictable workloads. In a bare-metal cloud platform, all compute and direct-attached storage are completely dedicated to your workloads. There are no neighbors, let alone noisy ones, to adversely impact your needs. Best of all, you can get and pay for exactly what your workload needs, and then spin the whole thing down. One caveat: even with dedicated servers and storage, the network layer is still shared among multiple tenants, which could be a limiting factor for some large-scale Hadoop jobs where wire-speed performance is a must. Even though bare metal is one of the best price-for-performance cloud options, your workload may not be able to tolerate such limitations as your big data needs grow. Managed hosting or private cloud to the rescue.

Managed hosting or private cloud is a better option in some cases, as the infrastructure is dedicated to you on a private network and can be customized to accommodate your specific needs. These options deliver wire-speed network performance along with dedicated compute and storage and reasonable agility. Of course, this won’t be the most economical option, but if your workload demands it, the tradeoff is well worth it.

Whether you begin your big data endeavor with virtual cloud or bare-metal cloud, it’s important to recognize that your infrastructure needs will change over time. When starting out, a virtual cloud or a bare-metal cloud can suffice, with bare metal providing better performance and scale capabilities. But as your big data needs expand, a fully dedicated, managed private cloud may fit better, without the limitations of a shared network.

Given that change is the only constant in big data, choosing a provider that offers more options and allows you to adjust as your needs change is key. Talk to Internap about your “needle in the haystack” big data needs and we will help you find the right options now and for the future.

Aug 20, 2013

Is your cloud noisy and slow?

Ansley Kilgore

Now that most IT organizations have transitioned some of their infrastructure to the cloud, the game has changed yet again. While you may have already moved your email applications, disaster recovery, ERP or CRM systems to the cloud, now your CEO wants to incorporate big data, business intelligence and predictive analytics into the corporate strategy. But these large enterprise applications require more computing power than your current cloud architecture can support. How can you accommodate the CEO’s requests without sacrificing performance and inviting problems from noisy neighbors?

In an effort to not throw out the proverbial baby with the bath water, IT is faced with the challenge of using the current cloud infrastructure to meet the requirements of these new systems. A one-size-fits-all cloud solution doesn’t work for most enterprises, and few businesses can afford to sacrifice the automation and flexibility of the cloud and start manually provisioning physical servers again. Diversifying your infrastructure to include bare-metal cloud can help fill this gap. Bare metal provides the high performance processing capabilities of a dedicated environment, with the service delivery model of the cloud.

Establishing a mixed cloud environment
Understanding the requirements of your use case will help determine which mix of cloud is right for you. Workloads that require high disk I/O are usually better suited to physical, dedicated servers. Bare-metal cloud provides a new way to leverage cloud technology for high-performance, data-intensive workloads such as big data applications and media encoding. Since bare-metal servers do not run a hypervisor, are not virtualized and are completely dedicated, including storage, you don’t have to worry about noisy neighbors or virtualization overhead.
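
If you want a quick, first-order feel for how disk I/O differs between environments, a crude probe like the sketch below (timing random 4 KB reads against a scratch file) can be run on both a virtual instance and a bare-metal server. It is only illustrative: buffered reads hit the page cache, and a purpose-built tool such as fio gives far more rigorous numbers.

```python
import os
import random
import time

# Crude random-read probe: not a benchmark, just a rough comparison aid.
# Buffered I/O means the OS page cache will inflate the numbers.
PATH = "iotest.bin"
BLOCK = 4096                       # 4 KB reads
FILE_SIZE = 256 * 1024 * 1024      # 256 MB scratch file
OPS = 2000

with open(PATH, "wb") as f:        # create the scratch file once
    f.write(os.urandom(FILE_SIZE))

fd = os.open(PATH, os.O_RDONLY)
offsets = [random.randrange(0, FILE_SIZE - BLOCK) for _ in range(OPS)]

start = time.perf_counter()
for off in offsets:
    os.pread(fd, BLOCK, off)
elapsed = time.perf_counter() - start

os.close(fd)
print(f"~{OPS / elapsed:.0f} random 4 KB reads per second")
```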

Bare-metal cloud can be used in conjunction with virtualized cloud infrastructure to meet a wider range of business requirements. IT managers can balance the capabilities of various cloud models to create a cost-effective operating environment that reduces capital costs, is operationally efficient and establishes a foundation for agility through adaptable hosting models. At the same time, businesses evaluating virtualized clouds as their only hosting option often prefer to keep many of their highest-performance, most complex applications in-house. Bare-metal cloud offers an alternative to both virtualized clouds and in-house environments, positioning IT managers to maximize the value of their application and service architectures.

The value of diversity
The ability to create a mixed cloud environment means cloud computing now offers more options than traditional virtualization, while still providing flexibility, scalability and utility-based pricing models. Using different types of cloud together provides organizations with exponentially more opportunities for cost-effective infrastructure.

As the cloud has evolved to include public, private, hybrid and now bare-metal options, IT now has more opportunities to create the right cloud mix to meet the needs of the enterprise. Taking a workload-centric approach can help establish a more strategic, cost-effective cloud solution. The bare-metal cloud is an integral part of an agile infrastructure that allows IT to efficiently meet the demands of business-critical applications.

Aug 13, 2013

Big data: Two critical definitions you need to know

INAP

Big data is (clearly) a broadly defined and overused term. It’s been used to describe everything from general “information overload” to specific data mining and analytics to large-scale databases. In Internap’s hosting and cloud customer base, we see two main approaches to big data. In order to make better decisions about the infrastructure required to achieve your goals, you need to understand these different approaches and know where your needs fall.

There is a haystack, go find needles
One class of big data can be thought of as the “needle in a haystack” type. In this scenario, you have mountains of data already, and a very broad idea about the possibility of insights, analytics, and interrelationships within the data. Therefore, your goal is to crunch the data and find the relationships that allow you to understand and gain insight about the data over time. This type of static “big” data requires big backend processing power from technologies such as Hadoop. These applications tend to be mostly batch jobs with sporadic and often unpredictable infrastructure needs.

Massive real-time “big” database
The term “big data” is also used to describe the more mainstream, real-time database applications that have a scale problem well beyond the means of traditional SQL databases. Real-time big data platforms such as MongoDB, Cassandra and others deliver the scale and performance that modern scale-out applications need. Relational databases are often too limiting for large amounts of unstructured data. NoSQL and key-value databases are better suited for the task, but they require high-performance storage, high IOPS and the ability to rapidly scale in place. These requirements are vastly different from those of the data-crunching, needle-in-a-haystack type of big data, yet the same term is often used to describe both.
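
As a small sketch of why document-oriented NoSQL stores suit unstructured data, the snippet below inserts differently shaped event records into the same MongoDB collection; the connection string, database and field names are illustrative assumptions:

```python
from pymongo import MongoClient

# Placeholder connection and names, purely for illustration.
client = MongoClient("mongodb://localhost:27017")
events = client["analytics"]["events"]

# Documents with different shapes coexist in one collection: no schema
# migration is needed when a new event type appears.
events.insert_one({
    "user_id": "u-1842",
    "type": "checkin",
    "venue": {"name": "Cinema 12", "city": "Austin"},
    "ts": 1377000000,
})
events.insert_one({"user_id": "u-1842", "type": "ad_click", "campaign": "fall-promo"})

print(events.count_documents({"user_id": "u-1842"}))   # 2
```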

The performance question
Performance still matters for the first type of big data, but it means something different than it does in the real-time database scenario. For large data-mining applications, real-time data insertion isn’t as important, because you already have the data. What matters is the ability to extract the data fast enough and process it quickly, and that depends on the type of data you are mining and the business application behind it. That said, the type of infrastructure has a big impact on how long it takes to process your “big data” job. If a more powerful cloud infrastructure can reduce the processing time from three days to two, that can change how you define your business model.

For real-time big database applications, I/O becomes critical. For example, mobile advertising technology companies require real-time data insertion and performance in order to capture the right data at the right time and subsequently deliver timely, relevant ads. What really happens when millions of users simultaneously “check in” at their favorite restaurants and then at the movies via a social media mobile app? Extracting and capturing this information relies on real-time data insertion, but quickly processing and learning from that data relies on compute performance. The ads you see are formulated and delivered based on your real-time location information, behavior patterns and preferences. Dynamic, real-time data requires high I/O storage and superior compute performance in order to provide such targeted ads.

From the proverbial needle-in-a-haystack backend processing to modern, real-time database applications, the term “big data” is used for both. Once you understand the distinct qualities of each type, you can make better decisions regarding the infrastructure and IaaS (Infrastructure-as-a-Service) models that fit one versus the other. Your organization likely has both types of “big data” challenges. Talk to Internap to find out how we can help you meet the needs of both.

Next: How to make IaaS work for your big data needs

Jul 24, 2013

Is your IT team ready for the big data challenge?

Ansley Kilgore

Big data is increasingly important across all industries, and C-level executives are placing more demands on their IT departments to provide the necessary data and analytics. Collecting vast amounts of information and drawing meaningful conclusions creates opportunities for improved business agility, visibility and, most importantly, profits. Having such valuable information allows business leaders to make more informed, real-time decisions and gain competitive advantage.

While big data may be part of an overall corporate strategy, businesses also need a technology strategy to meet the demands of this new challenge. Hint: it requires more than purchasing additional data storage.

In this webinar, guest speaker Brian Hopkins, Principal Analyst at Forrester Research, Inc., will present his latest research aimed at helping you get past the nonsense, understand what big data is all about and leverage the concepts to lower costs, increase agility and deliver more data and analytics to your business.

Attend this webinar to learn:

  • What is big data, really?
  • How does big data relate to an overall data management strategy?
  • What are the architectural components of solutions that exploit big data concepts?
  • What is a real-world example of addressing a big data challenge?
  • What should you do with this information right now?

Watch the webinar recording to learn more about preparing your business for the new challenges of big data.

May 16, 2013

Retail’s big data evolution

INAP

The amount of data collected at every step of the supply chain has steadily increased, enabling retailers to make extremely informed decisions about what to sell, when to sell it and whom to sell it to. The science of collecting these vast amounts of data and trying to draw meaningful conclusions falls under the term “big data.” This vast amount of information quickly becomes complex, unruly and difficult to store, and it’s nearly impossible to extract meaningful insight using traditional databases and computational techniques. The ability to compare Apple iPads to orange t-shirts in a meaningful way is a recent development, and this kind of knowledge is more than power – it’s profit.

Fifty years ago, if you walked into a local retailer to purchase a record player and a copy of Please Please Me, the debut album of a then-unknown band from England, you would have simply walked up to the register with your items, paid the cashier, received a receipt and been on your way (to having your musical world turned upside down, but that’s outside the scope of this blog post). Once a month, every store employee would stay late to manually look through every item on the shelves to take inventory.

Twenty years ago, if you walked into a local retailer to purchase a cassette player and a copy of The Cranberries’ debut album Everybody Else Is Doing It, So Why Can’t We?, you would have walked up to the register, paid the cashier, the now-computerized register would have automatically removed the items from the store’s inventory system, you’d be furnished with an itemized receipt, and you’d be on your way in record time. At the end of each week, the store’s computer would provide the manager with a report showing current inventory levels along with statistics on what was sold during the week. The manager would use that basic data to make informed decisions about what to order and when to order it.

Ten years ago, if you walked into a local retailer to purchase a CD player and a copy of The Strokes’ Room on Fire, you would walk up to the register, the cashier would scan your items and a prompt would pop up on their screen suggesting that perhaps you might want to preorder Radiohead’s upcoming album, Hail To The Thief. You’d ponder for a moment and remember that you used to enjoy Radiohead, so you add it to your order, pay the cashier, get your itemized receipt and preorder voucher, and leave the store, not only excited about your current purchase, but rife with anticipation for the new Radiohead album. At the end of the week, the store’s computer system would automatically place an order to replenish stock on items that were sold, and your preorder would be secured along with those of 50 other patrons – 30 of whom, like you, had entered the store unaware that Radiohead had a new album coming out.

Today, thanks to big data and predictive analytics, retailers like Walmart know what their consumers are going to buy before they even enter the store. When it rains on a Sunday in San Diego, California, Walmart knows that on the following Monday it will sell three times as many iPods as normal. It can even identify specific genres of music that will see a temporary boost in sales. Does your business have the ability to translate this big data knowledge into profit?

May 8, 2013

Bare metal cloud fits big data

INAP

Big data is the buzzword in the IT industry these days. While traditional data warehousing involves terabytes of human-generated transactional data to record facts, big data involves petabytes of human and machine-generated data to harvest facts. Big data becomes supremely valuable when it is captured, stored, searched, shared, transferred, deeply analyzed and visualized.

The platform that is frequently cited as the enabler for all of these things is Hadoop, the open source project from Apache that has become the major technology movement for big data. Hadoop has emerged as the preferred way to handle massive amounts of not only structured data, but also complex petabytes of semi-structured and unstructured data generated daily by humans and machines.

The major components of Hadoop are the Hadoop Distributed File System (HDFS) and an implementation of MapReduce. HDFS distributes and replicates files across a cluster of standardized computers/servers. MapReduce splits the data into workable portions across the cluster so they can be processed concurrently by a map function configured by the user. Hadoop relies on each compute node to process its own chunk of data, allowing for efficient “scaling out” without degrading performance.
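
As a minimal illustration of the MapReduce model, the sketch below is a classic word count written for Hadoop Streaming, which pipes data through the mapper, sorts by key and then pipes it through the reducer; the file name and job wiring are assumptions, not part of any specific deployment:

```python
#!/usr/bin/env python3
"""Word-count sketch for Hadoop Streaming: run the same file as both
mapper ("wordcount.py map") and reducer ("wordcount.py reduce")."""
import sys

def mapper():
    # Emit one "word<TAB>1" line per word; Hadoop sorts these by key.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Input arrives grouped by word, so a running total per key suffices.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```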

Hadoop’s popularity is largely due to its ability to store, analyze and access large amounts of data quickly and cost-effectively across clusters of commodity hardware. Use cases include digital marketing automation, fraud detection and prevention, social network and relationship analysis, predictive modeling for new drugs, retail in-store behavior analysis and mobile location-based marketing, across an almost endless variety of verticals. Although Hadoop is not considered a direct replacement for traditional data warehouses, it enhances enterprise data architectures with the potential for deep analytics that unlock the true value of big data.

When building and deploying big data solutions with a scale-out architecture, cloud is a natural consideration. The value of a virtualized IaaS solution, like our own AgileCLOUD, is clear: configuration options are extensive, provisioning is fast and easy, and the use cases are wide-ranging. When considering hosting solutions for Hadoop deployments, however, shared public cloud architectures usually involve performance trade-offs at scale, such as I/O bottlenecks that arise when MapReduce workloads grow. Moreover, virtualization and shared tenancy can impact CPU and RAM performance. Purchasing larger and larger virtual instances or additional services to reach higher IOPS can get expensive and/or fail to deliver the desired results.

Hence the beauty of on-demand bare-metal cloud solutions for many resource-intensive use cases: disks are local and can be configured with SSDs to achieve higher IOPS; RAM and storage are fully dedicated; and server nodes can be provisioned and deprovisioned programmatically depending on demand. Depending on the application and use case, a single bare-metal server can support greater workloads than multiple similarly sized VMs. Under the right circumstances, combining virtualized and bare-metal server nodes can yield significant cost savings and better performance.
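
To make “provisioned and deprovisioned programmatically” concrete, here is a minimal sketch of that expand/contract pattern against a generic REST provisioning API; the endpoint, token, payload fields and flavor names are hypothetical placeholders, not Internap's actual API:

```python
import requests

# Hypothetical provisioning API: endpoint, token, and field names are
# placeholders used only to illustrate the expand/contract pattern.
API = "https://api.example-baremetal-cloud.com/v1"
HEADERS = {"Authorization": "Bearer <token>"}

def scale_up(node_count, flavor="ssd-highmem"):
    """Request bare-metal worker nodes for a batch job; return their IDs."""
    resp = requests.post(
        f"{API}/servers",
        json={"count": node_count, "flavor": flavor, "image": "hadoop-worker"},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return [server["id"] for server in resp.json()["servers"]]

def tear_down(server_ids):
    """Deprovision the nodes once the job finishes, so billing stops."""
    for sid in server_ids:
        requests.delete(f"{API}/servers/{sid}", headers=HEADERS, timeout=30).raise_for_status()

# nodes = scale_up(8)      # expand for the nightly MapReduce run
# ...run the job...
# tear_down(nodes)         # contract when it completes
```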
