Cloud and Docker Grow Closer To Bring Hybrid IT To Enterprises

Cloud computing and Linux containers (à la Docker and LXC) are two hot trends in the modern software delivery paradigm. Both are backed by strong, active global open source communities, offer rich tooling, and serve as the foundation on which new start-ups and projects build their solutions.

But enterprises, while acknowledging the new paradigms, are still struggling to implement them. One of the biggest concerns of enterprises is their vast infrastructure, which spans multiple systems, technologies, vendors, data centers, and even private and public clouds. Enterprises therefore require support for hybrid IT, with adequate automation to manage it. Gartner research shows that 75% of its clients will have some type of hybrid strategy in place by the end of the year.

Puppet, the popular DevOps tool used for automating complex deployment environments, identified the need for provisioning and managing such a mesh of clouds, containers, and even bare metal. This week, with its latest Puppet Enterprise release, it added support for AWS, Docker, and bare-metal provisioning.

At its InterConnect 2015 conference this week, IBM announced an update to the container service beta on Bluemix. The update adds capabilities such as pushing and pulling containers between on-premises and off-premises services, to support private and public clouds uniformly, as well as a hosted private registry for container images, for enterprise security and privacy.

The OpenStack community has been making efforts over the past few releases to integrate containers, initially treating them as if they were virtual machine instances, and later with full support, so you can deploy containers through Heat orchestration just as you deploy applications and services. The Docker driver for Nova (OpenStack Compute), which has so far been developed out of tree, is expected to return to the mainline in the upcoming ‘Kilo’ release next month.
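
For illustration, a minimal Heat (HOT) template for launching a container might look like the sketch below. This is a hedged example: the `DockerInc::Docker::Container` resource type comes from Heat's contrib Docker plugin, which must be installed on the Heat engine, and property names may differ between releases.

```yaml
heat_template_version: 2013-05-23

description: Launch a single Docker container via Heat (illustrative sketch)

resources:
  web_container:
    # Resource type provided by Heat's out-of-tree Docker plugin
    type: DockerInc::Docker::Container
    properties:
      image: nginx
      # docker_endpoint points Heat at the Docker daemon's API
      docker_endpoint: tcp://127.0.0.1:2375
```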

Public cloud vendors are not staying behind either. Google adopted the technology internally, saying that “everything at Google runs in a container”. It also developed the open source Kubernetes project for orchestrating pods of Docker containers, and is actively pushing it into OpenStack as well, via its recently announced partnership with Mirantis.

Amazon, at its last re:Invent conference, announced its EC2 Container Service, which lets you start, stop, query, and manage container-enabled applications via API, using the rich set of AWS services.
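
The service works in terms of task definitions that describe the containers to run. Below is a rough sketch of such a payload; field names follow the ECS RegisterTaskDefinition API action, while the family name, image, and resource sizes are illustrative placeholders.

```python
import json

# Sketch of an ECS task definition payload, as accepted by the
# RegisterTaskDefinition API action. Values are illustrative.
task_definition = {
    "family": "web-demo",  # hypothetical task family name
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",
            "cpu": 256,        # CPU units (1024 = one vCPU)
            "memory": 128,     # hard memory limit in MiB
            "portMappings": [
                {"containerPort": 80, "hostPort": 80}
            ],
            "essential": True,  # stop the task if this container exits
        }
    ],
}

print(json.dumps(task_definition, indent=2))
```

With an AWS client library, a payload like this would be registered with the ECS API, after which the service's run/start task actions launch the containers on a cluster.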

VMware, which rules the traditional enterprise virtualization domain, has made moves to adopt both open cloud and containers. First, it started getting actively involved in the communities and contributing code. On the cloud front, VMware launched an OpenStack-compatible version of its private cloud. On the containers front, VMware partnered with Docker, Google, and Pivotal to offer enterprises a simplified path to containers over a hybrid cloud model.

There are others exploring this field, such as Red Hat, Cisco, and even Microsoft, offering container integration at all levels, from hardware through operating systems, hypervisors, and cloud management systems, to orchestration and monitoring tools. We shall see a growing number of such converged offerings for hybrid IT, targeted at the complex environments of enterprises and delivered with enterprise-grade tooling and maturity.

Filed under Cloud, DevOps

Amazon Acquires Internet of Things Startup 2lemetry

Amazon has quietly acquired the Internet of Things startup 2lemetry, which offers ThingFabric, a platform for integrating connected devices across enterprises. Though no formal press release has clarified the details of the deal, sources at Amazon did confirm it to TechCrunch a few days ago. Amazon, however, did not share what it intends to do with the acquisition, or whether it indeed relates to 2lemetry’s IoT platform, its retail beacon services, or its facial recognition technology.


The acquisition comes as no surprise, as 2lemetry has been exploring integration with Amazon’s cloud services (AWS) for a while now. Most recently, 2lemetry’s engineering team has been exploring how the AWS Lambda service works together with the MQTT IoT communications protocol (an OASIS standard) and 2lemetry’s ThingFabric. Another interesting direction for integration is with Amazon’s Kinesis big data stream processing cloud service, funneling terabytes of data from hundreds of thousands of IoT sensors through 2lemetry’s platform.
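
As a sketch of that direction (the topic layout, stream name, and function names here are hypothetical, not 2lemetry's actual scheme), a bridge would map each MQTT publish onto the parameters of a Kinesis PutRecord call, partitioning by device id so that each sensor's readings land on the same shard in order:

```python
import json

def mqtt_to_kinesis_record(topic, payload, stream_name="iot-ingest"):
    """Map an MQTT publish (topic + payload) onto the parameters of a
    Kinesis PutRecord call. Topic layout and stream name are assumed;
    partitioning by device id spreads sensors across Kinesis shards
    while keeping each device's readings ordered."""
    # Assume topics of the form "things/<device_id>/<sensor>"
    _, device_id, sensor = topic.split("/")
    return {
        "StreamName": stream_name,
        "PartitionKey": device_id,  # same device -> same shard, ordered
        "Data": json.dumps({"sensor": sensor, "payload": payload}),
    }

record = mqtt_to_kinesis_record("things/dev42/temperature", "21.5")
print(record["PartitionKey"])  # dev42
```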

Amazon has been trying to position its AWS cloud services with respect to the Internet of Things, with analytics services in the cloud and connected devices such as Amazon Echo. AWS’s mission statement says that

Amazon Web Services provides the services, security, and support to connect the Internet of Things on a global scale.

Amazon’s recent acquisition is probably a response to moves by its public cloud competitors, most notably Google’s investment in the Internet of Things. It is interesting to note that 2lemetry is part of the AllSeen Alliance open standards group for IoT. Amazon’s new arm in the AllSeen Alliance matches Google’s own arm in the Thread Group open standards group (via its Nest acquisition). We now await Amazon’s and 2lemetry’s official statements on the joint path of IoT in Amazon’s cloud.

Filed under Cloud, IoT

Oracle Boosts Its Big Data Offering

Oracle, the proverbial SQL icon, knows it cannot ignore the big data trend, nor does it try to. On the contrary: Oracle has been promoting big data offerings, both developed in-house and via acquisitions such as the recent Datalogix deal. Yesterday Oracle marked an important milestone with the release of a suite of big data solutions.

In modern organizations data comes from multiple and diverse sources, both structured and unstructured, and across various technologies (e.g. Hadoop, relational databases). Advanced analytics requirements, such as real-time counting analytics alongside historical trend analytics, further multiply the technologies involved, resulting in highly complex solutions. Oracle identifies this complexity and offers native integration of its big data solutions with Oracle’s SQL relational databases, with one uniform façade for analytics.

While Hadoop’s advantages are well known, it is still non-trivial for analysts and data scientists to extract analytics and gain insights from it. Oracle Big Data Discovery addresses this by providing a “visual face of Hadoop”.

Oracle also extends its GoldenGate replication platform with the release of Oracle GoldenGate for Big Data, which can replicate unstructured data to the Hadoop stack (Hive, HBase, Flume). Another aspect of the uniform façade is Oracle SQL: with Oracle Big Data SQL 1.1, queries can now transparently access data in Hadoop, NoSQL, and Oracle Database.

Oracle’s strategy is to leverage its brand and existing SQL installations within enterprises, offering enterprise-grade versions of the popular open source tools with native integration into the Oracle SQL databases already deployed there. It remains to be seen how this catches on with enterprises against the hype of the popular big data vendors and open source projects.

Filed under Big Data

Facebook Shares Open Networking Switch Design, Part of its Next Gen Networking

Facebook’s enormous scale comes with enormous technological challenges, which go beyond what conventionally available solutions can handle. For example, Facebook decided to abandon Microsoft’s Bing search engine and instead develop its own revamped search capabilities. Another important area is Facebook’s massive networking needs, which called for a whole new paradigm, code-named data center fabric.


The next step in Facebook’s next-gen networking architecture is “6-pack”, a new open and modular switch announced just a few days ago. It is interesting to note that Facebook chose to announce the new switch on the same day Cisco reported its earnings, which is more than a hint at the networking equipment giant, the representative of “traditional networking”. As Facebook says in its announcement, it started the quest for next-gen networking due to

the limits of traditional networking technologies, which tend to be too closed, too monolithic, and too iterative for the scale at which we operate and the pace at which we move.

The new “6-pack” is a modular, high-volume switch built on merchant-silicon-based hardware. It enables you to build a switch of any size from a simple set of common building blocks. The design takes a hybrid Software Defined Networking (SDN) approach: while classic SDN separates the control plane from the forwarding plane and centralizes control decisions, in Facebook’s hybrid architecture each switching element contains a full local control plane, running on a microserver, that communicates with a centralized controller.
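
As a toy illustration of the hybrid split (this is not Facebook's actual code, just a sketch of the idea), forwarding decisions live in a local table on each switching element, which merely syncs that table with the central controller, so forwarding keeps working even if the controller is unreachable:

```python
class CentralController:
    """Toy centralized controller: holds the global routing policy."""
    def __init__(self):
        self.global_routes = {}

    def advertise(self, prefix, next_hop):
        self.global_routes[prefix] = next_hop


class SwitchElement:
    """Toy switching element in a hybrid SDN design: keeps a full local
    control plane (its own route table) and only syncs with the
    controller, rather than asking it for every decision."""
    def __init__(self, controller):
        self.controller = controller
        self.local_routes = {}

    def sync(self):
        # Pull the controller's view into the local control plane.
        self.local_routes.update(self.controller.global_routes)

    def forward(self, prefix):
        # The decision is made locally, not at the controller.
        return self.local_routes.get(prefix, "drop")


controller = CentralController()
controller.advertise("10.0.0.0/8", "port-1")
switch = SwitchElement(controller)
switch.sync()
print(switch.forward("10.0.0.0/8"))       # port-1
print(switch.forward("192.168.0.0/16"))   # drop (no local route)
```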

Facebook opened up the design of “6-pack” as part of the Open Compute Project, together with all the other components of its data center fabric. This is certainly not good news for Cisco and the other vendors, but it is great news for the community. You can find the full technical design details in Facebook’s post.

Facebook is not the only one on the front line of scaling challenges. The OpenStack open cloud community, as well as the leading public cloud vendors Google and Amazon, have also shared networking strategies for meeting the new challenges that come with modern cloud computing workloads.

Cloud and big data innovations were born out of necessity in IT, driven by the companies with the most challenging use cases and backed by open communities. The same innovation is now happening in networking, paving the way to simpler, scalable, virtual, and programmable networking based on merchant silicon.

Filed under Cloud, SDN

Google Helps Measure Cloud Performance With New Open Source Tools

One of the toughest decisions CIOs face when deciding to migrate their workloads to the cloud is: which cloud provider is best suited for those workloads? Pricing is one aspect that has been extensively reviewed, with a fierce war of price reductions led by Amazon’s constant price cuts and innovative pricing schemes, with and without prior commitment, plus auctioned spot instances.

Another important aspect is the cloud provider’s performance and its fit to the company’s specific workloads. Google, Amazon, and others have been investing extensively in gaining superior performance with their hyperscale computing. But unlike pricing, performance is much more complex to measure and compare, given the lack of standardization around the relevant KPIs (key performance indicators) and benchmark definitions.

Google now comes to the rescue with a new open source tool called PerfKit Benchmarker. The new tool, released a couple of days ago, enables measuring various KPIs, from elementary ones such as CPU and bandwidth up to end-to-end provisioning times. Another tool, PerfKit Explorer, provides visualization of the benchmark results for easy interpretation. In its blog post, Google’s Cloud Platform Performance Team says:

You’ll now have a way to easily benchmark across cloud platforms, while getting a transparent view of application throughput, latency, variance, and overhead.

The tools are managed as open source projects hosted on GitHub (ASLv2 license). Over 30 leading researchers, companies, and customers are said to have been involved in the project, including big industry names such as Intel, Cisco, Microsoft, and Red Hat, with academic endorsements from Stanford and MIT.

Currently the kit contains 20 benchmarks, ranging from basic ping and iperf through Hadoop, Cassandra, and MongoDB NoSQL databases. The current release supports the three major public cloud providers: Amazon’s AWS, Microsoft’s Azure, and of course Google’s GCE.
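
The quote's emphasis on variance, not just throughput, is worth illustrating. In the toy comparison below (the latency numbers are made up), one provider wins on median latency but loses badly on the tail, exactly the kind of difference a raw average hides:

```python
# Hypothetical per-request latency samples (ms) from running the same
# benchmark against two providers; the values are purely illustrative.
latencies = {
    "provider_a": [12, 14, 13, 15, 90, 13, 14, 12, 13, 14],
    "provider_b": [20, 21, 19, 22, 20, 21, 20, 19, 21, 20],
}

def percentile(samples, pct):
    """Nearest-rank percentile of a sample list."""
    ordered = sorted(samples)
    rank = max(1, int(round(pct / 100.0 * len(ordered))))
    return ordered[rank - 1]

for name, samples in latencies.items():
    # provider_a has the better median but a far worse tail (p99).
    print(f"{name}: p50={percentile(samples, 50)}ms "
          f"p99={percentile(samples, 99)}ms")
```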

Open source is the new de facto standard. With a new open source tool backed by industry leaders and the community, we can gain a standard benchmark to compare vendors not just on price but also on performance. As acceptance of the tool grows, more cloud providers will be covered by the benchmarks, and more benchmarks will be added to address the needs of more complex workloads. Then we will truly be able to choose best of breed for our needs.

Filed under Cloud

Microsoft Expands Open Source and Big Data Involvement, Acquires Revolution Analytics

When you think about Microsoft, you probably think of the very opposite of open source software (OSS). Some would even go as far as picturing it as the Dr. Evil of OSS. But recent moves show that Microsoft begs to differ. This week Microsoft announced the acquisition of Revolution Analytics, the company behind the open source R programming language that has become popular for statistical analysis on big data. As Joseph Sirosh, Corporate Vice President, Machine Learning at Microsoft, writes in his blog post:

We are making this acquisition to help more companies use the power of R and data science to unlock big data insights with advanced analytics

Microsoft’s acquisition comes after it showed interest in the language, both using it internally (by Microsoft’s data scientists and in its frameworks) and actively contributing to open source R projects such as ParallelR and RHadoop.

This joins Microsoft’s other contributions to open source, such as its Linux kernel contributions (yes, Microsoft, the father of Windows, contributing to Linux). Microsoft has also released some of its core assets as open source, such as the .NET Core framework and the REEF big data analytics framework for YARN, among other open source projects.

Microsoft’s recent moves also show its recognition that big data analytics is where the world is heading. Organizations have accumulated data and are now looking for ways to monetize that data, leveraging the most advanced technologies and languages to do so. Microsoft got a painful reality check a couple of months ago, when Facebook decided to dump Microsoft’s Bing and develop its own revamped big data search framework. Facebook, Twitter, Google, and the like have long realized the potential of their big data and have been developing advanced big data analytics technology to tap it.

Microsoft opened 2015 with an impressive acquisition, marking an important realization around open source software and big data analytics. Such a statement hints at more to come throughout the year.

Filed under Big Data, Programming Languages

Amazon Adds Auto-Recovery To Its EC2 Cloud Instances

Amazon has had its share of outages, which forced prudent users to factor failure into their designs. One of the important guidelines I pointed out after one such major AWS outage is automation and monitoring:

Your application needs to automatically pick up alerts on system events, and should be able to automatically react to the alerts.

In the beginning, users had to take care of that themselves. Then Amazon added CloudWatch to enable built-in automated monitoring. Now comes the next stage: automatically reacting to these notifications.

Amazon announced its latest feature to do just that: Auto Recovery for EC2. With this feature enabled, a failed instance is automatically detected and recovered, and the recovered instance is identical to the original, including the same instance ID, IP address, and configuration (e.g. Elastic IPs and attached EBS volumes). This means developers no longer need to write recovery mechanisms for the trivial cases.


The Auto-Recovery feature is based on AWS CloudWatch and the status checks added over the last couple of years, which can detect a variety of failure symptoms such as loss of network connectivity, loss of system power, and software or hardware issues on the physical host. Upon failure, it first attempts recovery on the same machine (involving a reboot, so in-memory data is lost), and then on a different machine (while retaining the same ID, IP, and configuration). To keep the recovery cycle fully automated, developers had better make sure their software starts automatically on boot (e.g. using cloud-init), so that it also restarts on the recovered instance.
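
Under the hood, this amounts to a CloudWatch alarm on the system status check whose action invokes the built-in recovery workflow. The sketch below shows such alarm parameters as they would be passed to a PutMetricAlarm call; the instance id and thresholds are illustrative, while the metric name and the `arn:aws:automate:<region>:ec2:recover` action follow the documented forms.

```python
instance_id = "i-0123456789abcdef0"  # hypothetical instance

# Parameters for a PutMetricAlarm call that enables auto recovery.
recovery_alarm = {
    "AlarmName": f"recover-{instance_id}",
    # System status checks detect host-level failures (network, power,
    # hardware), as opposed to instance status checks (guest OS issues).
    "Namespace": "AWS/EC2",
    "MetricName": "StatusCheckFailed_System",
    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
    "Statistic": "Minimum",
    "Period": 60,              # seconds per evaluation period
    "EvaluationPeriods": 2,    # require two consecutive failed checks
    "ComparisonOperator": "GreaterThanThreshold",
    "Threshold": 0.0,
    # The action that triggers the built-in recovery workflow:
    "AlarmActions": ["arn:aws:automate:us-east-1:ec2:recover"],
}

print(recovery_alarm["AlarmActions"][0])
```

Requiring two consecutive failed periods avoids recovering (and rebooting) an instance over a transient blip in the status check.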

The feature is currently launched in Amazon’s US East (N. Virginia) region and is available for the C3, M3, R3, and T2 instance types. It is offered for free; you pay only for the CloudWatch alarms used to monitor the system.

Unlike in a traditional data center, in the cloud the base assumption is that infrastructure instances can and will fail, even frequently. As more applications move to the cloud, including more mission-critical applications, the demand grows for cloud providers to deliver greater levels of built-in resilience. We shall see more such features for automatic recovery and self-healing, as well as other types of automated monitoring-triggered responses (both predefined and configurable), at the infrastructure level as well as the application level, across cloud providers.

Filed under Cloud