Oracle Boosts Its Big Data Offering

Oracle, the proverbial SQL icon, knows it cannot ignore the big data trend, nor does it attempt to. On the contrary: Oracle has been promoting big data offerings, both in-house and via acquisitions such as the recent Datalogix deal. Yesterday Oracle announced an important milestone with the release of a suite of big data solutions.

In modern organizations data comes from multiple and diverse sources, both structured and unstructured, and across various technologies (e.g. Hadoop, relational databases). Advanced analytics requirements, such as real-time counting alongside historical trend analysis, further call for a mix of technologies, resulting in highly complex solutions. Oracle recognizes this complexity and offers native integration between its big data solutions and its SQL relational databases, with one uniform façade for analytics.

While Hadoop’s advantages are well known, it is still non-trivial for analysts and data scientists to extract analytics and gain insights from it. Oracle Big Data Discovery addresses this, providing a “visual face of Hadoop”.

Oracle also extends its GoldenGate replication platform with the release of Oracle GoldenGate for Big Data, which can replicate unstructured data to the Hadoop stack (Hive, HBase, Flume). Another aspect of the uniform façade is Oracle SQL: with Oracle Big Data SQL 1.1, queries can now transparently access data in Hadoop, NoSQL and Oracle Database.
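
As a rough illustration of what that uniform façade means for client code, here is a minimal sketch using the cx_Oracle Python driver. The connection details and table names are hypothetical; it assumes web_logs has been defined as an external table over Hadoop data, the pattern Big Data SQL enables:

```python
# A minimal sketch (not an Oracle-documented sample) of a query that spans
# a regular Oracle table and a Hadoop-backed external table via Big Data SQL.
# Connection details and table names are hypothetical placeholders.
import cx_Oracle

conn = cx_Oracle.connect("scott", "tiger", "dbhost:1521/orcl")  # placeholders
cur = conn.cursor()
cur.execute("""
    SELECT c.name, COUNT(*) AS visits
    FROM customers c
    JOIN web_logs w ON w.customer_id = c.id  -- assumed Hadoop-backed table
    GROUP BY c.name
""")
for name, visits in cur:
    print(name, visits)
```

The point of the façade is exactly this: the join looks like any other SQL join, even though one side of it lives in Hadoop.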

Oracle’s strategy is to leverage its brand and the existing SQL installations within enterprises, offering enterprise-grade versions of the popular open-source tools with native integration into the Oracle databases those enterprises already run. It remains to be seen how this catches on with enterprises against the hype of the popular big data vendors and open source projects.

Facebook Shares Open Networking Switch Design, Part of its Next Gen Networking

Facebook’s enormous scale comes with enormous technological challenges, which go beyond what conventional solutions can handle. For example, Facebook decided to abandon Microsoft’s Bing search engine and instead develop its own revamped search capabilities. Another important area is Facebook’s massive networking needs, which called for a whole new paradigm, code-named data center fabric.

The next step in Facebook’s next-gen networking architecture is “6-pack”, a new open and modular switch announced just a few days ago. It is interesting to note that Facebook chose to announce the new switch the same day Cisco reported its earnings. This is more than a hint at the networking equipment giant, the emblem of “traditional networking”. As Facebook says in its announcement, it started the quest for next-gen networking due to

the limits of traditional networking technologies, which tend to be too closed, too monolithic, and too iterative for the scale at which we operate and the pace at which we move.

The new “6-pack” is a modular, high-volume switch built on merchant-silicon hardware. It enables you to build a switch of any size using a simple set of common building blocks. The design takes a hybrid Software-Defined Networking (SDN) approach: while classic SDN separates the control plane from the forwarding plane and centralizes control decisions, in Facebook’s hybrid architecture each switching element contains a full local control plane on a microserver that communicates with a centralized controller.
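
As an illustrative sketch of that hybrid split (not Facebook’s actual code; all class and method names here are invented), each element makes forwarding decisions locally while feeding the controller’s global view:

```python
# Illustrative sketch of the hybrid SDN model: each switching element runs a
# local control plane that forwards on its own, while a centralized controller
# keeps a global view. Names and structure are invented for illustration.
class CentralController:
    def __init__(self):
        self.global_view = {}                    # switch_id -> forwarding table

    def report_state(self, switch_id, table):
        self.global_view[switch_id] = dict(table)


class LocalControlPlane:
    def __init__(self, switch_id, controller):
        self.switch_id = switch_id
        self.controller = controller
        self.forwarding_table = {}               # prefix -> next hop

    def handle_link_event(self, prefix, next_hop):
        # Local decision: update forwarding immediately, no controller round-trip.
        self.forwarding_table[prefix] = next_hop
        # Then report upward so the controller retains the global picture.
        self.controller.report_state(self.switch_id, self.forwarding_table)
```

The design choice this mimics: the switch stays functional even if the central controller is slow or unreachable, while the controller still gets the fleet-wide state it needs.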

Facebook made the design of “6-pack” open as part of the Open Compute Project, together with all the other components of its data center fabric. This is certainly not good news for Cisco and the other vendors, but great news for the community. You can find the full technical design details in Facebook’s post.

Facebook is not the only one on the front line of scaling challenges. The open cloud community OpenStack, as well as the leading public cloud vendors Google and Amazon, have also shared networking strategies to meet the challenges of the new workloads in modern cloud computing environments.

Cloud and Big Data innovations were born out of necessity in IT, driven by the companies with the most challenging use cases and backed by the open community. The same innovation is now happening in networking, paving the way to simpler, scalable, virtual and programmable networking based on merchant silicon.

Google Helps Measure Cloud Performance With New Open Source Tools

One of the toughest decisions CIOs face when deciding to migrate their workloads to the cloud is: which cloud provider is best suited for their workloads? Pricing is one aspect that has been extensively reviewed, with a fierce war around price reductions, led by Amazon’s constant price cuts and innovative pricing schemes with and without prior commitment, plus auctioned spot instances.

Another important aspect is the cloud provider’s performance and its fit for the company’s specific workloads. Google, Amazon and others have been investing extensively in gaining superior performance with their hyperscale computing. But unlike pricing, performance is much more complex to measure and compare, given the lack of standardization around the relevant KPIs (key performance indicators) and benchmark definitions.

Now Google comes to the rescue with a new open-source tool called PerfKit Benchmarker. The new tool, released a couple of days ago, enables measuring various KPIs, from elementary ones such as CPU and bandwidth up to end-to-end provisioning times. Another tool, PerfKit Explorer, provides visualization of the benchmark results for easy interpretation. In its blog post, the Google Cloud Platform Performance Team says:

You’ll now have a way to easily benchmark across cloud platforms, while getting a transparent view of application throughput, latency, variance, and overhead.

The tools are managed as open source projects hosted on GitHub (ASLv2 license). Reportedly, over 30 leading researchers, companies, and customers have been involved in this project, including big industry names such as Intel, Cisco, Microsoft and Red Hat, and academic endorsements from Stanford and MIT.

Currently the kit contains 20 benchmarks, ranging from basic ping and iperf to Hadoop and the Cassandra and MongoDB NoSQL databases. The current release supports the three major public cloud providers: Amazon’s AWS, Microsoft’s Azure, and of course Google’s GCE.
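
As a rough idea of how a cross-provider comparison could be scripted around the tool, here is a hypothetical Python driver; the pkb.py flag names are taken from the project’s README and should be treated as assumptions to verify against the current release:

```python
# Hypothetical driver script: run the same PerfKit benchmarks against each
# supported provider, so results can be compared like-for-like.
import subprocess

BENCHMARKS = "ping,iperf"                        # two of the basic benchmarks
for cloud in ("GCP", "AWS", "Azure"):            # the three supported providers
    subprocess.run(
        ["./pkb.py", "--cloud=" + cloud, "--benchmarks=" + BENCHMARKS],
        check=True,                              # stop if any run fails
    )
```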

Open source is the new de facto standard. With a new open source tool backed by industry leaders and the community, we can gain a standard benchmark to compare different vendors not just on price but also on performance. As the tool’s acceptance grows, more cloud providers will be covered by the benchmarks, and more benchmarks will be added to address the needs of more complex workloads. Then we will be able to truly choose best of breed for our needs.

Microsoft Expands Open Source and Big Data Involvement, Acquires Revolution Analytics

When you think about Microsoft, you probably think the very opposite of Open Source Software (OSS). Some would even go as far as picturing it as the Dr. Evil of OSS. But recent moves show Microsoft begs to differ. This week Microsoft announced the acquisition of Revolution Analytics, the company behind the leading commercial distribution of the open-source R programming language, which has become popular for statistical analysis on big data. As Joseph Sirosh, Corporate Vice President, Machine Learning at Microsoft, writes in his blog post:

We are making this acquisition to help more companies use the power of R and data science to unlock big data insights with advanced analytics

Microsoft’s acquisition comes after it showed interest in the language, both using it internally (by Microsoft’s data scientists and in its frameworks) and actively contributing to open-source R projects such as ParallelR and RHadoop.

This joins other contributions by Microsoft to open source, such as Linux kernel contributions (yes, Microsoft, the father of Windows, contributing to Linux). Microsoft has also released some of its core assets as open source, such as the .NET Core framework and the REEF big data analytics framework for YARN, among other open-source projects.

Microsoft’s recent moves also show its recognition that big data analytics is where the world is heading. Organizations have accumulated data, and are now looking for ways to monetize that data and to leverage the most advanced technologies and languages to do so. Microsoft got a painful reality check a couple of months ago, when Facebook decided to dump Microsoft’s Bing and develop its own revamped big data search framework. Facebook, Twitter, Google and the like have long realized the potential of their big data and have been developing advanced big data analytics technology to address it.

Microsoft opened 2015 with an impressive acquisition, marking an important realization around open source software and big data analytics. Such a statement hints at more to come as the year unfolds.

Amazon Adds Auto-Recovery To Its EC2 Cloud Instances

Amazon has had its share of outages, and these forced prudent users to factor failure into their designs. One of the important guidelines I pointed out after one such major AWS outage is Automation and Monitoring:

Your application needs to automatically pick up alerts on system events, and should be able to automatically react to the alerts.

In the beginning, users had to take care of that themselves. Then Amazon added CloudWatch to enable built-in automatic monitoring. Now comes the next stage: automatically reacting to these notifications.

Amazon announced its latest feature to do just that: Auto Recovery for EC2. With this feature enabled, a failed instance will be automatically detected and recovered, and the recovered instance will be identical to the original, including the same instance ID, IP address and configuration (e.g. elastic IPs and attached EBS volumes). This means developers no longer need to write the recovery mechanism for the trivial cases.

The Auto-Recovery feature is based on AWS CloudWatch and the status checks added over the last couple of years, which can detect a variety of failure symptoms such as loss of network connectivity, loss of system power, and software or hardware issues on the physical host. Upon failure it will first attempt to recover on the same machine (this involves a reboot, so in-memory data is lost), and then on a different machine (while retaining the same ID, IP and configuration). To keep the recovery cycle fully automated, developers should make sure their software starts automatically on boot (e.g. using cloud-init), so that it also comes back up on the rebooted instance.
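
For illustration, here is a minimal sketch of how such an alarm could be wired up with the boto3 AWS SDK for Python; the instance ID, region, and alarm thresholds are placeholders to adapt:

```python
# A minimal sketch: attach the EC2 "recover" action to a CloudWatch alarm
# on the system status check. Instance ID and thresholds are placeholders.
import boto3

INSTANCE_ID = "i-0123456789abcdef0"              # placeholder instance
REGION = "us-east-1"                             # the launch region

cloudwatch = boto3.client("cloudwatch", region_name=REGION)
cloudwatch.put_metric_alarm(
    AlarmName="auto-recover-" + INSTANCE_ID,
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_System",       # the system status check
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    Statistic="Maximum",
    Period=60,                                   # evaluate every minute
    EvaluationPeriods=2,                         # two consecutive failures
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    # The recover action brings the instance back on healthy hardware,
    # keeping its ID, IP addresses and attached EBS volumes.
    AlarmActions=["arn:aws:automate:" + REGION + ":ec2:recover"],
)
```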

The feature has been launched in Amazon’s US East (N. Virginia) region, and is currently available for the C3, M3, R3, and T2 instance types. It is offered for free; you only pay for the CloudWatch alarms used to monitor the system.

Unlike in a traditional data center, the base assumption in the cloud is that infrastructure instances can and will fail, even frequently. As more applications move to the cloud, including more mission-critical ones, the demand grows for cloud providers to deliver greater levels of built-in resilience. We shall see more such features for automatic recovery and self-healing, as well as other automated monitoring-triggered responses (both predefined and configurable), at the infrastructure level as well as the application level, across cloud providers.

Oracle Buys Datalogix, Extends Cloud and Big Data Offering

Oracle yesterday announced it is buying Datalogix. Datalogix provides a big data analytics service that uses social media sources such as Twitter and Facebook (which has itself invested heavily in big data analytics capabilities recently) to analyze consumer behavior.

This acquisition joins others Oracle has made in this field in the past year: BlueKai, a cloud-based big data marketing platform, earlier this year, and Responsys, a cloud-based B2C marketing platform, one year ago (December 2013).

This latest move is aligned with Oracle’s ongoing strategy to strengthen its position in the cloud, an area showing significant revenue growth for Oracle. It comes after Oracle had been losing business in the past few years, as more and more businesses moved to the cloud, served by the likes of Salesforce.com. With its cloud strategy, Oracle dares to forecast around 33% revenue growth next quarter from SaaS, IaaS and, most interestingly, PaaS, an area Oracle only recently entered with Java and Database as a Service. Adapting to the new reality and embracing the cloud appears to be a wise move for the company. The combination of big data analytics and social media is a powerful one, and Oracle has identified its potential. I would expect this to take a more significant role as the company’s growth engine in 2015.

Google Extends Its Internet of Things Strategy, Offers Open Research Grants

A few days ago Google announced the launch of the Open Web of Things, an open innovation and research program around the IoT. As part of the new initiative Google published a call for research proposals on IoT, with a focus on three main areas:

  1. user interface and application development
  2. privacy & security
  3. systems & protocols research

Google offers the selected participants grants, as well as access to hardware, software and systems from Google. Proposal submission is due next month, and kick-off is expected in the coming spring.

Google has had a rough 2014. Just this week JPMorgan lowered its estimates for Google’s revenues, and shortly after, Google’s stock hit a 52-week low. One of the main reasons is that Google’s traditional source of revenue, web search ads, seems to be shrinking with the transition from desktop to mobile and related disruptors (such as Amazon and Facebook).

As traditional sources of revenue shrink, Google is investing in developing new sources of revenue, aligned with the emerging trends. Google has been promoting a clear strategy around the Internet of Things. Google’s strategy has several tiers, aimed at tackling IoT from several directions, both horizontally (standards, protocols) and vertically (by use cases).

Part of Google’s IoT strategy is pursued through internal development, such as the Google Glass device and the Google Now app.

Another significant part of Google’s IoT strategy is done through M&A, most notably the acquisition of smart thermostat manufacturer Nest, which subsequently acquired Dropcam and the “Smart Home” startup Revolv (and subsequently shut down Revolv’s product line in a somewhat controversial move).

Google also places importance on open collaboration with the community, through open standardization such as the Thread Group alliance and open projects such as the Physical Web.

It has been a hectic year for Google, with many changes, uncertainties, and transitions into new areas. Google is betting on the Internet of Things as one of the future directions for the company, and has been investing seriously in that direction. It will be interesting to see what 2015 holds.
