Category Archives: Big Data

Toyota Launches Automotive Edge Computing Consortium To Address Big Data From Connected and Self Driving Cars

The age of smart connected cars and autonomous vehicles brings with it a new challenge for the automotive industry: Big Data. Japanese auto manufacturer Toyota estimates that the data volume between vehicles and the cloud will reach 10 exabytes (roughly 10.7 billion gigabytes) per month by 2025, approximately 10,000 times the present volume. A big data challenge of this scale calls for Edge Computing.

This challenge led Toyota to team up with Japanese auto parts maker Denso Corp, Japanese telecom NTT, Intel and Ericsson to form the Automotive Edge Computing Consortium, announced a few days ago. The consortium will

develop an ecosystem for connected cars to support emerging services such as intelligent driving and transport, creation of maps with real-time data, as well as driving assistance based on cloud computing.

The consortium will rely on Edge Computing and more efficient network design to accommodate automotive big data between vehicles and the cloud in a manageable way.

Last March Toyota showed off its first autonomous test vehicle developed entirely by the Toyota Research Institute, following Google, Tesla, Uber and others in the race to disrupt transportation. Fellow consortium member Intel announced just last week that it is starting to build a fleet of fully autonomous (SAE Level 4) test cars, building on its acquisition of Mobileye earlier this year.

Toyota says its exploration of autonomous vehicles dates back as far as 2005. Now, with an edge computing architecture, it can also take on the associated big data challenge.

IoT, Big Data and Machine Learning Push Cloud Computing To The Edge

“The End Of Cloud Computing” – that’s the dramatic title of a talk given by Peter Levine at the a16z Summit last month. Levine, a partner at the Andreessen Horowitz (a16z) VC fund, exercised his investor foresight and tried to imagine the world beyond cloud computing. The result is an insightful and fluent talk, arguing that centralized cloud computing as we know it is about to be superseded by a distributed cloud embedded in a multitude of edge devices. Levine highlights the rising forces driving this change:

The Internet of Things (IoT). Though the notion of IoT has been around for a few decades, it seems to be really taking off now, and our world will soon be inhabited by a multitude of smart cars, smart homes and smart everything, each with embedded compute, storage and networking. Levine gives a great example of a computer card found in today’s luxury cars, containing around 100 CPUs. Having several such cards in a car makes it a mini data center on wheels; having thousands of such cars on the roads makes for a massive distributed data center.

Big Data Analytics. The growing number of connected devices and sensors around us, constantly collecting real-world input, generates massive amounts of data of varied types, from temperature and pressure to images and video. That unstructured and highly variable data stream needs to be processed and analyzed in real time by the little brains of the smart devices in order to extract insights and make decisions. Just imagine your smart car approaching a stop sign: it needs to process the image input, recognize the sign and make the decision to stop, all in a second or less. Would you send that round trip to a remote cloud for the answer?
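To make the latency argument concrete, here is a back-of-the-envelope calculation in Python. It is only a sketch: the driving speed, cloud round-trip time and local inference time below are illustrative assumptions, not measured figures.

```python
# Illustrative latency budget for a stop-sign decision.
# All figures below are assumptions for the sake of the example.

SPEED_KMH = 50                # assumed urban driving speed
CLOUD_ROUND_TRIP_S = 0.200    # assumed network round trip to a remote cloud
LOCAL_INFERENCE_S = 0.020     # assumed on-board image-recognition time

speed_m_per_s = SPEED_KMH * 1000 / 3600

for label, latency in [("cloud round trip", CLOUD_ROUND_TRIP_S),
                       ("local (edge) inference", LOCAL_INFERENCE_S)]:
    distance = speed_m_per_s * latency
    print(f"{label}: {latency * 1000:.0f} ms -> car travels {distance:.2f} m before deciding")
```

Even with these modest assumptions, the car covers roughly ten times more road while waiting for a remote answer than it does deciding locally, which is the core of the edge argument.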

Machine Learning. While traditional computer algorithms are well suited to well-defined problem spaces, the real world produces complex, diverse and unstructured data. Levine believes endpoints will need to run Machine Learning algorithms to decipher that data effectively and derive intelligent insights and decisions for the countless permutations of situations that can occur in the real world.

So should Amazon, Microsoft and Google start worrying? Not really. The central cloud services will still be there, but with a different focus. Levine sees the central cloud’s role as curating data from the edge, performing central non-real-time learning whose results can then be pushed back to the edge, and providing long-term storage and archiving of the data. In this new incarnation, the entire world becomes the domain of IT.
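That division of labor, central non-real-time learning whose results are pushed back to the edge for local decisions, can be sketched in a few lines of Python. The threshold “model” and the sensor readings below are invented purely for illustration; they are not from the talk.

```python
import json

# --- Central cloud: non-real-time learning over curated edge data ---
historical_temps = [61, 64, 70, 72, 75, 79, 83, 88]   # assumed curated readings
model = {"overheat_threshold": sum(historical_temps) / len(historical_temps) + 10}
model_blob = json.dumps(model)                          # "pushed" to the edge device

# --- Edge device: real-time decisions using the pushed model ---
edge_model = json.loads(model_blob)

def decide(reading: float) -> str:
    """Local, low-latency decision; no cloud round trip needed."""
    return "throttle down" if reading > edge_model["overheat_threshold"] else "ok"

print(decide(90.0))   # -> throttle down
print(decide(70.0))   # -> ok
```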

You can watch the recording of Levine’s full talk here.

Microsoft Launches New Big Data Stream Analytics Cloud Service

In the last couple of weeks we saw the fight heating up between Google and Amazon over big data in the cloud. Now Microsoft is calling the bet, announcing the general availability of its Azure Stream Analytics (ASA). The new cloud service, launched in public preview late last year, enables real-time stream processing of data from various sources, such as applications, devices, sensors, mobile phones or connected cars. In fact, Microsoft places strong emphasis on Internet of Things use cases, a hot topic these days which Microsoft pioneered back in the 1990s but somehow managed to miss the wave, and is now trying to catch again.

Earlier this year Microsoft bought Revolution Analytics, the company behind the open source R programming language that has become popular for statistical analysis on big data, as part of its effort to build out an advanced analytics suite.

Like its competitors, Microsoft emphasizes making its stream analytics service easy to develop against and operate, so small companies and even start-ups can enter this hot field without massive up-front investment, all while leveraging the power of the cloud for transparent resilience, scalability, security and multi-tenancy.

Another interesting aspect is the built-in integration of Azure Stream Analytics with Event Hubs, Microsoft’s publish-subscribe messaging service, which was made generally available late last year and is said to be able to log millions of events per second. Microsoft also targets this service at Internet of Things and telemetry ingestion use cases. This part of Microsoft’s offering is similar to Google’s Pub/Sub and Amazon’s Kinesis.
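Services of this kind typically express stream processing as SQL-like queries over time windows. The Python sketch below only mimics the idea of a tumbling-window aggregation over device events; the event stream, window size and field names are illustrative assumptions, not ASA syntax or its API.

```python
from collections import defaultdict

WINDOW_S = 10  # assumed tumbling-window size, in seconds

# Assumed event stream: (timestamp_in_seconds, device_id, temperature)
events = [
    (1, "dev-1", 21.5), (4, "dev-2", 22.0), (9, "dev-1", 23.1),
    (12, "dev-1", 24.0), (15, "dev-2", 22.4), (19, "dev-2", 22.9),
]

# Group events into non-overlapping 10-second windows per device
windows = defaultdict(list)
for ts, device, temp in events:
    window_start = (ts // WINDOW_S) * WINDOW_S
    windows[(window_start, device)].append(temp)

# Emit one average per device per window, as a streaming query would
for (window_start, device), temps in sorted(windows.items()):
    print(f"[{window_start}s-{window_start + WINDOW_S}s] {device}: "
          f"avg temp {sum(temps) / len(temps):.1f}")
```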

In a blog post, Joseph Sirosh, Corporate Vice President of Information Management & Machine Learning at Microsoft, shares customer use cases from Fujitsu, NEC and Aerocrine. Quoting Allen Ganz, Director of Business Development at NEC:

NEC has found that using the Azure IoT Services has enabled us to quickly build compelling intelligent digital signage solutions that meet our customer’s needs and help them transform their business processes

Microsoft, like its competitors, is aiming to provide a full and organic suite covering the entire cycle of big data ingestion, processing and analytics, to cater to the proliferation of big data use cases and ventures, especially around the Internet of Things.

You can read more on the new Azure Stream Analytics here.

1311765722_picons03 Follow Dotan on Twitter!

2 Comments

Filed under Big Data, Cloud, IoT

Google-Amazon Fight Over Big Data In The Cloud Is Heating Up

Google today announced that it is releasing Cloud Dataflow in open beta. This big data analytics service was launched in closed beta at Google’s annual developer conference last June, with a major update last December when Google released an open source Java SDK to make it easier for developers to integrate with the new service.

Just last month Google announced that it was moving its Cloud Pub/Sub into public beta. This service for real-time messaging is yet another layer in the overall big data and analytics suite that Google has been building up.

Google’s strategy aims to cover the full big data and analytics cycle of Capture->Store->Process->Analyze from within Google Cloud Platform’s own services (such as Pub/Sub, Dataflow and BigQuery), as well as by plugging in popular external frameworks such as Hadoop, Spark and Kafka, in a modular way.
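As a rough illustration of that Capture->Store->Process->Analyze cycle, here is a toy Python pipeline. The in-memory stand-ins and the service mapping noted in the comments are a loose interpretation for illustration only, not an official architecture.

```python
# Toy end-to-end pipeline: Capture -> Store -> Process -> Analyze.
# The in-memory lists stand in for real services; the mapping in the
# comments is only a rough interpretation of Google's suite.

def capture():                       # Capture (cf. Cloud Pub/Sub)
    return [{"user": "a", "clicks": 3}, {"user": "b", "clicks": 7},
            {"user": "a", "clicks": 5}]

def store(records):                  # Store (cf. Cloud Storage / BigQuery tables)
    return list(records)             # durable copy of the raw events

def process(storage):                # Process (cf. Cloud Dataflow)
    totals = {}
    for rec in storage:
        totals[rec["user"]] = totals.get(rec["user"], 0) + rec["clicks"]
    return totals

def analyze(totals):                 # Analyze (cf. BigQuery)
    top_user = max(totals, key=totals.get)
    return f"top user: {top_user} ({totals[top_user]} clicks)"

print(analyze(process(store(capture()))))   # -> top user: a (8 clicks)
```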

Google Cloud Platform BI Suite

Google’s offering comes as a response to Amazon’s offering in the big data and analytics area, with services such as Kinesis, Redshift, Elastic MapReduce and Lambda. It is interesting to note that last week, at the AWS Summit in San Francisco, Amazon announced that the Lambda service is generally available for production use. Amazon also maintains its smart strategy of tightening the integration between its services, now enabling AWS Lambda functions to run in response to events in Amazon Cognito.
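For readers unfamiliar with Lambda’s event-driven model, a function is essentially a handler that the platform invokes whenever a triggering event arrives. The minimal Python sketch below uses the standard lambda_handler(event, context) entry point; the event field it reads is a made-up placeholder, not the actual Cognito event schema.

```python
import json

def lambda_handler(event, context):
    """Invoked by AWS Lambda when a triggering event arrives.

    The 'event' payload shape varies by source; the field read
    below is an illustrative placeholder only.
    """
    dataset = event.get("datasetName", "unknown")      # hypothetical field
    print(f"sync event received for dataset: {dataset}")
    return {"statusCode": 200, "body": json.dumps({"processed": dataset})}
```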

Amazon also puts emphasis on optimizing its infrastructure services for big data. A couple of weeks ago AWS launched a new type of EC2 instance with high-density storage, optimized for storing and processing multi-terabyte data sets.

Another very interesting announcement from AWS last week was the new Amazon Machine Learning service, which adds an important analytics dimension to its suite.

Amazon and Google are not the only players in big data cloud services. With big companies such as Oracle and Microsoft joining in, this market is definitely heating up.

Big Data Processing on Amazon Cloud with New High Density Storage EC2 Instances

Are you running on Amazon’s cloud and struggling with handling big data? Then check this out: Amazon Web Services today released the new D2 series of EC2 instances, supporting dense storage which can handle multi-terabyte data sets.

The new instances don’t just provide better CPU and memory specs. They are geared for a sustained high rate of sequential disk I/O for access to extremely large data sets, reaching up to 3,500 MB/second read and 3,100 MB/second write performance on the largest instances. The new instances also come with enhanced networking and support for placement groups, which boosts network performance between instances. This makes the D2 instances a classic fit for use cases such as Hadoop clusters and their MapReduce jobs, massively parallel processing data warehouses, log processing and so on.
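To put those throughput figures in perspective, here is a quick back-of-the-envelope calculation in Python; the 10 TB data set size is an assumed example, and decimal units are used for simplicity.

```python
# How long does a full sequential scan of a large data set take at the
# quoted D2 read rate? (Data set size below is an assumed example.)

DATASET_TB = 10
READ_MB_PER_S = 3500          # quoted peak sequential read on the largest instances

dataset_mb = DATASET_TB * 1_000_000    # decimal units for simplicity
seconds = dataset_mb / READ_MB_PER_S
print(f"{DATASET_TB} TB at {READ_MB_PER_S} MB/s ≈ {seconds / 60:.0f} minutes")
```

Under these assumptions a full 10 TB scan takes on the order of 48 minutes on a single instance, which is why such workloads are typically spread across a cluster.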

It’s important to note that the disk I/O boost applies to the local ephemeral storage, which is gone once the EC2 compute instance is gone. So it is up to the user to take care of data redundancy as needed, whether in RAID form or using a distributed file system such as HDFS or GlusterFS. The new instances also come EBS-optimized by default, so you can offload local data as needed to EBS (Amazon’s native block storage) volumes over dedicated bandwidth that doesn’t impact your regular network traffic.

The Amazon folks did nice work integrating advanced features of the Linux kernel and of the Intel Xeon CPUs. If you need to chew through massive data sets, you should check it out.

Oracle Boosts Its Big Data Offering

Oracle, the proverbial SQL icon, knows it cannot ignore the big data trend, nor does it attempt to. On the contrary: Oracle has been promoting big data offerings, both in-house and via acquisitions such as the recent Datalogix deal. Yesterday Oracle announced an important milestone with the release of a suite of big data solutions.

In modern organizations data comes from multiple and diverse sources, both structured and unstructured, and across various technologies (e.g. Hadoop, relational databases). Advanced analytics requirements, such as real-time counting analytics alongside historical trend analytics, call for yet more technologies, resulting in highly complex solutions. Oracle identifies this complexity and offers native integration of its big data solutions with Oracle’s SQL relational databases, with one uniform façade for analytics.

While Hadoop’s advantages are well known, it is still non-trivial for analysts and data scientists to extract analytics and gain insights from it. Oracle Big Data Discovery addresses this, providing a “visual face of Hadoop”.

Oracle also extends its GoldenGate replication platform with the release of Oracle GoldenGate for Big Data, which can replicate unstructured data to the Hadoop stack (Hive, HBase, Flume). Another aspect of the uniform façade is Oracle SQL: with Oracle Big Data SQL 1.1, queries can now transparently access data in Hadoop, NoSQL and Oracle Database.

Oracle’s strategy is to leverage its brand and the SQL installations that already exist within enterprises, offering them enterprise-grade versions of the popular open-source tools natively integrated with their traditional Oracle SQL databases. It remains to be seen how that catches on with enterprises against the hype of the popular big data vendors and open source projects.

Microsoft Expands Open Source and Big Data Involvement, Acquires Revolution Analytics

When you think about Microsoft, you probably think the very opposite of Open Source Software (OSS). Some would even go as far as picturing it as the Dr. Evil of OSS. But recent moves show Microsoft begs to differ. This week Microsoft announced the acquisition of Revolution Analytics, the company behind the open source R programming language that has become popular for statistical analysis on big data. As Joseph Sirosh, Corporate Vice President, Machine Learning at Microsoft, writes in his blog post:

We are making this acquisition to help more companies use the power of R and data science to unlock big data insights with advanced analytics

Microsoft’s acquisition comes after it showed interest in the language, both using it internally through Microsoft’s data scientists and frameworks, and actively contributing to open source R projects such as ParallelR and RHadoop.

This joins other Microsoft contributions to open source, such as Linux kernel contributions (yes, Microsoft, the father of Windows, contributing to Linux). Microsoft has also released some of its core assets as open source, such as the .NET Core framework and the REEF big-data analytics framework for YARN, among other open-source projects.

Microsoft’s recent moves also show its recognition that Big Data Analytics is where the world is heading. Organizations have accumulated data and are now looking for ways to monetize it using the most advanced technologies and languages available. Microsoft got a painful reality check a couple of months ago, when Facebook decided to dump Microsoft’s Bing and develop its own revamped big data search framework. Facebook, Twitter, Google and the like have long realized the potential of their big data and have been developing advanced big data analytics technology to address it.

Microsoft opened 2015 with an impressive acquisition, marking an important realization around open source software and big data analytics. Such a statement hints at more to come later in the year.

—————————————————————-
Update: Microsoft released its Azure Stream Analytics big data cloud service. Check out the details in this post.
