• Azure HDInsight

    Azure HDInsight is a fully managed Apache Hadoop service, using the Hortonworks Data Platform (HDP) Hadoop distribution. HDInsight also includes other popular platforms that are commonly deployed alongside Hadoop, such as Apache HBase, Apache Storm, Apache Spark, and R Server for Hadoop.

    HDInsight is used in big data scenarios. In this case, big data refers to a large volume of collected—and likely continually growing—data that is stored in a variety of unstructured or structured formats. This can include data from web logs, social networks, Internet of Things (IoT), or machine sensors—either historical or real-time. For such large amounts of data to be useful, you have to be able to ask the right question. To ask the right question, the data needs to be readily accessible, cleansed (removing elements that may not be applicable to the context), analyzed, and presented. This is where HDInsight comes into the picture.

    The Hadoop technology stack has become the de facto standard in big data analysis. The Hadoop ecosystem includes many tools—HBase, Storm, Pig, Hive, Oozie, and Ambari, just to name a few. You can certainly build your own custom Hadoop solution using Azure VMs. Or you can leverage the Azure platform, via HDInsight, to provision and manage one for you. You can even deploy HDInsight clusters on Windows or Linux. Provisioning Hadoop clusters with HDInsight can be a considerable time saver (versus manually doing the same).

    Source of Information : Microsoft Azure Essentials Fundamentals of Azure Second Edition


0 comments:

Leave a Reply