Saturday, August 26, 2017

Evaluate Big Data with Hadoop

Of all recent IT innovations, Hadoop has certainly brought the biggest upheavals for companies. The solution promises to turn the steadily growing flood of data into a source of value. In my industry, media and telecommunications, Hadoop offers a whole range of analyses that can be used in areas as diverse as network planning, customer service, IT security, fraud detection, and targeted advertising.


The second generation


So far, however, many mainstream companies have found it difficult to exploit this data potential. Many have experimented with some of the 13 Apache Hadoop functional modules, a set of technologies into which early adopters such as eBay, Facebook, and Yahoo deployed large teams and invested several years of work.


All purpose weapon for Big Data


First-generation Hadoop technology (1.x) was neither easy to implement nor easy to operate. New users struggled to configure the different components of a Hadoop cluster. Seemingly minor and therefore easily overlooked details, such as patch versions, proved to be extremely important. As a result, the service failed more often than expected, and many problems only became apparent under high load. Many businesses still lack the necessary expertise, even though leading providers such as Hortonworks are performing well.


Farewell to the data parking lot


Many of these gaps are fortunately closed by the second generation of Hadoop tools, Hortonworks HDP 2.0, which was the subject of lively discussion at the recent Hadoop Summit in Amsterdam in 2017.




One of customers' key expectations is that the system be easy to operate. This is particularly true of the business-critical applications that service providers have to deal with. With the intuitive web interface Ambari, Hadoop has taken a big step forward here. With Ambari, Hadoop clusters are much easier to set up, manage, and monitor.


Ambari provides automated initial installation as well as ongoing upgrades without service interruption, coupled with high availability and disaster recovery - all factors essential to efficient IT operations.
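Ambari also exposes the cluster through a REST API, which is what makes automated monitoring practical. The sketch below is a minimal illustration of that idea; the host name, cluster name, and sample response are illustrative assumptions, while the URL layout (`/api/v1/clusters/<name>/hosts` on port 8080) follows Ambari's documented API conventions.

```python
import json

# Hypothetical Ambari host and cluster name -- adjust to your environment.
AMBARI_HOST = "ambari.example.com"
CLUSTER = "prod"

def ambari_url(host, cluster, resource=""):
    """Build an Ambari REST API v1 URL for a cluster resource."""
    base = f"http://{host}:8080/api/v1/clusters/{cluster}"
    return f"{base}/{resource}" if resource else base

def unhealthy_hosts(api_response):
    """Pick out hosts whose state is not HEALTHY from a hosts response."""
    return [item["Hosts"]["host_name"]
            for item in api_response.get("items", [])
            if item["Hosts"].get("host_state") != "HEALTHY"]

# A trimmed-down sample of what GET .../hosts?fields=Hosts/host_state
# might return; in practice you would fetch this over HTTP.
sample = json.loads("""{
  "items": [
    {"Hosts": {"host_name": "node1", "host_state": "HEALTHY"}},
    {"Hosts": {"host_name": "node2", "host_state": "HEARTBEAT_LOST"}}
  ]
}""")

print(ambari_url(AMBARI_HOST, CLUSTER, "hosts"))
print(unhealthy_hosts(sample))  # ['node2']
```

A nightly cron job built on such a check is one way operations teams turn Ambari's monitoring data into alerts.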


In addition, the ecosystem of independent software vendors developing around Hadoop is growing. This is important for two reasons: first, purchasing decisions depend on how well Hadoop can be integrated into the existing IT environment, which in most cases includes business intelligence solutions and data warehouses from traditional providers. Second, it eases concerns about the lack of Hadoop knowledge in one's own team.


For example, Deutsche Telekom has more than 600 IT employees with SQL knowledge. While many of them will now build deeper Hadoop expertise, product-level integrations such as those offered by Microsoft and Teradata also allow employees who are not (yet) Hadoop specialists to query Hadoop.


Improved security and optimized data lifecycle management also play a major role for companies that want to build a general-purpose platform for Big Data that can serve different departments, applications, and data policies. For security, the Knox system provides a single, secure access point for the entire Apache Hadoop cluster. Falcon provides a data lifecycle management framework that uses a declarative, XML-like language to control data movement, coordinate data pipelines, and set policies for data lifecycle and processing.
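The "single access point" idea behind Knox can be sketched in a few lines: every cluster-internal service is reached through one gateway URL. The host and topology name below are assumptions; the path layout (`/gateway/<topology>/<service>` on port 8443) follows Knox's documented pattern.

```python
# Sketch of how Knox funnels service access through one gateway endpoint.
# Gateway host and topology name ("default") are illustrative assumptions.

def knox_url(gateway_host, topology, service_path):
    """Map a cluster-internal service path to its single Knox entry point."""
    return f"https://{gateway_host}:8443/gateway/{topology}/{service_path}"

# The same gateway fronts WebHDFS, Oozie, HBase, and other services,
# so firewalls and authentication only need to cover one endpoint.
print(knox_url("knox.example.com", "default", "webhdfs/v1/tmp?op=LISTSTATUS"))
```

Because all traffic passes through this one endpoint, authentication, auditing, and firewall rules can be enforced in a single place rather than per service.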


Perhaps the most important point, however, is that with Hadoop's increasing popularity in companies, it has become clear that the system must support a wide range of processing models - beyond batch processing - in order to serve a broader range of applications in typical companies. Most companies want to store data in the Hadoop Distributed File System (HDFS) and access it concurrently in different ways while maintaining the same service level.


Hadoop 2.0 also includes the resource management layer YARN, which decouples resource management from the applications themselves and supports a variety of processing models beyond MapReduce, including interactive processing, online processing, streaming, and graph processing. So one can say without exaggeration that Hadoop has evolved from a low-cost data parking lot into a platform that supports fast, well-founded decisions.
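The classic batch model that YARN still runs alongside these newer engines is MapReduce, and its canonical example is a word count. The sketch below mimics a Hadoop Streaming job in plain Python: in a real job the mapper and reducer read stdin, and the sort between them is done by the framework's shuffle phase; here they are ordinary functions so the pipeline can be run locally.

```python
# A minimal word count in the MapReduce style, runnable locally.
# In Hadoop Streaming, mapper and reducer would be separate scripts
# reading stdin; the sorted() call stands in for the shuffle/sort step.
from itertools import groupby

def mapper(lines):
    """Emit (word, 1) for every word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum counts per word; input must be sorted by key (the shuffle phase)."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

data = ["Hadoop stores data", "YARN schedules Hadoop applications"]
shuffled = sorted(mapper(data))  # stand-in for the framework's shuffle
print(dict(reducer(shuffled)))
```

The same logical pipeline - map, shuffle, reduce - scales from this toy input to the multi-terabyte datasets that first made Hadoop famous.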


A good example of this is Gigaset, a company known for its cordless telephones and formerly a business unit of the Siemens Group. With "Gigaset Elements", its intelligent solution for the connected home, the company fully exploits the possibilities of modern big data technologies. With the help of Hadoop, Gigaset is opening up a completely new market in which further business models are likely to become possible.


Elements consists of a cluster of small sensors that can be installed quickly and easily in any home - they can simply be attached to doors or windows. The robust and easy-to-use Elements sensors monitor the home and send their data to the Hadoop cloud via a base station.


An example from practice


This may seem relatively simple, but the various warnings, events, and pings sent by Elements quickly add up to ten million messages per day. The traffic generated by millions of doors being opened and closed under the watchful eye of Elements resembles a denial-of-service attack.
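To put the figure from the text in perspective, a back-of-the-envelope calculation shows what ten million messages per day means as a sustained rate:

```python
# Average message rate implied by the ten million messages per day
# mentioned above, spread over the 86,400 seconds in a day.
messages_per_day = 10_000_000
seconds_per_day = 24 * 60 * 60
print(round(messages_per_day / seconds_per_day))  # roughly 116 messages/second
```

A steady load of over a hundred messages every second, around the clock, is exactly the kind of sustained ingest that horizontally scalable storage like HDFS was built for.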


This sea of raw data is first sorted according to statistical relevance. How the data is interpreted and what decisions follow is left to the individual customer, who sees the visualized data on his or her smartphone or computer. Customers can, for example, alert external service providers, such as rescue or security services.
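The sorting step described above can be sketched as a simple relevance filter. The event types and the scores assigned to them are illustrative assumptions, not Gigaset's actual rules; the point is only that routine pings are dropped while notable events rise to the top.

```python
# Sketch of sorting raw sensor messages by statistical relevance before
# they reach the customer's app. Event types and scores are hypothetical.

RELEVANCE = {"ping": 0.0, "door_open": 0.3, "door_forced": 0.9, "smoke": 1.0}

def relevant_events(events, threshold=0.5):
    """Keep only events scored at or above the threshold, most relevant first."""
    scored = [(RELEVANCE.get(e["type"], 0.0), e) for e in events]
    return [e for score, e in sorted(scored, key=lambda s: -s[0])
            if score >= threshold]

stream = [
    {"type": "ping", "sensor": "base-1"},
    {"type": "door_open", "sensor": "front-door"},
    {"type": "door_forced", "sensor": "back-door"},
]
print(relevant_events(stream))  # only the forced-door event survives
```

In a real deployment the scores themselves would come from statistical models trained on the full Hadoop-stored history, rather than from a fixed table.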


Outlook


This new real-time information system for consumers, rooted in the growing Internet of Things, is years ahead of the traditional end-user business.


So much for the story of a company that has made a leap forward with Hadoop. But when will others follow this example? My prediction is that by 2017, more than half of the world's 2,000 largest corporations will be using Hadoop productively. I also expect that within five years we will see significantly higher profitability in many industries. Companies that rely heavily on Hadoop will have the edge.
