Experience in deploying small- and large-scale Hadoop clusters.
Hadoop enables the creation of clusters of machines and coordinates work among them.
Clusters can be built and scaled out with inexpensive computers.
Administering and managing the Hadoop core software packages.
The Hadoop core software package includes the robust, reliable Hadoop Distributed File System (HDFS), which splits user data across servers in a cluster and uses replication to ensure that even multiple node failures will not cause data loss.
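A minimal sketch of working against HDFS through the standard org.apache.hadoop.fs.FileSystem API; the file path and the data written are illustrative assumptions. It creates a file, which HDFS splits into replicated blocks, and then reads back the file's replication factor.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReplicationCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();            // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);                // connect to the configured file system

        Path file = new Path("/tmp/replication-demo.txt");   // hypothetical path for illustration
        try (FSDataOutputStream out = fs.create(file)) {     // HDFS splits the file into blocks
            out.writeUTF("sample data");
        }

        // Each block is replicated across data nodes; the factor is configurable per file.
        short replication = fs.getFileStatus(file).getReplication();
        System.out.println("Replication factor: " + replication);
    }
}
```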
It also includes MapReduce, a parallel, distributed processing system designed for clusters of commodity, shared-nothing hardware. MapReduce offers a clean abstraction between data analysis tasks and the underlying systems challenges involved in ensuring reliable large-scale computation.
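A minimal sketch of that abstraction using the standard org.apache.hadoop.mapreduce API, in the form of the classic word count; the class names are illustrative. The mapper and reducer hold only the analysis logic, while the framework handles scheduling, data movement, and retries across the cluster.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {
    // The mapper expresses the per-record analysis step.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);   // emit (word, 1)
            }
        }
    }

    // The reducer aggregates all values for a key, regardless of which node produced them.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```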
Deploying algorithms to perform various tasks.
No special programming techniques are required to run analyses with Hadoop MapReduce; many existing algorithms can be expressed as MapReduce jobs with little or no modification.
Hadoop MapReduce takes advantage of the distribution, reliability, and replication of data in the Hadoop Distributed File System to spread the execution of a job across many nodes in a cluster.
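A minimal driver sketch, assuming the WordCount classes above and hypothetical HDFS input/output paths, showing how a job is submitted so the framework can schedule map tasks near the HDFS blocks they read.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class);
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input blocks already replicated in HDFS become map tasks that the
        // scheduler tries to run on the nodes storing those blocks.
        FileInputFormat.addInputPath(job, new Path("/data/input"));     // hypothetical path
        FileOutputFormat.setOutputPath(job, new Path("/data/output"));  // hypothetical path

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```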
Existing algorithms have been applied in areas such as:
Recommendations or decision support
Log processing – application or web server logs, phone logs, help desk logs, etc. (see the sketch after this list)
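A minimal sketch of a log-processing mapper that counts requests per HTTP status code; the access-log layout and field position are assumptions for illustration, and the summing reduce side can reuse a reducer like the one sketched earlier.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StatusCodeMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text status = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assumes a common access-log layout where the status code is the 9th space-separated field.
        String[] fields = value.toString().split(" ");
        if (fields.length > 8) {
            status.set(fields[8]);
            context.write(status, ONE);   // emit (status code, 1); the reduce side sums the counts
        }
    }
}
```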