Component versions

Description

VK CS Hadoop is based on the Hortonworks Data Platform (HDP) version 3.1. HDP is a ready-to-use, proven distribution of Apache Hadoop components for data processing, storage, and analysis.

Templates and versions

The VK Cloud Solutions Big Data service provides Hadoop templates for versions 2.6 and 3.1.

The component versions for each template are listed below:

Hadoop 2.6

| Component | Version | Status | Description |
|---|---|---|---|
| Ambari Metrics | 0.1.0 | Installed | A system for collecting, storing, and retrieving cluster metrics. |
| HDFS | 2.7.3 | Installed | Apache Hadoop Distributed File System. |
| Hive | 1.2.1000 | Installed | A data warehouse for analyzing large datasets and running ad-hoc queries using SQL. |
| Jupyter | 1.0.0 | Installed | Jupyter interactive notebook environment. |
| Kafka | 1.0.0 | Installed | High-throughput distributed messaging system. |
| MapReduce2 | 2.7.3 | Installed | Data processing service based on the MapReduce paradigm. |
| Oozie | 4.2.0 | Installed | A system for managing workflows and running recurring jobs in the Hadoop ecosystem. Includes installation of the optional Oozie Web Console, which uses the ExtJS library. |
| Pig | 0.16.0 | Installed | A platform for processing big data with scripts. |
| Slider | 0.92.0 | Installed | A framework for deploying existing distributed applications on YARN with management and monitoring capabilities. |
| Spark2 | 2.3.0 | Installed | A fast and flexible platform for processing large amounts of data. |
| Sqoop | 1.4.6 | Installed | A tool for transferring bulk data between Apache Hadoop and structured data stores (e.g. relational databases). |
| Superset | 0.15.0 | Installed | A platform for interactive exploratory data analysis. |
| Tez | 0.7.0 | Installed | A query processing framework that runs on top of YARN. |
| YARN | 2.7.3 | Installed | Resource manager for the Hadoop ecosystem. |
| Zeppelin Notebook | 0.7.3 | Installed | Web notebook for interactive data analysis. Lets you create interactive, collaborative documents that combine SQL, Scala, and other languages. |
| ZooKeeper | 3.4.6 | Installed | Centralized service for highly reliable distributed coordination. |
| Accumulo | 1.7.0 | Manual addition | Reliable, scalable, high-performance distributed key-value store. |
| Airflow | 1.9.0 | Manual addition | A workflow scheduler that helps you plan complex workflows and provides an easy way to maintain them. |
| Ambari Infra | 0.1.0 | Manual addition | Core shared service used by Ambari-managed components. |
| Atlas | 0.8.0 | Manual addition | A platform for managing cluster metadata. |
| Druid | 0.10.1 | Manual addition | Fast distributed columnar data store. |
| Falcon | 0.10.0 | Manual addition | A platform for data management and processing. |
| Flume | 1.5.2 | Manual addition | Distributed service for collecting, aggregating, and transferring large amounts of streaming data to HDFS. |
| HBase | 1.1.2 | Manual addition | Non-relational (NoSQL) distributed database, plus a high-performance SQL layer for low-latency applications. |
| Kerberos | 1.10.3-10 | Manual addition | A network authentication protocol based on the concept of tickets. Allows nodes communicating over an unsecured channel to securely identify each other. |
| Knox | 0.12.0 | Manual addition | A service that provides a single point of authentication and access to Hadoop cluster services. |
| Log Search | 0.5.0 | Manual addition | A log aggregation, analysis, and visualization tool for Ambari-managed services. Currently in Technical Preview. |
| Mahout | 0.9.0 | Manual addition | A platform for creating free implementations of distributed or otherwise scalable machine learning algorithms. Used primarily for collaborative filtering, clustering, and classification tasks. |
| Ranger | 0.7.0 | Manual addition | A service that provides comprehensive security for the Hadoop cluster. |
| Ranger KMS | 0.7.0 | Manual addition | Security key management server. |
| SmartSense | 1.4.5.2.6.2.2-1 | Manual addition | A tool for quickly collecting settings, metrics, and logs from Hadoop cluster services. Provides recommendations for a specific cluster and helps resolve problems promptly. |
| Spark | 1.6.3 | Manual addition | A fast and flexible platform for processing large amounts of data. |
| Storm | 1.1.0 | Manual addition | A framework for processing streaming data. |

Hadoop 3.1

| Component | Version | Status | Description |
|---|---|---|---|
| Ambari Metrics | 0.1.0 | Installed | A system for collecting, storing, and retrieving cluster metrics. |
| HBase | 2.0.0.3.1 | Installed | Non-relational (NoSQL) distributed database, plus a high-performance SQL layer for low-latency applications. |
| HDFS | 3.1.1.3.1 | Installed | Apache Hadoop Distributed File System. |
| Hive | 3.0.0.3.1 | Installed | A data warehouse for analyzing large datasets and running ad-hoc queries using SQL. |
| Jupyter | 1.0.0 | Installed | Jupyter interactive notebook environment. |
| Kafka | 1.0.0.3.1 | Installed | High-throughput distributed messaging system. |
| MapReduce2 | 3.0.0.3.1 | Installed | Data processing service based on the MapReduce paradigm. |
| Oozie | 4.4.0 | Installed | A system for managing workflows and running recurring jobs in the Hadoop ecosystem. Includes installation of the optional Oozie Web Console, which uses the ExtJS library. |
| Pig | 0.16.1.3.1 | Installed | A platform for processing big data with scripts. |
| Spark2 | 2.3.0 | Installed | A fast and flexible platform for processing large amounts of data. |
| Sqoop | 1.4.7 | Installed | A tool for transferring bulk data between Apache Hadoop and structured data stores (e.g. relational databases). |
| Tez | 0.9.0.3.1 | Installed | A query processing framework that runs on top of YARN. |
| YARN | 3.1.0 | Installed | Resource manager for the Hadoop ecosystem. |
| Zeppelin Notebook | 0.8.0 | Installed | Web notebook for interactive data analysis. Lets you create interactive, collaborative documents that combine SQL, Scala, and other languages. |
| ZooKeeper | 3.4.9.3.1 | Installed | Centralized service for highly reliable distributed coordination. |
| Accumulo | 1.7.0 | Manual addition | Reliable, scalable, high-performance distributed key-value store. |
| Airflow | 1.10.11 | Manual addition | A workflow scheduler that helps you plan complex workflows and provides an easy way to maintain them. |
| Atlas | 0.7.0.3.1 | Manual addition | A platform for managing cluster metadata. |
| Druid | 0.12.1 | Manual addition | Fast distributed columnar data store. |
| Infra Solr | 0.1.0 | Manual addition | Core shared service used by Ambari-managed components. |
| Kerberos | 1.10.3-30 | Manual addition | A network authentication protocol based on the concept of tickets. Allows nodes communicating over an unsecured channel to securely identify each other. |
| Knox | 0.5.0.3.1 | Manual addition | A service that provides a single point of authentication and access to Hadoop cluster services. |
| Log Search | 0.5.0 | Manual addition | A log aggregation, analysis, and visualization tool for Ambari-managed services. Currently in Technical Preview. |
| NiFi | 1.9.0 | Manual addition | Apache NiFi is an easy-to-use, powerful, and reliable system for processing and distributing data. |
| NiFi Registry | 0.3.0 | Manual addition | NiFi Registry is an add-on application that provides a central location for storing and managing shared resources across one or more NiFi and/or MiNiFi instances. |
| Ranger | 1.2.0.3.1 | Manual addition | A service that provides comprehensive security for the Hadoop cluster. |
| Ranger KMS | 1.2.0.3.1 | Manual addition | Security key management server. |
| Schema Registry | 0.7.0 | Manual addition | The Hortonworks Registry provides a schema registry, a machine learning registry, and a platform for object versioning. |
| SmartSense | 1.5.1.2.7.3.0-139 | Manual addition | A tool for quickly collecting settings, metrics, and logs from Hadoop cluster services. Provides recommendations for a specific cluster and helps resolve problems promptly. |
| Storm | 1.2.1 | Manual addition | A framework for processing streaming data. |
| Streaming Analytics Manager | 0.6.0 | Manual addition | Hortonworks Streaming Analytics Manager makes it easy to create streaming applications and perform streaming analytics. |
| Superset | 0.23.0 | Manual addition | A platform for interactive exploratory data analysis. |
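
When choosing between the two templates, it can help to see which components were dropped, added, or changed version. The following Python sketch illustrates one way to do that. It uses only a small subset of the entries listed above, and the helper name `compare_templates` is hypothetical, not part of any VK Cloud Solutions tooling.

```python
# A few entries copied from the Hadoop 2.6 and Hadoop 3.1 listings above.
# This is an illustrative subset, not the full component list.
HADOOP_2_6 = {
    "HDFS": "2.7.3",
    "Hive": "1.2.1000",
    "Oozie": "4.2.0",
    "Spark2": "2.3.0",
    "Slider": "0.92.0",
    "Falcon": "0.10.0",
}
HADOOP_3_1 = {
    "HDFS": "3.1.1.3.1",
    "Hive": "3.0.0.3.1",
    "Oozie": "4.4.0",
    "Spark2": "2.3.0",
    "NiFi": "1.9.0",
}


def compare_templates(old, new):
    """Return (removed, added, changed) between two {component: version} maps."""
    removed = sorted(set(old) - set(new))   # only in the old template
    added = sorted(set(new) - set(old))     # only in the new template
    changed = {
        name: (old[name], new[name])
        for name in sorted(set(old) & set(new))
        if old[name] != new[name]           # same component, different version
    }
    return removed, added, changed


removed, added, changed = compare_templates(HADOOP_2_6, HADOOP_3_1)
print("Only in 2.6:", removed)
print("Only in 3.1:", added)
for name, (old_version, new_version) in changed.items():
    print(f"{name}: {old_version} -> {new_version}")
```

Version strings are compared as plain text here, so the sketch reports that a version differs but does not say which one is newer.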

Updating versions

Component versions are subject to change without notice. Use the Ambari web interface to view the latest component versions.
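
The Ambari web interface remains the documented way to check versions. If you prefer a script, the sketch below shows how the same information might be read from the Ambari REST API. It makes several assumptions: that the API is reachable on your cluster, that the stack services endpoint and the `StackServices/service_name` and `StackServices/service_version` fields behave as in a standard Ambari installation, and that basic authentication is enabled. The host name and credentials in the commented usage are placeholders.

```python
import base64
import json
from urllib.request import Request, urlopen


def stack_services_url(base_url, stack="HDP", version="3.1"):
    """Build the (assumed) Ambari endpoint that lists services in a stack."""
    return (
        f"{base_url.rstrip('/')}/api/v1/stacks/{stack}/versions/{version}"
        "/services?fields=StackServices/service_version"
    )


def parse_service_versions(payload):
    """Map service name to version from an Ambari-style JSON response."""
    return {
        item["StackServices"]["service_name"]: item["StackServices"]["service_version"]
        for item in payload.get("items", [])
    }


def fetch_service_versions(base_url, user, password, stack="HDP", version="3.1"):
    """Query Ambari with HTTP basic auth and return {service: version}."""
    request = Request(stack_services_url(base_url, stack, version))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urlopen(request) as response:
        return parse_service_versions(json.load(response))


# Hypothetical usage; replace the host and credentials with your own:
# versions = fetch_service_versions("http://ambari.example.internal:8080", "admin", "secret")
# for name, service_version in sorted(versions.items()):
#     print(name, service_version)
```

The parsing step is kept separate from the network call so that it can be checked against a saved response without access to a cluster.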