Posted by a user · 2022-04-23 20:27
2 answers
Helpful user · 2023-07-08 10:33
Install Spark
# Unpack the release and move it into place
tar -zxvf spark-1.3.0-bin-hadoop2.3.tgz
mkdir -p /usr/local/spark
mv spark-1.3.0-bin-hadoop2.3 /usr/local/spark
# Add Spark to the environment system-wide
vim /etc/bashrc
export SPARK_HOME=/usr/local/spark/spark-1.3.0-bin-hadoop2.3
# SCALA_HOME is assumed to have been exported by an earlier Scala install step
export PATH=$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH
source /etc/bashrc
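A minimal sketch to sanity-check the step above: apply the same exports and confirm Spark's bin directory actually landed on PATH (paths are the ones used in this guide).

```shell
# Apply the exports from /etc/bashrc above
export SPARK_HOME=/usr/local/spark/spark-1.3.0-bin-hadoop2.3
export PATH=$SPARK_HOME/bin:$PATH

# Check that $SPARK_HOME/bin appears as a PATH component
case ":$PATH:" in
  *":$SPARK_HOME/bin:"*) echo "SPARK_HOME/bin is on PATH" ;;
  *)                     echo "PATH was not updated" ;;
esac
```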
# Create spark-env.sh from the template and set the cluster options
cd /usr/local/spark/spark-1.3.0-bin-hadoop2.3/conf/
cp spark-env.sh.template spark-env.sh
vim spark-env.sh
export JAVA_HOME=/java
export SCALA_HOME=/usr/lib/scala/scala-2.10.5
export SPARK_HOME=/usr/local/spark/spark-1.3.0-bin-hadoop2.3
export SPARK_MASTER_IP=192.168.137.101
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/home/hadoop/hadoop/etc/hadoop
export SPARK_LIBRARY_PATH=$SPARK_HOME/lib
export SCALA_LIBRARY_PATH=$SPARK_LIBRARY_PATH
# List the worker hostnames in the slaves file, one per line
cp slaves.template slaves
vim slaves
hd1
hd2
hd3
hd4
hd5
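The slaves file above can also be generated from a host list instead of typed by hand. A sketch (hd1..hd5 are the hostnames from this guide; the file is written to a temp path here so nothing real is overwritten):

```shell
# Worker hostnames, space-separated; left unquoted below so the shell
# splits them into one printf argument each
workers="hd1 hd2 hd3 hd4 hd5"

# Write one hostname per line to a temporary file
slaves_file=$(mktemp)
printf '%s\n' $workers > "$slaves_file"
cat "$slaves_file"
```

In a real run you would write to `$SPARK_HOME/conf/slaves` instead of the temp file.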
7. Distribute to the other nodes
scp /etc/bashrc hd2:/etc
scp /etc/bashrc hd3:/etc
scp /etc/bashrc hd4:/etc
scp /etc/bashrc hd5:/etc
scp -r /usr/local/spark/spark-1.3.0-bin-hadoop2.3 hd2:/usr/local/spark/
scp -r /usr/local/spark/spark-1.3.0-bin-hadoop2.3 hd3:/usr/local/spark/
scp -r /usr/local/spark/spark-1.3.0-bin-hadoop2.3 hd4:/usr/local/spark/
scp -r /usr/local/spark/spark-1.3.0-bin-hadoop2.3 hd5:/usr/local/spark/
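The eight scp commands above can be written as one loop. The leading `echo` makes this a dry run that only prints the commands; remove it to actually copy (this assumes passwordless ssh to hd2..hd5 is already set up, as the rest of the guide implies):

```shell
# Dry run: print the distribution commands for each worker node
for host in hd2 hd3 hd4 hd5; do
  echo scp /etc/bashrc "$host":/etc
  echo scp -r /usr/local/spark/spark-1.3.0-bin-hadoop2.3 "$host":/usr/local/spark/
done
```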
8. Start
On hd1, start the cluster:
cd $SPARK_HOME/sbin
./start-all.sh
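A dry-run sketch of the start step plus the checks one would typically run afterwards (`jps` for the daemons; the standalone Master's web UI defaults to port 8080). The `echo`s make this safe to run anywhere; drop them to execute for real:

```shell
# Paths and master IP are the ones used in this guide
SPARK_HOME=/usr/local/spark/spark-1.3.0-bin-hadoop2.3

echo "cd $SPARK_HOME/sbin && ./start-all.sh"
echo "jps    # expect a Master process on hd1, a Worker on hd2..hd5"
echo "open http://192.168.137.101:8080    # standalone Master web UI"
```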
Helpful user · 2023-07-08 10:34
"Distributed" is a concept, a theory. Any field that needs big-data processing can apply distributed computing, and Hadoop, Spark, and the like are distributed computing frameworks: any system that requires distributed processing can use them. With so much information on the internet today, the security field entered the big-data era long ago as well, with identity authentication and authorization