Installing Scala on Mac
brew cask install java
brew install scala
Local Scala environment: add the following to ~/.zshrc
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home
export SCALA_HOME=/Library/Scala/scala-2.10.6
PATH=$PATH:${SCALA_HOME}/bin:${JAVA_HOME}/bin
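After reloading ~/.zshrc, one way to check which Scala and Java installations are actually picked up is to ask the runtime itself. The snippet below is a minimal sketch meant to be pasted into the scala REPL; it uses only scala.util.Properties and sys.env from the standard library, and what it prints depends on the local machine.

```scala
import scala.util.Properties

// Version of the Scala library the REPL is running on
val scalaVersion: String = Properties.versionString
// Version of the JVM that is running the REPL
val javaVersion: String = Properties.javaVersion
// Variables as seen by the process; None if they were not exported
val javaHome: Option[String] = sys.env.get("JAVA_HOME")
val scalaHome: Option[String] = sys.env.get("SCALA_HOME")

println(s"Scala: $scalaVersion")
println(s"Java: $javaVersion")
println(s"JAVA_HOME: ${javaHome.getOrElse("(not set)")}")
println(s"SCALA_HOME: ${scalaHome.getOrElse("(not set)")}")
```

If the reported versions differ from the paths configured above, another installation (for example the one installed by brew) probably comes earlier on the PATH.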
Hello World
➜ ~ scala
Welcome to Scala 2.12.7 (OpenJDK 64-Bit Server VM, Java 11.0.1).
Type in expressions for evaluation. Or try :help.
scala> print("hello world!")
hello world!
scala> :quit
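Installing Scala for IDEA on Mac
Fixing the "could not find or load main class" error when running a Scala program in IDEA
Click + to add the Scala SDK as a Library (the program then runs without errors), replacing the Scala SDK originally listed in the module's Dependencies (which compiles without errors but fails at run time with "could not find or load main class").
Running a Scala program in IDEA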
object Test {
  def main(args: Array[String]): Unit = {
    println("Hello World~ ~ ~")
  }
}
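Configuring Scala in Eclipse
Plugin
Download the plugin (make sure the build matches your Eclipse version):
http://scala-ide.org/download/prev-stable.html
Go into the dropins directory under the Eclipse installation directory, create a new folder named scala, and copy the plugin's features and plugins folders into "dropins/scala".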
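Six features of Scala listed on the official site
- Java and Scala code can be mixed in the same project
- Type inference (types are inferred automatically)
- Concurrency and distribution (Actor)
- Traits (similar to a combination of Java interfaces and abstract classes)
- Pattern matching (similar to Java's switch)
- Higher-order functions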
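Four of these features can be seen side by side in a few lines of plain Scala. The sketch below is illustrative only: the names Greeter, Person, Features, describe and applyTwice are made up for this example, and Java interop and Actors are left out because they need extra classes or libraries.

```scala
// Trait: declares abstract members (like an interface) and can also
// carry implementations (like an abstract class)
trait Greeter {
  def name: String                       // abstract member
  def greet(): String = s"Hello, $name"  // concrete member
}

class Person(val name: String) extends Greeter

object Features {
  // Pattern matching: matches on values, types and guards, not only constants
  def describe(x: Any): String = x match {
    case 0               => "zero"
    case n: Int if n > 0 => "positive int"
    case s: String       => s"string of length ${s.length}"
    case _               => "something else"
  }

  // Higher-order function: takes another function as a parameter
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    // Type inference: no annotation needed, the compiler infers List[Int]
    val nums = List(1, 2, 3)
    println(nums.map(_ * 2))              // List(2, 4, 6)
    println(applyTwice(x => x + 3, 10))   // 16
    println(describe("scala"))            // string of length 5
    println(new Person("Scala").greet())  // Hello, Scala
  }
}
```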
WordCount in Scala
Import the spark-assembly-1.6.0-hadoop2.6.0.jar package, then create a words.txt file in the project.
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.rdd.RDD.rddToPairRDDFunctions

object WordCount {
  def main(args: Array[String]): Unit = {
    // Run locally; "WC" is the application name
    val conf = new SparkConf()
    conf.setMaster("local").setAppName("WC")
    val sc = new SparkContext(conf)
    // Read the input file as an RDD of lines
    val lines: RDD[String] = sc.textFile("./words.txt")
    // Split each line into words
    val words: RDD[String] = lines.flatMap { line => line.split(" ") }
    // Pair each word with an initial count of 1
    val pairs: RDD[(String, Int)] = words.map { x => (x, 1) }
    // Sum the counts of identical words
    val result = pairs.reduceByKey { (a, b) => a + b }
    // result.sortBy(_._1, false).foreach(println)  // descending by word
    // Sort by word in ascending order and print
    result.sortBy(_._1, true).foreach(println)
    // Shorthand version
    // lines.flatMap { _.split(" ") }.map { (_, 1) }.reduceByKey(_ + _).foreach(println)
    sc.stop()
  }
}
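flatMap: one-to-many; each input line yields several words
map: one-to-one; each input element yields exactly one output element (here, each word String becomes a (word, 1) pair)
reduceByKey: records with the same key are put into one group, and the values in each group are summed
In other words, it groups first, then aggregates the values that belong to each group's key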
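The same three steps can be tried without Spark on an ordinary in-memory collection. This is a minimal sketch, not the Spark implementation: LocalWordCount and the sample input are made up for illustration, and groupBy followed by a sum stands in for reduceByKey.

```scala
object LocalWordCount {
  // Same three steps as the Spark job, on a Seq instead of an RDD
  def count(lines: Seq[String]): Map[String, Int] = {
    // flatMap: one line in, many words out
    val words: Seq[String] = lines.flatMap(_.split(" "))
    // map: one word in, exactly one (word, 1) pair out
    val pairs: Seq[(String, Int)] = words.map(w => (w, 1))
    // reduceByKey stand-in: group the pairs by key, then sum each group's values
    pairs.groupBy(_._1).map { case (word, group) => (word, group.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello scala", "hello spark", "hello") // made-up input
    // Sort by word, ascending, as in the Spark version
    count(sample).toSeq.sortBy(_._1).foreach(println)
    // (hello,3)
    // (scala,1)
    // (spark,1)
  }
}
```

Unlike this sketch, Spark's reduceByKey combines values within each partition before shuffling, so it does not have to materialize every group in full.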
Output
(c++,2)
(hbase,2)
(hello,17)
(hive,1)
(java,5)
(matlab,3)
(mongodb,1)
(mysql,3)
(objective-c,2)
(oracle,1)
(pig,1)
(python,8)
(redies,2)
(sqoop,3)
(swift,3)
(word,4)
(zookeeper,1)