Normally, Spark jobs are submitted with `spark-submit`. Is there a way to submit a job programmatically, from code? This post shows one approach using `SparkLauncher`.
Submitting a Spark Application
```scala
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

val handle: SparkAppHandle = new SparkLauncher()
  .setSparkHome("/path/to/spark/home")
  .setAppResource("/path/to/your/spark/program/jar")
  .setConf("spark.driver.memory", "2g")
  .setConf("spark.driver.extraClassPath", "/your/class/path")
  .setConf("spark.executor.extraClassPath", "/your/class/path")
  .setConf("spark.executor.memory", "2g")
  .setConf("spark.executor.cores", "10")
  .setConf("spark.executor.instances", "5")  // number of executors on YARN
  .setMainClass("XXXXCLASS")
  .setVerbose(true)
  .setMaster("yarn")
  .setDeployMode("cluster")
  .startApplication(new SparkAppHandle.Listener {
    override def stateChanged(handle: SparkAppHandle): Unit = {
    }
    override def infoChanged(handle: SparkAppHandle): Unit = {
    }
  })
```
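`SparkLauncher` ships in a separate `spark-launcher` module, so the program doing the submitting needs that artifact on its classpath. A possible sbt dependency (the version shown is an assumption; match it to the Spark version your cluster runs):

```scala
// build.sbt: the spark-launcher artifact provides SparkLauncher and SparkAppHandle.
// Replace 2.4.8 with the Spark version deployed on your cluster.
libraryDependencies += "org.apache.spark" %% "spark-launcher" % "2.4.8"
```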
Monitoring the Submitted Application
The return value is a `SparkAppHandle`, which lets you query the application's state at any time:
```scala
handle.getAppId
handle.getState
```
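A minimal sketch of monitoring by polling, assuming `handle` is the `SparkAppHandle` returned above. `SparkAppHandle.State.isFinal()` reports whether the application has reached a terminal state (FINISHED, FAILED, or KILLED):

```scala
// Poll the handle until the application reaches a terminal state.
var state = handle.getState
while (!state.isFinal) {
  println(s"appId=${handle.getAppId}, state=$state")
  Thread.sleep(1000)  // avoid a busy loop
  state = handle.getState
}
println(s"Application ended in state $state")
```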
The `Listener` passed in at submission time receives state-change events:
```scala
.startApplication(new SparkAppHandle.Listener {
  override def stateChanged(handle: SparkAppHandle): Unit = {
    println(handle.getState)
  }
  override def infoChanged(handle: SparkAppHandle): Unit = {
    println(handle.getState)
  }
})
```
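One common use of the listener is to block the submitting thread until the application finishes. A sketch using a `CountDownLatch` (the `finished` and `listener` names are illustrative):

```scala
import java.util.concurrent.CountDownLatch
import org.apache.spark.launcher.SparkAppHandle

val finished = new CountDownLatch(1)
val listener = new SparkAppHandle.Listener {
  override def stateChanged(handle: SparkAppHandle): Unit =
    if (handle.getState.isFinal) finished.countDown()
  override def infoChanged(handle: SparkAppHandle): Unit = ()
}
// Pass `listener` to startApplication(listener), then wait:
finished.await()  // returns once the app is FINISHED, FAILED, or KILLED
```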
Controlling the Submitted Application
Use the returned handle:
```scala
handle.stop()
handle.kill()
```
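The two calls differ in how forceful they are: `stop()` sends a stop request to the running application over the launcher's connection, while `kill()` also terminates the child process. A sketch of a shutdown path, assuming the `handle` from above:

```scala
// Try a graceful stop first; fall back to kill() if the
// launcher connection is gone or the stop request fails.
if (!handle.getState.isFinal) {
  try handle.stop()
  catch { case _: Exception => handle.kill() }
}
```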