This blog collects notes on developing and operating Cloudera CDH/CDP, the Hadoop ecosystem, Semantic IoT, and related technologies. Send questions to gooper@gooper.com.
When you run a DataSetCreator job, it sometimes fails with the error "Illegal character in fragment at index ..".
This happens when the argument used to build the path created in HDFS contains '/'.
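In the createDirInHDFS() function of Helper.scala, add the -p option right after the hdfs dfs -mkdir command so that missing parent directories are created as well, then recompile and run the job again.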
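The -p change can be sketched roughly as follows. This is only an illustration that assumes createDirInHDFS() shells out to the hdfs CLI through scala.sys.process; the real body in Helper.scala may look different, and the HdfsDirs object and mkdirCommand helper are hypothetical names introduced here.

```scala
import scala.sys.process._

// Hypothetical stand-in for the helper in Helper.scala; the original code may differ.
object HdfsDirs {
  // Build the argument list for `hdfs dfs -mkdir -p <dir>`.
  // -p creates missing parent directories, so a nested path such as
  // "a/b/c" can be created in one call.
  def mkdirCommand(dirPath: String): Seq[String] =
    Seq("hdfs", "dfs", "-mkdir", "-p", dirPath)

  // Run the command and return its exit code (0 on success).
  // Assumes the hdfs CLI is on the PATH of the process running the job.
  def createDirInHDFS(dirPath: String): Int =
    mkdirCommand(dirPath).!
}
```

Passing the arguments as a Seq instead of one concatenated string keeps the directory name as a single argument even if it contains spaces.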
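In addition, make sure the URI itself contains only ASCII letters, digits, '/', and '.': replace characters such as '<', '>', '#', '//', ':', ')', '(', ',', '&', and '^' with a chosen substitute character so that they never end up in the path. In the log below, index 112 is the second '#' in the path, and java.net.URI does not accept a '#' inside the fragment part.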
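The character replacement described above could look roughly like the sketch below. PathSanitizer and sanitize are hypothetical names, and the choice of "_" as the substitute character is an assumption; the post only says that the offending characters should be replaced by some specific character.

```scala
import java.util.regex.Matcher

// Hypothetical helper, not part of the original Helper.scala.
object PathSanitizer {
  // Keep ASCII letters, digits, '/' and '.'; replace everything else,
  // including runs of two or more slashes, with `replacement`.
  def sanitize(name: String, replacement: String = "_"): String = {
    // Escape '$' and '\' so the replacement is always taken literally.
    val safe = Matcher.quoteReplacement(replacement)
    name
      .replaceAll("/{2,}", safe)          // "//" must not survive as a path separator
      .replaceAll("[^A-Za-z0-9/.]", safe) // '<', '>', '#', ':', '(', ')', ',', '&', '^', ...
  }
}
```

For example, PathSanitizer.sanitize("<http://data.nasa.gov/qudt/owl/qudt#conversionOffset>") returns "_http__data.nasa.gov/qudt/owl/qudt_conversionOffset_", which no longer contains a '#' and can therefore be passed to java.net.URI.create without triggering the fragment error.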
---------------------------Error details------------------------
st4/ExtVP/SO/_L_http__//data.nasa.gov/qudt/owl/qudt#conversionOffset_B_/_L_http__//data.nasa.gov/qudt/owl/qudt#systemPrefixUnit_B_.parquet
    at java.net.URI$Parser.fail(URI.java:2848)
    at java.net.URI$Parser.checkChars(URI.java:3021)
    at java.net.URI$Parser.parse(URI.java:3067)
    at java.net.URI.<init>(URI.java:588)
    at java.net.URI.create(URI.java:850)
    ... 31 more
16/06/17 13:51:12 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Illegal character in fragment at index 112: test4/ExtVP/SO/_L_http__//data.nasa.gov/qudt/owl/qudt#conversionOffset_B_/_L_http__//data.nasa.gov/qudt/owl/qudt#systemPrefixUnit_B_.parquet)
16/06/17 13:51:12 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/06/17 13:51:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/06/17 13:51:12 INFO ui.SparkUI: Stopped Spark web UI at http://gsda3:37016
16/06/17 13:51:12 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/06/17 13:51:12 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
16/06/17 13:51:12 INFO cluster.YarnClusterSchedulerBackend: Asking each executor to shut down
16/06/17 13:51:12 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!
16/06/17 13:51:12 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
16/06/17 13:51:12 INFO storage.MemoryStore: MemoryStore cleared
16/06/17 13:51:12 INFO storage.BlockManager: BlockManager stopped
16/06/17 13:51:12 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/06/17 13:51:12 INFO spark.SparkContext: Successfully stopped SparkContext