Please use this identifier to cite or link to this item: http://hdl.handle.net/11701/6970
Full metadata record
DC Field | Value | Language
dc.contributor.author | Potapov, Vadim P. | -
dc.contributor.author | Kostylev, Mikhail A. | -
dc.contributor.author | Popov, Semen E. | -
dc.date.accessioned | 2017-07-19T11:21:09Z | -
dc.date.available | 2017-07-19T11:21:09Z | -
dc.date.issued | 2017-06 | -
dc.identifier.citation | Potapov V. P., Kostylev M. A., Popov S. E. The streaming processing of SAR data in distributed environment with Apache Spark. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2017, vol. 13, iss. 2, pp. 168–181. | en_GB
dc.identifier.other | 10.21638/11701/spbu10.2017.204 | -
dc.identifier.uri | http://hdl.handle.net/11701/6970 | -
dc.description.abstract | This article presents a modern approach to creating a distributed software system based on massively parallel technology for pre- and postprocessing of SAR images. The unique features of the system are the ability to work in real time with huge amounts of streaming data and the ability to apply existing algorithms that were not designed for distributed processing across multiple nodes without changing the algorithms' implementation. Distributed processing technologies were compared, and Apache Spark was selected on that basis. The ability to organise automatic processing of input SAR images as a sequence of operations performed according to defined conditions is demonstrated. The processing results are stored in the system as fault-tolerant distributed collections of data (RDD, Resilient Distributed Datasets), which allows intermediate results to be retrieved and saved in the distributed file system HDFS as new satellite images become available and are processed by the sequence of algorithms. The implementation of specific SAR data processing tasks based on the suggested approach is described (phase estimation, coregistration, interferogram creation and phase unwrapping with the region growing method). A scheme of the phase unwrapping algorithm with the ability to use GPUs and NVIDIA CUDA technology is presented. An adaptation of the algorithm for massively parallel systems is shown. The algorithm implementation focuses on processing a pair of SAR images on one node. Performance gains are achieved by processing multiple images simultaneously, their number being equal to the number of cluster nodes. An example implementation of methods for working with streaming binary data (BinaryRecordStream) is shown; these methods monitor the distributed file system HDFS for new SAR data and read this data as binary files with a fixed record size in bytes. A directory and the size of one record are used as the input parameters. The results of testing the developed algorithms on a demonstration cluster are presented. It is shown that eight nodes in a cluster can process images up to eight times faster than sequential processing of the same number of images on one node. The test results show that the performance of the presented algorithms can be improved without any changes to their implementation, which in turn justifies applying a distributed approach to SAR data processing. Refs 26. Figs 4. Tables 3. | en_GB
dc.language.iso | ru | en_GB
dc.publisher | St Petersburg State University | en_GB
dc.relation.ispartofseries | Vestnik of St Petersburg University. Applied Mathematics. Computer Science. Control Processes; Volume 13; Issue 2 | -
dc.subject | Apache Spark | en_GB
dc.subject | Apache Hadoop | en_GB
dc.subject | distributed information systems | en_GB
dc.subject | SAR interferometry | en_GB
dc.subject | processing algorithms | en_GB
dc.title | The streaming processing of SAR data in distributed environment with Apache Spark | en_GB
dc.type | Article | en_GB
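
Note on the abstract: the fixed-size binary record reading it describes (a monitored directory plus the size of one record as input parameters) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses only the Python standard library and a local directory in place of HDFS, and the names `read_fixed_records` and `poll_new_records` are hypothetical. Spark Streaming offers a comparable built-in, `StreamingContext.binaryRecordsStream(directory, recordLength)`, which may be what the abstract's "BinaryRecordStream" refers to.

```python
from pathlib import Path
from typing import Iterator, Set


def read_fixed_records(path: Path, record_length: int) -> Iterator[bytes]:
    """Yield consecutive records of exactly record_length bytes from one file."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    with path.open("rb") as f:
        while True:
            chunk = f.read(record_length)
            if len(chunk) < record_length:
                # End of file; a trailing partial record is dropped in this sketch.
                break
            yield chunk


def poll_new_records(
    directory: Path, record_length: int, seen: Set[Path]
) -> Iterator[bytes]:
    """Yield records from files that appeared in directory since the last poll.

    `seen` holds the paths already processed, so each file is read only once.
    """
    for path in sorted(directory.iterdir()):
        if path.is_file() and path not in seen:
            seen.add(path)
            yield from read_fixed_records(path, record_length)
```

Calling `poll_new_records` repeatedly with the same `seen` set approximates one micro-batch per call: only files that arrived since the previous call contribute records.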
Appears in Collections: Issue 2

Files in This Item:
File | Description | Size | Format
04-Potapov.pdf | | 428.44 kB | Adobe PDF


All items in the electronic resources archive are protected by copyright, with all rights reserved.