Commit
1 parent d8ffaa9, commit 47662a9
Showing 1 changed file with 44 additions and 12 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,55 @@ | ||
Node arrange | ||
=============== | ||
|
||
Arranging the BandMaths Operator results | ||
The goal of the application is to produce daily binned products, so the inputs to the binning processing step must be organized so that it aggregates, in time and space, only the products of a given day. | ||
|
||
The application goal is to produce daily binned products so the binning processing step needs to have its inputs well organized so that it aggregates in time and space only the products of a given day. In terms of job template, you will define the path to the streaming executable, one parameter: the period (a day) and instruct the framework that only one task has to be run. | ||
In the job template, you will define the path to the streaming executable and a single parameter, the period (a day), and instruct the framework to run only one task. | ||
|
||
<pre> | ||
As the second job in this workflow, the arrange processing step implements a streaming executable that: | ||
|
||
<jobTemplate id="arrange"> | ||
* Creates an R data frame with all references to the data produced by the node expression | ||
* Splits the references into groups by period, based on the acquisition start time of the input product | ||
* Writes the groups to the local filesystem as tab-separated files | ||
* Stages out the tab-separated files to the distributed file system | ||
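The grouping and writing steps listed above can be sketched as follows. This is only an illustration in Python, not the actual ``run.R`` executable: it assumes each reference comes as a (reference, acquisition start time) pair with an ISO 8601 UTC timestamp, the function names and the ``<day>.tsv`` file naming are hypothetical, and the stage-out step is omitted because it depends on the framework.

```python
import csv
import os
import tempfile
from collections import defaultdict
from datetime import datetime


def group_by_day(references):
    """Group (reference, start_time) pairs by acquisition day.

    Assumption: start_time looks like "2012-04-01T09:10:00Z".
    """
    groups = defaultdict(list)
    for ref, start in references:
        day = datetime.strptime(start, "%Y-%m-%dT%H:%M:%SZ").date().isoformat()
        groups[day].append((ref, start))
    return dict(groups)


def write_groups(groups, out_dir):
    """Write one tab-separated file per day and return the file paths."""
    paths = []
    for day in sorted(groups):
        # Hypothetical naming scheme: one file per day, e.g. 2012-04-01.tsv
        path = os.path.join(out_dir, f"{day}.tsv")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerows(groups[day])
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Made-up references, for illustration only.
    refs = [
        ("file:///tmp/a.dim", "2012-04-01T09:10:00Z"),
        ("file:///tmp/b.dim", "2012-04-01T10:50:00Z"),
        ("file:///tmp/c.dim", "2012-04-02T09:00:00Z"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for written in write_groups(group_by_day(refs), tmp):
            print(os.path.basename(written))
```

Because the whole set of references has to be seen before the groups are complete, a sketch like this only works when a single task receives every input, which is why the job template limits the job to one task.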
|
||
<streamingExecutable>/application/arrange/run.R</streamingExecutable> | ||
The job template includes the path to the streaming executable. | ||
|
||
<defaultParameters> | ||
<parameter id="period">day</parameter> | ||
.. code-block:: xml | ||
</defaultParameters> <defaultJobconf> | ||
   <streamingExecutable>/application/arrange/run.R</streamingExecutable> | ||
The streaming executable source is available here: `/application/arrange/run.R <https://github.com/Terradue/BEAM-Arithm-tutorial/blob/master/arrange/run.R>`_ | ||
|
||
The job template defines a single parameter: | ||
|
||
<property id="ciop.job.max.tasks">1</property> | ||
* The period for the temporal aggregation (daily) | ||
|
||
</defaultJobconf> | ||
.. code-block:: xml | ||
</jobTemplate> | ||
   <defaultParameters>
     <parameter id="period">day</parameter>
   </defaultParameters>
</pre> | ||
The job template sets the ciop.job.max.tasks property to 1, since the streaming executable has to process all inputs at once. | ||
|
||
.. code-block:: xml | ||
   <defaultJobconf>
     <property id="ciop.job.max.tasks">1</property>
   </defaultJobconf>
.. The property mapred.task.timeout is not set and uses the default value (10 minutes).
Here's the job template including the elements described above. | ||
|
||
.. code-block:: xml | ||
   <jobTemplate id="arrange">
     <streamingExecutable>/application/arrange/run.R</streamingExecutable>
     <defaultParameters>
       <parameter id="period">day</parameter>
     </defaultParameters>
     <defaultJobconf>
       <property id="ciop.job.max.tasks">1</property>
     </defaultJobconf>
   </jobTemplate> |