Difference between revisions of "Guide to roll-your-own python hddm transforms"

From GlueXWiki

Revision as of 12:22, 24 July 2018

At one time, if you wanted to catenate two similar hddm files, you might do it with "hddmcat", or maybe with "hddmcp", or "hddmcp_c". Or, if you have been in the collaboration long enough, you might remember our old friend "hddm_merge_files". Each of these utilities once had a specific purpose, and no two of them are quite identical. There are others that do slightly different things, such as "hddm-xml", "xml-hddm", "hddm_cull_events", etc. The purpose of this page is to convince you that ALL of these are obsolete, now that the hddm python module is available. Using the python shell, you can perform a much more general set of actions on hddm input files, including merging, splitting based on event inspection, and general transforms of the event content. All of this is illustrated with examples below.

First, you need to make sure that your shell environment is properly configured for using the standard GlueX software stack. To verify this, try the following command at your Linux shell prompt. If the command returns without an error, you are all set to go. If it complains about the hddm_s module not being found, you need to find the appropriate setup_env script for your login shell, source it, and try again.

python -c 'import hddm_s'
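If you would rather make the same check from inside a script, without a traceback on failure, something along these lines should work. This is only a sketch: it assumes Python 3.4 or later and uses nothing beyond the standard library, and the helper name module_available is made up for this page.

```python
import importlib.util

def module_available(name):
    """Return True if a top-level module called `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None

# module_available is a hypothetical helper, not part of hddm_s.
# "hddm_s" is the module name used throughout this page.
if not module_available("hddm_s"):
    print("hddm_s not found: source the setup_env script for your shell and try again")
```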

In the first example, we learn how to catenate two hddm files named "t1.hddm" and "t2.hddm" into a single output file "tcat.hddm" with a single command.

python -c 'import hddm_s;fout=hddm_s.ostream("tcat.hddm");[[fout.write(rec) for rec in hddm_s.istream(fin)] for fin in ("t1.hddm","t2.hddm")]'
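The one-liner above packs two nested loops into list comprehensions. For anything you plan to keep around, the same logic may be easier to read as a short script. The following is a minimal sketch that relies only on the hddm_s.istream and hddm_s.ostream calls already shown above. The function name catenate and its optional istream/ostream arguments are illustrative choices made here, not part of the hddm_s module.

```python
def catenate(input_names, output_name, istream=None, ostream=None):
    """Copy every record from each input file, in order, into one output file.

    istream and ostream are hypothetical hooks added for this sketch so that
    other stream factories can be substituted; by default the hddm_s ones
    are used, which requires the GlueX environment to be set up.
    Returns the number of records written.
    """
    if istream is None or ostream is None:
        import hddm_s  # imported lazily, only when the defaults are needed
        istream = istream or hddm_s.istream
        ostream = ostream or hddm_s.ostream

    fout = ostream(output_name)
    count = 0
    for name in input_names:
        # iterating over an input stream yields one record at a time
        for rec in istream(name):
            fout.write(rec)
            count += 1
    return count
```

With the GlueX environment configured, calling catenate(("t1.hddm", "t2.hddm"), "tcat.hddm") from the python shell should do the same job as the one-liner, and the returned count gives a quick sanity check on how many records were copied.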