This thread is a great idea. I think it will at least give a ballpark estimate of what people can expect to see, and help them determine what they need based on some background data instead of repeating the same tests that others have already done.
While it's probably impossible to make identical rigs and eliminate every single variable that could be introduced by having different people attempt to do the same tests, there probably should be some description or definition of what makes a 'test'.
I think most people are testing in a 'tank' which is typically a tub with water in it and a thruster attached to some kind of arm that extends up out of the water to a force measurement device. 
Alternately I just saw njs552 posted one that used an arm pulling on a force meter (a great jig, just different). 
Some things that may affect the setup off the top of my head are: 
1. The two parts of the lever arm should be the same distance from the pivot point, or you need to show your math for how you arrived at the final result (if one arm is twice as long as the other, you need to multiply or divide the reading by two, depending on whether the longer arm is on the scale side or the thruster side...)
2. Gravity should be eliminated from the calculation. If you are measuring a force where one arm is horizontal and the other is vertical, make sure the measuring device is zeroed out and that you compensate for the fact that the horizontal arm is helped by gravity pulling down during the test, which would not happen in a real situation.
3. Swirling: I think if all of the tests are done in a tub, the size of the tub should be listed, because once the motor starts moving you are setting up a water 'circuit' that will help push the thruster. For most people a large enough test tank (like a pool or a lake) is not an option, but a small test tank will give readings that include the force of swimming 'with the current'. It will be as if you were measuring, from shore, the pull of someone paddling a canoe downstream in a river instead of on a lake. The amount of 'pull' relative to someone on shore will also include the force that the river is exerting on the canoe.
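For points 1 and 2, here is a rough sketch of the math I have in mind, assuming a simple rigid lever where the torque on both sides of the pivot balances (thrust x thruster arm = scale force x scale arm) and where the gravity offset is just the scale reading taken with the motor off. The function name and parameters are made up for illustration; they are not from anyone's actual rig.

```python
def thruster_force(scale_reading, tare_reading, scale_arm, thruster_arm):
    """Estimate thrust at the thruster from a scale reading on a lever jig.

    scale_reading: scale value with the motor running
    tare_reading:  scale value with the motor off (gravity/offset on the arm)
    scale_arm:     distance from pivot to scale
    thruster_arm:  distance from pivot to thruster (same units as scale_arm)
    The result is in the same units as the scale readings.
    """
    if scale_arm <= 0 or thruster_arm <= 0:
        raise ValueError("arm lengths must be positive")
    # Remove the part of the reading that is there even with the motor off.
    net_reading = scale_reading - tare_reading
    # Torque balance about the pivot: thrust * thruster_arm = net * scale_arm
    return net_reading * scale_arm / thruster_arm


# Equal arms (80 cm each): thrust equals the net scale reading.
equal_arms = thruster_force(1.5, 0.25, scale_arm=80, thruster_arm=80)

# Scale arm half as long as the thruster arm: the scale reads double the thrust.
short_scale_arm = thruster_force(1.5, 0.25, scale_arm=40, thruster_arm=80)
```

With equal arms the correction drops out, which is why equal arms are the easy case. This does not account for the swirling effect in point 3; that one can only be reported, not calculated away.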
I suggest all tests should include a description of the test equipment used, so that anyone who is testing a different way or wants to compare results can do so. This should include a description of the voltage, current, and force measuring devices in addition to a description of the testing procedure. It doesn't have to be long-winded or really detailed, just:
Tub test: Fluke digital multimeter, 15 gallon test tub, using 90 degree arm and digital scale. Distance from thruster to pivot point and from pivot point to scale are both 80 cm...
EDIT: this is the test jig they are using at OpenROV:
