Hi,
Jeremy Hinegardner wrote:
If you want to describe your data needs a bit, and what operations you
need to operate on it, I'll be happy to play around with a ruby/sqlite3
program and see what pops out.
I've created a small tolerance DSL, and coupled with the Monte Carlo
Method[1] and the Pearson Correlation Coefficient[2], I'm performing
sensitivity analysis[3] on some of the simulation codes used for our
Orion vehicle[4]. In other words, jiggle the inputs, and see how
sensitive the outputs are and which inputs are the most influential.
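A minimal sketch of the "jiggle the inputs and rank them" idea, assuming a toy stand-in for the simulation. The names toy_simulation, :density, and :viscosity are hypothetical and not part of the actual codes, and the plain Pearson coefficient is used here without any further normalization:

```ruby
# Pearson correlation coefficient between two equal-length samples.
def pearson(xs, ys)
  raise ArgumentError, "samples must have equal length" unless xs.size == ys.size

  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  cov   = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }
  return 0.0 if var_x.zero? || var_y.zero? # a constant sample has no correlation

  cov / Math.sqrt(var_x * var_y)
end

# Hypothetical stand-in for a simulation code: the output depends strongly
# on :density and only weakly on :viscosity.
def toy_simulation(inputs)
  10.0 * inputs[:density] + 0.1 * inputs[:viscosity]
end

# Jiggle each input uniformly within +/-2.5% (an assumed tolerance).
rng = Random.new(42)
cases = Array.new(500) do
  { density:   1.0 + 0.05 * (rng.rand - 0.5),
    viscosity: 2.0 + 0.05 * (rng.rand - 0.5) }
end
outputs = cases.map { |c| toy_simulation(c) }

# Correlate every input with the output and rank by magnitude.
sensitivities = cases.first.keys.to_h do |name|
  [name, pearson(cases.map { |c| c[name] }, outputs)]
end
ranked = sensitivities.sort_by { |_, r| -r.abs }
ranked.each { |name, r| puts format("%-10s r = %+.3f", name, r) }
```

The input at the top of the ranking is the "most influential" one in the sense used above.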
The current system[5] works, and after the YAML->Marshal migration,
it scales well enough for now. The trouble is the entire architecture
is wrong if I want to monitor the Monte Carlo statistics to see
if I can stop sampling, i.e., whether the statistics have converged.
The current system consists of the following steps:
1) Prepare a "sufficiently large" number of cases, each with random
variations of the input parameters per the tolerance DSL markup.
Save all these input variables and all their samples for step 5.
2) Run all the cases.
3) Collect all the samples of all the outputs of interest.
4) Compute a running history of the output statistics to see
if they have converged, i.e., the "sufficiently large"
guess was correct -- typically a wasteful number of around 3,000.
If not, start at step 1 again with a bigger number of cases.
5) Compute normalized Pearson correlation coefficients for the
outputs and see which inputs they are most sensitive to by
using the data collected in steps 1 and 3.
6) Lobby for experiments to nail down these "tall pole" uncertainties.
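The running history of step 4 could be computed in a single pass over the collected samples. The sketch below assumes the statistics of interest are the mean and sample standard deviation and uses Welford's algorithm as a swapped-in technique; the RunningStats class and running_history method are hypothetical names, not taken from the current system:

```ruby
# Running mean and sample standard deviation via Welford's algorithm.
class RunningStats
  attr_reader :count, :mean

  def initialize
    @count = 0
    @mean  = 0.0
    @m2    = 0.0 # sum of squared deviations from the current mean
  end

  # Fold one more sample into the statistics.
  def add(x)
    @count += 1
    delta = x - @mean
    @mean += delta / @count
    @m2   += delta * (x - @mean)
    self
  end

  def variance
    @count > 1 ? @m2 / (@count - 1) : 0.0
  end

  def std_dev
    Math.sqrt(variance)
  end
end

# Running history: the statistics after each successive sample.
def running_history(samples)
  stats = RunningStats.new
  samples.map do |x|
    stats.add(x)
    { n: stats.count, mean: stats.mean, std_dev: stats.std_dev }
  end
end

history = running_history([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
history.each do |h|
  puts format("n=%d mean=%.4f std_dev=%.4f", h[:n], h[:mean], h[:std_dev])
end
```

If the tail of this history has flattened out, the guess for the number of cases was large enough.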
This system is plagued by the question of what counts as "sufficiently large".
The next generation system would do steps 1 through 3 in small
batches, and at the end of each batch, check for the statistical
convergence of step 4. If convergence has been reached, shut down
the Monte Carlo process, declare victory, and proceed with steps
5 and 6.
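A rough sketch of that batch loop, under stated assumptions: run_case is a hypothetical stand-in for steps 1 through 3, and "converged" is taken to mean that the standard error of the mean has dropped below a relative tolerance. The real convergence criterion is not specified here, so treat the stopping rule, batch size, and tolerance as placeholders:

```ruby
# Hypothetical stand-in for steps 1-3: perturb an input, run one case,
# and collect a single output of interest.
def run_case(rng)
  input = 1.0 + 0.1 * (rng.rand - 0.5)
  3.0 * input
end

# Sample mean and standard error of the mean.
def mean_and_std_error(samples)
  n = samples.size.to_f
  mean = samples.sum / n
  variance = samples.sum { |x| (x - mean)**2 } / (n - 1)
  [mean, Math.sqrt(variance / n)]
end

# Run cases in small batches and stop as soon as the assumed convergence
# test passes, or give up once max_cases have been run.
def monte_carlo(batch_size: 50, tolerance: 0.001, max_cases: 5_000, seed: 1)
  rng = Random.new(seed)
  samples = []
  while samples.size < max_cases
    batch_size.times { samples << run_case(rng) }
    mean, std_error = mean_and_std_error(samples)
    if std_error < tolerance * mean.abs
      return { converged: true, cases: samples.size, mean: mean }
    end
  end
  { converged: false, cases: samples.size, mean: mean_and_std_error(samples).first }
end

result = monte_carlo
puts format("converged=%s after %d cases, mean=%.4f",
            result[:converged], result[:cases], result[:mean])
```

Persisting each batch of samples as it finishes, rather than holding them in an in-memory array as done here, is where a small database would come in.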
I'm thinking this more incremental approach and my lack of database
experience would make a perfect match for Mongoose[6]...
Since there's no Ruby Quiz this weekend, we all need something to work
on :-).
Regards,
--
Bil Kleb
http://fun3d.larc.nasa.gov
[1] Monte Carlo method - Wikipedia
[2] Pearson correlation coefficient - Wikipedia
[3] Sensitivity analysis - Wikipedia
[4] Crew Exploration Vehicle - Wikipedia
[5] The current system consists of 5 Ruby codes at ~40 lines each
plus some equally tiny library routines.
[6] http://mongoose.rubyforge.org/