Errorbars and fitting
Intro
Here, we're going to take our plotting demo a little further and try to fit data with a curve, and plot data with errorbars.
We'll be looking at data from galcrash here; we'll consider the dataset below, taken with periapse of 0.5 kpc (note that the homework requires 1.0 kpc):
#perapse = 0.5 kpc #mass time (Myr) uncertainty (Myr) 0.1 1100 100 0.2 520 50 0.3 350 25 0.4 300 25 0.5 280 20
We're going to want to fit this data with something of the form t=m-1/2.
Scilab
With data this small, entering it into SciLab is pretty straightforward:
-->masses = [0.1 0.2 0.3 0.4 0.5]; -->time = [1100 520 350 300 280]; -->uncertainty = [100 50 25 25 20];
We'll now plot in mostly the same way as before, using a style to choose plotting data points rather than lines, and using rect to set the range of our plot so that it doesn't hide the first and last data points. We'll end by adding error bars; errbar plots the data with y-errorbars ranging from (y-3rd argument) to (y+4th argument); for symmetric error bars like we have here, we'll just use the same uncertainty twice:
--> clf; plot2d(masses,time,style=-1,rect=[0,200,0.6,1200]);
--> xtitle("","red galaxy mass / green galaxy mass",
"Merger time (Myr)", "");
--> errbar(masses,time,uncertainty,uncertainty);
Now we want to find a value of C which fits the data. A perfectly acceptable way of exploring the data to find out what parameters make sense is a little trial-and-error, choosing C yourself, plotting the function yourself and seeing how it works:
--> // plot the data. --> clf; plot2d(masses,time,style=-1,rect=[0,200,0.6,1200]); --> // Mass values for the fitted curve --> finem = [0.05:0.01:0.55]; --> // A trial "C" value --> c = 1.; --> // Calculate the expected merger times using this C --> fittedt = (453./c)*sqrt((0.5)^3)*(finem)^(-1); --> // Plot ... --> plot2d(finem, fittedt); --> //oops, C is too low (times are too low); try a new C --> c = 2.; --> fittedt = (453./c)*sqrt((0.5)^3)*(finem)^(-1); --> plot2d(finem, fittedt); --> //...etc
To do the fitting, we need to do something we haven't done before - create scilab functions. One that gives the fitted data, given some parameters, and one that calculates the error between that and the real data.
-->function g=error(p,z) --> realtimes=z(1), masses=z(2); --> c = abs(p(1)+1.0e-6); --> fittimes = 453./c * sqrt((0.5)^3) * masses^(-1), --> g = realtimes-fittimes -->endfunction
(Note that we've made C be the fit parameter plus a tiny little bit, just so that during the fitting procedure, if the fitter tries c=0, things won't blow up.)
Now we do the fit:
-->p0 = [1]; // initial guess of 1
-->z = [time; masses];
-->[p,err] = datafit(error, z, p0);
-->p
p =
1.458
So the fitting function found the best fit was 1.458. We can now plot this best fit;
--> clf; plot2d(masses,time,style=-1,rect=[0,200,0.6,1200]);
--> finem = [0.05:0.01:0.55];
--> fittedt = (453./p)*sqrt((0.5)^3)*(finem)^(-1);
--> xtitle("","red galaxy mass / green galaxy mass", "Merger time (Myr)", "");
--> errbar(masses,time,uncertainty,uncertainty);
--> plot2d(finem, fitteddt)
Gnuplot
Gnuplot isn't as powerful as scilab, but it is also a little easier to use.
If we have a file called "data.txt" which contains the following,
#perapse = 0.5 kpc #mass time (Myr) uncertainty (Myr) 0.1 1100 100 0.2 520 50 0.3 350 25 0.4 300 25 0.5 280 20then we can plot the data with errorbars as follows:
gnuplot> set xrange [0.05:0.55] gnuplot> plot 'data.txt' with errorbars
Fitting the data is only slightly more complicated. Again, we have to define a function to fit to (and gnuplot assumes the dependant variable is called x, so we have to use that), and then we define an initial guess for the fit parameters, and off we go:
gnuplot> fittime(x)=453./c*sqrt((0.5)**3)*x**(-1)
gnuplot> c = 1
gnuplot> fit fittime(x) 'data.txt' using 1:2 via c
[...lots of iterations...]
Final set of parameters Asymptotic Standard Error
======================= ==========================
c = 1.45809 +/- 0.04775 (2.76%)
correlation matrix of the fit parameters:
c
c 1.000
gnuplot>
and now that the parameter is set, you can plot it:
gnuplot> set xlabel "Red mass / Green Mass" gnuplot> set ylabel "Merger time (Myr)" gnuplot> plot fittime(x), 'data.txt' with errorbars
Octave/Matlab
In Octave, the procedure for plotting is very similar to that in SciLab:
octave:1> masses = [0.1 0.2 0.3 0.4 0.5];
octave:2> time = [1100 520 350 300 280];
octave:3> uncertainty = [100 50 25 25 20];
octave:4> errorbar(masses,time,uncertainty);
octave:5> xlabel("Red Mass/Green Mass")
octave:6> ylabel("Merger Time (Myr)")
To do data fitting in Octave (but not Matlab) you may need to install some optional modules; try the below first. If it doesn't work for you, you should be able to download the packages from Octave Forge: you are looking for
and then from within Octave, you can install the above packages given where they are downloaded to:octave:9> pkg install "/Users/ljdursi/Downloads/struct-1.0.9.tar.gz" octave:10> pkg install "/Users/ljdursi/Downloads/miscellaneous-1.0.11.tar.gz" octave:11> pkg install "/Users/ljdursi/Downloads/optim-1.0.16.tar.gz"One that's done, you again have to define the fitting function and invoke the fitting routine:
octave:12> fitfun = inline("453./(abs(p(1)+1.e-6)) * sqrt((0.5)^(3)) .* x.^(-1)","x","p")
octave:13> guess = [1.0, 0,0];
octave:14> [f, p, kvg, iter, corp, covp, covr, stdresid, Z, r2] = leasqr(masses, time, guess, fitfun);
octave:15> hold on #save previous graph
octave:16> plot(masses,f) #plot fitted data
