Example: PG 1115+080

Data

Here is a basic data file, with image positions from Impey et al. (1998 ApJ 509:551), and the time delay from Barkana (1997 ApJ 489:21).

1
-0.382 -1.344 0.003
0.0 10000.
0.0 10000.
0.0 10000.

1
4
-1.328 -2.037 3.88 0.003 0.78 0.0 0.0 A1
-1.478 -1.576 -2.51 0.003 0.50 0.0 0.0 A2
0.341 -1.960 -0.65 0.003 0.13 25.0 1.6 B
0.000 0.000 1.00 0.003 0.20 0.0 0.1 C


SIE+shear model

Here is my input file.

set omega = 0.3
set lambda = 0.7
set hval = 0.7
set hvale = 1.0e4
set zlens = 0.31
set zsrc = 1.72
#
set omitcore = 0.01
data 1115tdel.dat
set chimode = 0
set gridflag = 0
set checkparity = 0
set restart = 3
set PAwrap = 0
#
startup 1 1
  alpha 1 -0.382 -1.344 0.3 10 0.1 10 0 0 1
  1 1 1 1 1 1 1 0 0 0
#
randomize 10 sieg-ran
  0.5 1.5
  -0.392 -0.372
  -1.354 -1.334
  0.0 0.7
  -90.0 90.0
  0.0 0.3
  -90.0 90.0
#
startup sieg-ran.start
optimize sieg
#
quit

It looks very similar to the basic example before, with two exceptions. First, I have specified the cosmology and the lens and source redshifts so the time delay can be used to infer the Hubble constant. Note that the variable hvale specifies the prior uncertainty on the value of h.

The second change is I added the line set PAwrap = 0. The default mode for the code is to wrap all position angles to be in the range -90 to +90. However, this can cause problems when the PA actually needs to be near +/-90. I was finding this to be the case for PG1115, so I told the code not to wrap the position angles.

Here is the result:

LENS PARMS:
alpha 1.137324e+00 -3.683792e-01 -1.341635e+00 1.892179e-01 9.117806e+01 9.897324e-02 4.921371e+01 0.0 0.0 1.000000e+00
SOURCE PARMS:
ptsrc 2.530705e-01 -3.994549e-01 -1.205168e+00 0.0 0.0 0.0 0.0 0.0
CHISQ: 3.434334e+01 1.132735e+01 1.780414e+00 3.612143e-10 2.123557e+01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 # tot pos flux tdel gal plim crv ring weak

Source #1
  posn: -3.994549e-01 -1.205168e+00
  flux: 2.530705e-01
  h: 5.505269e-01
  tscale: 3.255131e+01
images:
  -1.3280 -2.0370 (3.0e-03) 3.8800 ( 7.80e-01) 0.00 (0.0e+00) -> -1.3255 -2.0368 3.1369 15.8523
  -1.4780 -1.5760 (3.0e-03) -2.5100 ( 5.00e-01) 0.00 (0.0e+00) -> -1.4850 -1.5766 -2.8161 16.0706
  0.3410 -1.9600 (3.0e-03) -0.6500 ( 1.30e-01) 25.00 (1.6e+00) -> 0.3351 -1.9613 -0.7134 25.0000
  0.0000 0.0000 (3.0e-03) 1.0000 ( 2.00e-01) 0.00 (1.0e-01) -> -0.0032 -0.0007 0.8979 0.0000

The fit is not very good, with a total chisq of 34.3 for Ndof = 4 degrees of freedom. (Note: you can check that the image plane chisq gives very similar results here.)


Trying a group

We suspect the simple model fails because the galaxy lies in a poor group of galaxies (Kundic et al. 1997), and the environment provides more than a shear term. The next level of complexity is to add one mass component that represents the environment (presumably the common dark matter halo of the group, if it has one). A shear corresponds to putting an infinitely massive perturber infinitely far away. What if we bring the perturber in to a finite distance?

We figure the shear model probably gets the position angle of the perturber about right, so let's assume that value as a starting point. Then let's look at what happens as we vary the distance of the perturber from the lens. This can be done very simply by specifying the perturber position in polar coordinates, using the command:

set galcoords = 2

If we guess some starting distance, say 20", then we can use the relation gamma = b / (2*r) for the shear of an SIS perturber to guess the perturber's mass. Since the shear is about 0.1, this gives b of about 4" for r = 20".

Now, shear is identical if you rotate it through 180 degrees, but a physical perturber may not be. So I will explicitly consider perturbers on both sides of the lens.

After the basic setup as above, the input file looks like this: set galcoords = 2
#
startup 2 1
  alpha 1.137324e+00 -3.683792e-01 -1.341635e+00 1.892179e-01 9.117806e+01 0 0 0 0 1
  alpha 4.0 20.0 49.21371 0 0 0 0 0 0 1
  1 1 1 1 1 0 0 0 0 0
  1 1 1 0 0 0 0 0 0 0
varyone 2 2 5.0 30.0 51 grp1a
#
startup 2 1
  alpha 1.137324e+00 -3.683792e-01 -1.341635e+00 1.892179e-01 9.117806e+01 0 0 0 0 1
  alpha 4.0 20.0 -130.78629 0 0 0 0 0 0 1
  1 1 1 1 1 0 0 0 0 0
  1 1 1 0 0 0 0 0 0 0
varyone 2 2 5.0 30.0 51 grp1b

The product is a plot of chisq versus perturber distance (right), and the locus of possible perturber locations (left).

Notice that the chisq curves are a little jagged. This is reasonably common when running varyone, and it suggests that at certain steps the optimization routine did not find the global minimum. Sometimes it can be mitigated by decreasing the step size (so the code moves through parameter space more adiabatically). Other times it helps to increase restart. In this case the most effective solution was to adjust the tolerance in the optimization routine:

set ftol = 1.0e-5

(The default value is 1.0e-4.) Decreasing the tolerance forces the optimization routine to be more demanding when it evaluates whether it has converged to a minimum. Sure enough, here is what I got when I reran with the smaller tolerance:

Notice that there is a significant difference between the two sides: a Southwest perturber does much better. Also notice that there is a clear minimum in the chisq at a finite distance (about d=10"). The data really want there to be a physical mass component sitting nearby, as opposed to some vague perturber sitting far away. Not only that, the best fit model (now optimizing d as well as the other parameters) is:

LENS PARMS:
  alpha 1.032871e+00 -3.815636e-01 -1.343906e+00 2.891309e-02 3.895330e+01 0.0 0.0 0.0 0.0 1.000000e+00
  alpha 2.142736e+00 1.041993e+01 -1.137773e+02 0.0 0.0 0.0 0.0 0.0 0.0 1.000000e+00
SOURCE PARMS:
  ptsrc 1.696346e-01 1.540926e+00 -2.102938e+00 0.0 0.0 0.0 0.0 0.0
CHISQ: 3.585655e+00 2.251145e-02 3.540982e+00 1.483510e-09 2.216180e-02 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 # tot pos flux tdel gal plim crv ring weak

Source #1
  posn: 1.540926e+00 -2.102938e+00
  flux: 1.696346e-01
  h: 4.515448e-01
  tscale: 3.255131e+01
images:
-1.3280 -2.0370 (3.0e-03) 3.8800 ( 7.80e-01) 0.00 (0.0e+00) -> -1.3278 -2.0370 3.3700 14.0549
-1.4780 -1.5760 (3.0e-03) -2.5100 ( 5.00e-01) 0.00 (0.0e+00) -> -1.4783 -1.5760 -3.2193 14.2175
0.3410 -1.9600 (3.0e-03) -0.6500 ( 1.30e-01) 25.00 (1.6e+00) -> 0.3407 -1.9601 -0.5589 25.0000
0.0000 0.0000 (3.0e-03) 1.0000 ( 2.00e-01) 0.00 (1.0e-01) -> -0.0002 -0.0000 0.8438 0.0000

This is a very good fit. In fact, the model fits all the data very well except the A2/A1 flux ratio -- but this is known to be a flux ratio "anomaly" that is caused by small-scale structure in the lens potential, so we should not expect a smooth mass model to fit it.


MCMC

Let me now use MCMC to characterize the model uncertainties. The one technical challenge here is that the galaxy ellipticity is very small, so a small random step can take it negative, and then things can get weird. It is easy to prevent this by working with the ellipticity in "Cartesian" coordinates:

set shrcoords = 1

Here is my full MCMC input file:

set omega = 0.3
set lambda = 0.7
set hval = 0.7
set hvale = 1.0e4
set zlens = 0.31
set zsrc = 1.72
#
set omitcore = 0.01
data 1115tdel.dat
set chimode = 0
set gridflag = 0
set checkparity = 0
set restart = 3
set ftol = 1.0e-5
set PAwrap = 0
set shrcoords = 1
#
startup 2 1
  alpha 1.032890e+00 -3.815471e-01 -1.343893e+00 5.914963e-03 2.819103e-02 0.0 0.0 0.0 0.0 1.000000e+00
  alpha 2.141952e+00 9.149361e+00 -5.545892e+00 0.0 0.0 0.0 0.0 0.0 0.0 1.000000e+00
  1 1 1 1 1 0 0 0 0 0
  1 1 1 0 0 0 0 0 0 0
#
optimize mcmc-tmp
startup mcmc-tmp.start
#
stepsize
  0.003 0.003 0.003 0.003 0.003 0 0 0 0 0
  0.01 0.01 0.01 0 0 0 0 0 0 0
mcmc mcmc 10
#
quit

Notice that I do a quick optimization first, to make sure the MCMC starts from a minimum in the chisq surface. (This is important for how the code computes the covariance matrix to make the MCMC efficient.)

The results are shown here, for the group halo position (left), and the two "Cartesian" components of the galaxy ellipticity (right).

One current minor limitation of the code is that the MCMC routine does not report complete information on the model; it omits any information about the Hubble constant, for example. However, that is easy to fix by loading the MCMC output models and rerunning them with no further optimization:

(all the usual stuff goes here, then...)
startup mcmc.startN
changevary 0
reopt mcmc-h

We can then make a histogram of the h values. By the nature of MCMC, this provides the full posterior distribution:

PG 1115+080 is well known to be a lens for which isothermal models yield a startlingly low value of h.