Stitching Photo Mosaics
Stitch together different images into one good-looking panorama, both manually and automatically.
I started by walking around my lovely campus and taking 3 pairs of photos that would work well.
I also grabbed some test images from an example project so I could debug by comparing my output against the desired results.
Next, I needed to recover the homographies. This meant selecting corresponding points between two images and then finding the homography that produces the best warp. To do this, I turned the problem into a linear system and solved for the least-squares solution.
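As a sketch of that setup (assuming the correspondences already live in two (N, 2) arrays; the function name is just for illustration), each point pair contributes two rows to the system:

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography mapping src points onto dst points.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Returns a 3x3 matrix H with the bottom-right entry fixed to 1.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # x' = (ax + by + c) / (gx + hy + 1), rearranged into two linear rows
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)
```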
The next step was to figure out warping. This easily took the most time and years off my life. The way I did this was by warping every pixel of the image being warped using the homography and then translating it onto the correct position in the final image. I also generate the alpha channel by simply filling a binary map of the same size with a quad whose corners are the warped coordinates of the original corners. I then interpolate the pixels using scipy.interpolate.griddata. This one function takes 98% of my compute time, but it's necessary for a good-looking result.
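Here's a simplified sketch of that forward warp, assuming a color image; it shifts the result into its own bounding box and returns the offset needed to place it on the final canvas (the function name and signature are my own, not necessarily the real pipeline's):

```python
import numpy as np
from scipy.interpolate import griddata

def warp_image(img, H):
    """Forward-warp img by homography H, resampling with griddata."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # homogeneous coords

    warped = H @ pts
    warped = warped[:2] / warped[2]  # divide out the projective coordinate

    # Bounding box of the warped image, shifted so it starts at the origin
    x_min, y_min = warped.min(axis=1)
    x_max, y_max = warped.max(axis=1)
    out_w, out_h = int(np.ceil(x_max - x_min)), int(np.ceil(y_max - y_min))
    warped[0] -= x_min
    warped[1] -= y_min

    # Resample every channel onto the integer output grid -- this griddata
    # call is the expensive step mentioned above
    gy, gx = np.mgrid[0:out_h, 0:out_w]
    out = np.zeros((out_h, out_w, img.shape[2]))
    for c in range(img.shape[2]):
        out[..., c] = griddata(warped.T, img[..., c].ravel(),
                               (gx, gy), method='linear', fill_value=0)
    return out, (x_min, y_min)  # offset locates the result in the mosaic
```

Below are two images and their rectified results: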
Finally came the most annoying part: blending. It took a lot of theorycrafting and throwing a substance I would rather not mention by name on a school assignment at the wall until something stuck, and then building off it. My finalized pipeline involves far more steps than is likely needed, but I still love it like my own. I start by computing where the warped and stationary images land in the final image. Then I create both alpha channels as well as their overlap. From those, I create inverse masks of the regions where the final image is just the warped image or just the stationary image. I send these to scipy.ndimage.distance_transform_edt to compute, for each pixel in the overlap, its distance to each of those exclusive regions. I then combine everything into a weight map that is 1 where the result should be only the warped image, 0 where it should be only the stationary image, and a smooth ramp in between based on the distances to both. Finally, I actually interpolate between the two images using that map. This process sounds annoying, and it was.
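Condensed into code, the blend looks something like this. It's a sketch under the assumption that both images have already been placed on a shared final canvas, with boolean masks marking their valid pixels:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def blend(warped, stationary, warped_mask, stationary_mask):
    """Distance-transform blend of two aligned, same-size canvas images."""
    overlap = warped_mask & stationary_mask

    # distance_transform_edt measures distance to the nearest zero, so we
    # feed it the inverse of each image's "exclusive" region
    only_warped = warped_mask & ~stationary_mask
    only_stationary = stationary_mask & ~warped_mask
    d_warped = distance_transform_edt(~only_warped)
    d_stationary = distance_transform_edt(~only_stationary)

    # Weight map: 1 where only the warped image contributes, 0 where only
    # the stationary one does, and a smooth ramp across the overlap
    alpha = only_warped.astype(float)
    denom = d_warped[overlap] + d_stationary[overlap]
    alpha[overlap] = d_stationary[overlap] / np.maximum(denom, 1e-9)

    return alpha[..., None] * warped + (1 - alpha[..., None]) * stationary
```

The results speak for themselves (source images in the same order at the top):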
(Yes, the last image does have the same person twice. It was surprisingly hard to capture, since everyone was being considerate and walking around me while I took my photos.)
And my testing images:
To automate the process, we first need to detect key features that we'd like to match. To do this, we locate the Harris corners. This gives us far too many points to be useful, so we then use Adaptive Non-Maximal Suppression (ANMS) to choose a set number of points that are spatially well distributed over the image.
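ANMS boils down to keeping the corners whose nearest significantly-stronger neighbor is farthest away. A brute-force sketch (the robustness constant and point count here are conventional choices, not necessarily mine):

```python
import numpy as np

def anms(coords, strengths, n_points=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression over Harris corners.

    coords: (N, 2) corner locations; strengths: (N,) Harris responses.
    Keeps the n_points corners with the largest suppression radius: the
    distance to the nearest corner that is significantly stronger.
    """
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Corners whose scaled strength still dominates corner i
        stronger = c_robust * strengths > strengths[i]
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(radii)[::-1][:n_points]
    return coords[keep]
```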
Next, we want to extract a vector to represent each feature. We take a 40x40 pixel area around each feature in the grayscale image and downsample it to an 8x8 patch, giving a 64-dimensional descriptor. We then compare features by the distance between their descriptors and apply Lowe's trick: accept a match only when the ratio between the distances to the first and second nearest neighbors falls below a threshold.
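A sketch of both steps (the sampling stride, normalization, and ratio threshold are assumptions on my part):

```python
import numpy as np

def extract_descriptor(gray, x, y):
    """8x8 descriptor sampled from a 40x40 patch, then bias/gain-normalized.

    Assumes the patch lies inside the image; ideally the image is blurred
    first so the sparse sampling doesn't alias.
    """
    patch = gray[y - 20:y + 20:5, x - 20:x + 20:5]  # every 5th pixel -> 8x8
    vec = patch.ravel().astype(float)
    return (vec - vec.mean()) / (vec.std() + 1e-9)

def match_features(desc1, desc2, ratio=0.7):
    """Lowe ratio test: keep a match only when the nearest neighbor is
    much closer than the second nearest."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, nearest))
    return matches
```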
We then use RANSAC to compute a homography that is consistent with as many of these feature matches as possible in a reasonable amount of time.
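The RANSAC loop itself is short, reusing the compute_homography sketch from earlier; the iteration count and pixel threshold are typical values, not necessarily what I used:

```python
import numpy as np

def ransac_homography(src, dst, n_iters=1000, threshold=2.0):
    """Fit a homography to 4 random correspondences per iteration and keep
    the model with the most inliers (reprojection error under threshold)."""
    best_inliers = np.zeros(len(src), dtype=bool)
    src_h = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coords
    for _ in range(n_iters):
        idx = np.random.choice(len(src), 4, replace=False)
        H = compute_homography(src[idx], dst[idx])
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - dst, axis=1) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best model
    return compute_homography(src[best_inliers], dst[best_inliers])
```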
Finally, we get our automatic panoramas. They aren’t as sharp as the manual ones, but they are mostly the same.